[gmx-users] Working on a GPU cluster with GROMACS 5

Ebert Maximilian m.ebert at umontreal.ca
Wed Jan 7 13:06:46 CET 2015


Hi Carsten,

thanks for your answer. I tried what you described, and it basically works, except for letting multiple MPI ranks share one GPU. In my setup I use 4 GPUs with 8 MPI ranks, i.e. 8 CPU cores with 1 OpenMP thread each. This is how I start GROMACS:

mpirun -np 8 gmx_mpi mdrun -gpu_id 00112233 -v -x -deffnm run1ns -s ../run1ns.tpr
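For reference, the -gpu_id string simply lists, one digit per PP rank, the GPU that rank should use, so consecutive ranks share a card. The 8-character string above can be derived mechanically; a small sketch (my own illustration, not from the thread):

```shell
#!/bin/sh
# Build the -gpu_id string that maps nranks PP ranks evenly onto ngpus GPUs.
# Rank i is assigned GPU (i * ngpus / nranks).
nranks=8
ngpus=4
gpu_id=""
i=0
while [ "$i" -lt "$nranks" ]; do
  gpu_id="${gpu_id}$(( i * ngpus / nranks ))"
  i=$(( i + 1 ))
done
echo "$gpu_id"
```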

and I submit this using:

qsub -q @test -lnodes=1:ppn=4 -lwalltime=1:00:00 gromacs_run_gpu

Now I get the following errors (output truncated for brevity):

Using 8 MPI processes
Using 1 OpenMP thread per MPI process

7 GPUs detected on host ngpu-a4-06:
  #0: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC:  no, stat: compatible
  #1: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC:  no, stat: compatible
  #2: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC:  no, stat: compatible
  #3: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC:  no, stat: compatible
  #4: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC:  no, stat: compatible
  #5: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC:  no, stat: compatible
  #6: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC:  no, stat: compatible

4 GPUs user-selected for this run.
Mapping of GPUs to the 8 PP ranks in this node: #0, #0, #1, #1, #2, #2, #3, #3

NOTE: You assigned GPUs to multiple MPI processes.

-------------------------------------------------------
Program gmx_mpi, VERSION 5.0.1
Source code file: /RQusagers/rqchpbib/stubbsda/gromacs-5.0.1/src/gromacs/gmxlib/cuda_tools/pmalloc_cuda.cu, line: 61

Fatal error:
cudaMallocHost of size 4 bytes failed: all CUDA-capable devices are busy or unavailable

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------

Error on rank 1, will try to stop all ranks
Halting parallel program gmx_mpi on CPU 1 out of 8

-------------------------------------------------------
Program gmx_mpi, VERSION 5.0.1
Source code file: /RQusagers/rqchpbib/stubbsda/gromacs-5.0.1/src/gromacs/gmxlib/cuda_tools/pmalloc_cuda.cu, line: 61

Fatal error:
cudaMallocHost of size 4 bytes failed: all CUDA-capable devices are busy or unavailable

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------

Error on rank 3, will try to stop all ranks
Halting parallel program gmx_mpi on CPU 3 out of 8
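One common cause of "all CUDA-capable devices are busy or unavailable" when several ranks share a card is that the GPUs are set to an exclusive compute mode, which allows only one process (or thread) per device. Whether that applies here is an assumption on my part; it can be checked on the compute node, provided nvidia-smi is available there:

```shell
# Query the compute mode of each GPU: "Default" allows multiple processes
# per device, while "Exclusive_Thread" / "Exclusive_Process" / "Prohibited"
# prevent several MPI ranks from sharing one GPU.
nvidia-smi --query-gpu=index,name,compute_mode --format=csv
```

If the cards report an exclusive mode, your cluster admins would need to switch them to Default before rank sharing can work.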

-----Original Message-----
From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se [mailto:gromacs.org_gmx-users-bounces at maillist.sys.kth.se] On Behalf Of Carsten Kutzner
Sent: Thursday, 18 December 2014 17:27
To: gmx-users at gromacs.org
Subject: Re: [gmx-users] Working on a GPU cluster with GROMACS 5

Hi Max,

On 18 Dec 2014, at 15:30, Ebert Maximilian <m.ebert at umontreal.ca> wrote:

> Dear list,
> 
> I am benchmarking my system on a GPU cluster with 6 GPUs and two quad-core CPUs per node. First, I am wondering if there is any output that confirms how many CPUs and GPUs were used during the run. I find the GPU output in the log file, but only for a single node. When I use multiple nodes, why don't the other nodes show up in the log file as hosts? For instance, in this example I used two nodes and requested 4 GPUs each, but got this in my log file:
> 
> 6 GPUs detected on host ngpu-a4-01:
>  #0: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC:  no, stat: compatible
>  #1: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC:  no, stat: compatible
>  #2: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC:  no, stat: compatible
>  #3: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC:  no, stat: compatible
>  #4: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC:  no, stat: compatible
>  #5: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC:  no, stat: compatible
> 
> 4 GPUs auto-selected for this run.
> Mapping of GPUs to the 4 PP ranks in this node: #0, #1, #2, #3
This will be the same across all nodes. Gromacs will refuse to run if there are not enough GPUs on any of your other nodes.

> 
> 
> 
> ngpu-a4-02 is not shown here. Any idea? The job was submitted in the following way:
> 
> qsub -q @test -lnodes=2:ppn=4 -lwalltime=1:00:00 gromacs_run_gpu
> 
> and the gromacs_run_gpu file:
> 
> #!/bin/csh
> #
> 
> #PBS -o result_run10ns96-8.dat
> #PBS -j oe
> #PBS -W umask=022
> #PBS -r n
> 
> cd 8_gpu
> 
> module add CUDA
> module load gromacs/5.0.1-gpu
> 
> mpirun gmx_mpi mdrun -v -x -deffnm 10ns_rep1-8GPU
> 
> 
> Another question I had was how can I define the number of CPUs and check if they were really used?
Use -ntomp to control how many OpenMP threads each of your MPI processes will have.
This way you can make use of all cores you have on each node.

> I can't find any information about the number of CPUs in the log file.
Look for
"Using . MPI processes"
"Using . OpenMP threads per MPI process"
in the log file.

> I would also like to try combinations like 4 CPUs + 1 GPU
You can use the -gpu_id switch to supply a list of eligible GPUs (see mdrun -h).
If you just want to use the first GPU on your node with, e.g., 4 MPI processes, use -gpu_id 0000.
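The combinations asked about above could then look like this (a sketch of my own, assuming one node with at least 4 cores and 2 GPUs; the .tpr filename is a placeholder, and -np/-ntomp should be adjusted to the actual hardware):

```shell
# 4 CPU cores + 1 GPU: one PP rank on GPU 0 with 4 OpenMP threads
mpirun -np 1 gmx_mpi mdrun -ntomp 4 -gpu_id 0 -s topol.tpr

# 2 CPU cores + 2 GPUs: two PP ranks, one per GPU, 1 OpenMP thread each
mpirun -np 2 gmx_mpi mdrun -ntomp 1 -gpu_id 01 -s topol.tpr
```

In each case, ranks × threads should equal the number of cores you want to use, and the -gpu_id string must have one digit per PP rank.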

Best,
  Carsten



> or 2 CPUs + 2 GPU. How do I set this up?
> 
> Thank you very much for your help,
> 
> Max
> 
> --
> Gromacs Users mailing list
> 
> * Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
> 
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> 
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a mail to gmx-users-request at gromacs.org.


--
Dr. Carsten Kutzner
Max Planck Institute for Biophysical Chemistry
Theoretical and Computational Biophysics
Am Fassberg 11, 37077 Goettingen, Germany
Tel. +49-551-2012313, Fax: +49-551-2012302
http://www.mpibpc.mpg.de/grubmueller/kutzner
http://www.mpibpc.mpg.de/grubmueller/sppexa
