[gmx-users] Working on a GPU cluster with GROMACS 5
Ebert Maximilian
m.ebert at umontreal.ca
Wed Jan 7 13:06:46 CET 2015
Hi Carsten,
thanks for your answer. I tried what you described and it basically works, except for letting multiple MPI ranks share one GPU. In my setup I use 4 GPUs with 8 MPI ranks (hence 8 cores) and 1 OpenMP thread per rank. This is how I start GROMACS:
mpirun -np 8 gmx_mpi mdrun -gpu_id 00112233 -v -x -deffnm run1ns -s ../run1ns.tpr
and I submit this using:
qsub -q @test -lnodes=1:ppn=4 -lwalltime=1:00:00 gromacs_run_gpu
Now I get the following errors (the output is longer; I have omitted the rest for brevity):
Using 8 MPI processes
Using 1 OpenMP thread per MPI process
7 GPUs detected on host ngpu-a4-06:
#0: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC: no, stat: compatible
#1: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC: no, stat: compatible
#2: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC: no, stat: compatible
#3: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC: no, stat: compatible
#4: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC: no, stat: compatible
#5: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC: no, stat: compatible
#6: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC: no, stat: compatible
4 GPUs user-selected for this run.
Mapping of GPUs to the 8 PP ranks in this node: #0, #0, #1, #1, #2, #2, #3, #3
NOTE: You assigned GPUs to multiple MPI processes.
-------------------------------------------------------
Program gmx_mpi, VERSION 5.0.1
Source code file: /RQusagers/rqchpbib/stubbsda/gromacs-5.0.1/src/gromacs/gmxlib/cuda_tools/pmalloc_cuda.cu, line: 61
Fatal error:
cudaMallocHost of size 4 bytes failed: all CUDA-capable devices are busy or unavailable
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
Error on rank 1, will try to stop all ranks
Halting parallel program gmx_mpi on CPU 1 out of 8
-------------------------------------------------------
Program gmx_mpi, VERSION 5.0.1
Source code file: /RQusagers/rqchpbib/stubbsda/gromacs-5.0.1/src/gromacs/gmxlib/cuda_tools/pmalloc_cuda.cu, line: 61
Fatal error:
cudaMallocHost of size 4 bytes failed: all CUDA-capable devices are busy or unavailable
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
Error on rank 3, will try to stop all ranks
Halting parallel program gmx_mpi on CPU 3 out of 8
-----Original Message-----
From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se [mailto:gromacs.org_gmx-users-bounces at maillist.sys.kth.se] On Behalf Of Carsten Kutzner
Sent: Thursday, 18 December 2014 17:27
To: gmx-users at gromacs.org
Subject: Re: [gmx-users] Working on a GPU cluster with GROMACS 5
Hi Max,
On 18 Dec 2014, at 15:30, Ebert Maximilian <m.ebert at umontreal.ca> wrote:
> Dear list,
>
> I am benchmarking my system on a GPU cluster with 6 GPUs and two quad-core CPUs per node. First, is there any output which confirms how many CPUs and GPUs were used during the run? I find the GPU output in the log file, but only for a single node. When I use multiple nodes, why don't the other nodes show up in the log file as hosts? For instance, in this example I used two nodes and requested 4 GPUs each, but got this in my log file:
>
> 6 GPUs detected on host ngpu-a4-01:
> #0: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC: no, stat: compatible
> #1: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC: no, stat: compatible
> #2: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC: no, stat: compatible
> #3: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC: no, stat: compatible
> #4: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC: no, stat: compatible
> #5: NVIDIA GeForce GTX 580, compute cap.: 2.0, ECC: no, stat: compatible
>
> 4 GPUs auto-selected for this run.
> Mapping of GPUs to the 4 PP ranks in this node: #0, #1, #2, #3
This will be the same across all nodes. Gromacs will refuse to run if there are not enough GPUs on any of your other nodes.
>
>
>
> ngpu-a4-02 is not shown here. Any idea? The job was submitted in the following way:
>
> qsub -q @test -lnodes=2:ppn=4 -lwalltime=1:00:00 gromacs_run_gpu
>
> and the gromacs_run_gpu file:
>
> #!/bin/csh
> #
>
> #PBS -o result_run10ns96-8.dat
> #PBS -j oe
> #PBS -W umask=022
> #PBS -r n
>
> cd 8_gpu
>
> module add CUDA
> module load gromacs/5.0.1-gpu
>
> mpirun gmx_mpi mdrun -v -x -deffnm 10ns_rep1-8GPU
>
>
> Another question: how can I define the number of CPUs and check whether they were actually used?
Use -ntomp to control how many OpenMP threads each of your MPI processes will have.
This way you can make use of all the cores on each node.
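For example, to run 4 MPI ranks with 2 OpenMP threads each (8 cores in total) on a node with 4 GPUs, the launch would look something like this (a sketch; the run name is a placeholder):

mpirun -np 4 gmx_mpi mdrun -ntomp 2 -gpu_id 0123 -v -deffnm run1ns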
> I can't find any information about the number of CPUs in the log file.
Look for
"Using . MPI processes"
"Using . OpenMP threads per MPI process"
in the log file.
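A quick way to check this from the shell (a sketch; adjust the log file name to your run):

grep -E "Using [0-9]+ (MPI processes|OpenMP thread)" run1ns.log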
> I would also like to try combinations like 4 CPUs + 1 GPU
You can use the -gpu_id switch to supply a list of eligible GPUs (see mdrun -h).
If you just want to use the first GPU on your node with, e.g., 4 MPI processes, use -gpu_id 0000.
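So the 4-cores-plus-1-GPU combination you mention would be launched along these lines (a sketch; the run name is a placeholder):

mpirun -np 4 gmx_mpi mdrun -ntomp 1 -gpu_id 0000 -v -deffnm run1ns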
Best,
Carsten
> or 2 CPUs + 2 GPUs. How do I set this up?
>
> Thank you very much for your help,
>
> Max
>
--
Dr. Carsten Kutzner
Max Planck Institute for Biophysical Chemistry
Theoretical and Computational Biophysics
Am Fassberg 11, 37077 Goettingen, Germany
Tel. +49-551-2012313, Fax: +49-551-2012302
http://www.mpibpc.mpg.de/grubmueller/kutzner
http://www.mpibpc.mpg.de/grubmueller/sppexa