[gmx-users] multiple nodes, GPUs on a Cray system

Jan Domanski jandom at gmail.com
Fri Jun 3 17:05:50 CEST 2016


Hi,

This is a continuation of the thread linked below; I'm unable to reply to the
original post, so I'll just start a new one:

https://mailman-1.sys.kth.se/pipermail/gromacs.org_gmx-users/2016-January/103044.html

The original issue in the thread above, the *apparent* slow performance of
Gromacs on the TITAN Cray supercomputer, turned out to affect only me. After
some back and forth with the TITAN support staff, here is the resolution:

> Are you trying to run multiple MPI ranks per node, all of them accessing the
> GPU? If so, you would need to first set in your submission script:
>
> export CRAY_CUDA_MPS=1 (bash syntax)
>
> This variable is needed to allow multiple processes to share the GPU.
> Details about this can be found in:
> https://www.olcf.ornl.gov/tutorials/cuda-proxy-managing-gpu-context/
> The variable was formerly called CRAY_CUDA_PROXY; both still accomplish the
> same task.
>
> Then, depending on the number of MPI ranks per node, you want to tell
> Gromacs how many ranks will share the GPU. For example, if you want to run
> an 8 MPI rank job across 4 nodes, you would need to add:
>
> aprun -n 8 -N 2 mdrun_mpi -gpu_id 00
>
> When OpenMP threads are added to the mix, you also need to pass that value
> to Gromacs and to aprun (via the -d option):
>
> aprun -n 8 -N 2 -d 8 mdrun_mpi -ntomp 8 -gpu_id 00
>
> The Gromacs site has several recommendations for running on GPUs, in which
> they recommend running more MPI ranks and fewer OpenMP threads:
> http://www.gromacs.org/Documentation/Acceleration_and_parallelization#Multiple_MPI_ranks_per_GPU
>
> More details on the different 'aprun' options can be found at:
> https://www.olcf.ornl.gov/kb_articles/using-the-aprun-command/
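In case it helps anyone else, here is a minimal submission-script sketch
putting those pieces together. It assumes a PBS batch system as on TITAN;
the project ID, walltime and the topol.tpr input name are placeholders, and
mdrun_mpi is assumed to be an MPI+CUDA build on your PATH, so adjust to your
own setup:

  #!/bin/bash
  #PBS -A PROJ123                 # placeholder project/account ID
  #PBS -l nodes=4                 # 4 nodes, each with 16 cores and 1 GPU
  #PBS -l walltime=02:00:00

  cd $PBS_O_WORKDIR

  # Let multiple MPI ranks on a node share its single GPU
  # (this variable was formerly called CRAY_CUDA_PROXY)
  export CRAY_CUDA_MPS=1

  # 8 MPI ranks in total, 2 per node, 8 OpenMP threads per rank
  # (2 ranks x 8 threads = 16, matching the 16 cores per node);
  # -gpu_id 00 maps both ranks on a node to GPU 0
  aprun -n 8 -N 2 -d 8 mdrun_mpi -ntomp 8 -gpu_id 00 -s topol.tpr

Per the Gromacs recommendation quoted above, it may also be worth
benchmarking more ranks per node with fewer threads each, e.g.
aprun -n 16 -N 4 -d 4 mdrun_mpi -ntomp 4 -gpu_id 0000; note that -gpu_id
needs one digit per rank on the node (all zeroes here, since each node has a
single GPU).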


- Jan

