[gmx-users] multiple GPU on multiple nodes

Szilárd Páll pall.szilard at gmail.com
Mon Jan 27 23:55:36 CET 2014


You need to export the CRAY_CUDA_PROXY (or CRAY_CUDA_MPS, e.g. on Blue
Waters) environment variable, which enables multiple ranks to share the
same GPU.
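
As a rough sketch of what that could look like in a job script: the value 1, the choice to set both variables, and the 2-ranks-per-node layout are assumptions for illustration, and the launch line is only printed, not executed. Which of the two variables your site honours depends on the installed Cray software.

```shell
#!/bin/sh
# Allow several MPI ranks on one node to share that node's single GPU.
# Assumption: setting both names is harmless; only one is read per site.
export CRAY_CUDA_PROXY=1
export CRAY_CUDA_MPS=1

# Two nodes, two ranks per node (-N 2), so 4 ranks in total (-n 4).
# -gpu_id takes one digit per PP rank on a node; both ranks use GPU 0.
# The command is printed instead of run so the sketch works off-cluster.
cmd="aprun -n 4 -N 2 mdrun_mpi -gpu_id 00 -deffnm filename"
echo "$cmd"
```

Note that "aprun -n 2" without "-N 1" may place both ranks on the same node, which would be consistent with the "2 PP MPI processes per node, but only 1 GPU" error quoted below.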

You are running a binary compiled without CPU SIMD support (i.e.
GMX_CPU_ACCELERATION=None instead of AVX_128_FMA) as the console
output notes. This will cost you performance.
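
A minimal sketch of a configure line that would request the SIMD kernels at build time. The source directory name is hypothetical, and the MPI and GPU options are assumptions about how the existing binary was built; the command is printed rather than executed.

```shell
#!/bin/sh
# SIMD level that mdrun reports as usable on the Interlagos CPUs.
accel="AVX_128_FMA"

# Hypothetical source path; adjust to wherever the tarball was unpacked.
src_dir="../gromacs-4.6.5"

# GMX_CPU_ACCELERATION selects the CPU kernels in GROMACS 4.6;
# GMX_MPI and GMX_GPU are assumed to match the current mdrun_mpi build.
cmake_cmd="cmake $src_dir -DGMX_MPI=ON -DGMX_GPU=ON -DGMX_CPU_ACCELERATION=$accel"
echo "$cmake_cmd"
```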

Also, I strongly suggest that you get the latest version, 4.6.5.

Some tips, since getting good performance on XK7s is often tricky:
- try 2-8 ranks per node;
- with more than 4 nodes, separate PME ranks typically help; a one-to-one
PP:PME ratio is what I'd try first;
- try both "aprun -cc none" and "aprun -cc cpu"; the former will be
faster at lower parallelization, but the latter will become faster as
parallelization increases (I've seen up to 30% difference).
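
The first and last tips can be turned into a small benchmark sweep. The sketch below only prints candidate launch lines; the node count, the 16 cores per node, and splitting the cores evenly into OpenMP threads per rank are assumptions, and it presumes the GPU-sharing variable mentioned above is already exported.

```shell
#!/bin/sh
nodes=2            # assumed node count
cores_per_node=16  # one 16-core Interlagos per node
count=0

for rpn in 2 4 8; do                      # ranks per node to try
    threads=$((cores_per_node / rpn))     # OpenMP threads per rank

    # Build the -gpu_id string: one "0" per PP rank on a node.
    gpu_ids=""
    i=0
    while [ "$i" -lt "$rpn" ]; do
        gpu_ids="${gpu_ids}0"
        i=$((i + 1))
    done

    for cc in none cpu; do                # both CPU binding modes
        line="aprun -n $((nodes * rpn)) -N $rpn -d $threads -cc $cc mdrun_mpi -ntomp $threads -gpu_id $gpu_ids -deffnm filename"
        echo "$line"
        count=$((count + 1))
    done
done
```

Separate PME ranks are left out here on purpose: with mdrun's -npme option the -gpu_id string would, as far as I know, need one digit per PP rank on a node rather than one per rank, so that case is best set up by hand.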

Cheers,
--
Szilárd


On Mon, Jan 27, 2014 at 11:16 PM, jhon michael espinosa duran
<cyberjhon at hotmail.com> wrote:
> Hi guys
> I am using a Cray XK7 machine with one GPU (Tesla K20) per node (one AMD Opteron 16-core Interlagos x86_64). Currently I am trying to run the GROMACS GPU version on only two nodes, but it is not working.
> When I use one node and one GPU, it works:
> aprun -n 1 mdrun_mpi -deffnm filename
>
> When I try two nodes and two GPUs, it does not work (these are the variants I have tried):
> aprun -n 2 mdrun_mpi -deffnm filename
> aprun -n 2 mdrun_mpi -gpu_id 00 -deffnm filename
> aprun -n 2 mdrun_mpi -gpu_if 0011 -deffnm filename
> aprun -n 32 mdrun_mpi -deffnm filename
> aprun -n 32 mdrun_mpi -gpu_id 00 -deffnm filename
> aprun -n 32 mdrun_mpi -gpu_if 0011 -deffnm filename
> Sometimes I get errors like:
> Program mdrun_mpi, VERSION 4.6.2
> Source code file: /N/soft/cle4/gromacs/gromacs-4.6.2/src/gmxlib/gmx_detect_hardware.c, line: 580
>
> Fatal error:
> Some of the requested GPUs do not exist, behave strangely, or are not compatible:
>     GPU #0: insane
>     GPU #0: insane
>
> For more information and tips for troubleshooting, please check the GROMACS
> website at http://www.gromacs.org/Documentation/Errors
>
>
> Program mdrun_mpi, VERSION 4.6.2
> Source code file: /N/soft/cle4/gromacs/gromacs-4.6.2/src/gmxlib/statutil.c, line: 976
>
> Invalid command line argument:
> 0
> For more information and tips for troubleshooting, please check the GROMACS
> website at http://www.gromacs.org/Documentation/Errors
> 1 GPU detected on host nid00900:
>   #0: NVIDIA Tesla K20, compute cap.: 3.5, ECC: yes, stat: compatible
>
>
> -------------------------------------------------------
> Program mdrun_mpi, VERSION 4.6.2
> Source code file: /N/soft/cle4/gromacs/gromacs-4.6.2/src/gmxlib/gmx_detect_hardware.c, line: 580
>
> Fatal error:
> Some of the requested GPUs do not exist, behave strangely, or are not compatible:
>     GPU #1: inexistent
>     GPU #1: inexistent
>
> For more information and tips for troubleshooting, please check the GROMACS
> website at http://www.gromacs.org/Documentation/Errors
> 1 GPU detected on host nid00900:
>   #0: NVIDIA Tesla K20, compute cap.: 3.5, ECC: yes, stat: compatible
>
> Compiled acceleration: None (Gromacs could use AVX_128_FMA on this machine, which is better)
>
> -------------------------------------------------------
> Program mdrun_mpi, VERSION 4.6.2
> Source code file: /N/soft/cle4/gromacs/gromacs-4.6.2/src/gmxlib/gmx_detect_hardware.c, line: 356
>
> Fatal error:
> Incorrect launch configuration: mismatching number of PP MPI processes and GPUs per node.
> mdrun_mpi was started with 2 PP MPI processes per node, but only 1 GPU were detected.
> For more information and tips for troubleshooting, please check the GROMACS
> website at http://www.gromacs.org/Documentation/Errors
> -------------------------------------------------------
> If you have any idea how to make it work, please let me know.
> John Michael
>

