[gmx-users] Running gmx-4.6.x over multiple homogeneous nodes with GPU acceleration

João Henriques joao.henriques.32353 at gmail.com
Wed Jun 5 14:53:42 CEST 2013


Sorry to keep bugging you guys, but even after trying everything you
suggested and reading the bugzilla thread Mark pointed out, I'm still
unable to make the simulation run over multiple nodes.
*Here is a template of a simple submission over 2 nodes:*

--- START ---
#!/bin/sh
#
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
#
# Job name
#SBATCH -J md
#
# No. of nodes and no. of processors per node
#SBATCH -N 2
#SBATCH --exclusive
#
# Time needed to complete the job
#SBATCH -t 48:00:00
#
# Add modules
module load gcc/4.6.3
module load openmpi/1.6.3/gcc/4.6.3
module load cuda/5.0
module load gromacs/4.6
#
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
#
grompp -f md.mdp -c npt.gro -t npt.cpt -p topol -o md.tpr
mpirun -np 4 mdrun_mpi -gpu_id 01 -deffnm md -v
#
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
--- END ---

*Here is an extract of the md.log:*

--- START ---
Using 4 MPI processes
Using 4 OpenMP threads per MPI process

Detecting CPU-specific acceleration.
Present hardware specification:
Vendor: GenuineIntel
Brand:  Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
Family:  6  Model: 45  Stepping:  7
Features: aes apic avx clfsh cmov cx8 cx16 htt lahf_lm mmx msr nonstop_tsc
pcid pclmuldq pdcm pdpe1gb popcnt pse rdtscp sse2 sse3 sse4.1 sse4.2 ssse3
tdt x2apic
Acceleration most likely to fit this hardware: AVX_256
Acceleration selected at GROMACS compile time: AVX_256


2 GPUs detected on host en001:
  #0: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible
  #1: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible


-------------------------------------------------------
Program mdrun_mpi, VERSION 4.6
Source code file:
/lunarc/sw/erik/src/gromacs/gromacs-4.6/src/gmxlib/gmx_detect_hardware.c,
line: 322

Fatal error:
Incorrect launch configuration: mismatching number of PP MPI processes and
GPUs per node.
mdrun_mpi was started with 4 PP MPI processes per node, but you provided 2
GPUs.
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
--- END ---

As you can see, gmx is having trouble understanding that there's a second
node available. Note that since I did not specify -ntomp, it assigned 4
OpenMP threads to each of the 4 MPI processes (filling all 16 cores
available *on one node*).
For the exact same submission, if I do set "-ntomp 8" (since 4 MPI procs
* 8 OpenMP threads = 32 cores in total across the 2 nodes), I get a
warning telling me that I'm hyperthreading, which can only mean that *gmx
is assigning all processes to the first node once again.*
Am I doing something wrong, or is there some problem with gmx-4.6? I
guess it can only be my fault, since I've never seen anyone else
complain about the same issue here.
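
For reference, the change I'm guessing I need, though I'm not sure it's
correct, is to pin the rank count per node explicitly (just a sketch,
assuming SLURM's --ntasks-per-node and OpenMPI 1.6's -npernode options
place exactly 2 ranks on each of the 2 nodes):

--- START ---
#SBATCH -N 2
#SBATCH --ntasks-per-node=2
#SBATCH --exclusive
# (modules and grompp as in the script above)
# 2 PP ranks per node, one per K20, 8 OpenMP threads each = 16 cores/node
mpirun -np 4 -npernode 2 mdrun_mpi -gpu_id 01 -ntomp 8 -deffnm md -v
--- END ---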

*Here are the cluster specs:*

http://www.lunarc.lu.se/Systems/ErikDetails

Thank you for your patience and expertise,
Best regards,
João Henriques



On Tue, Jun 4, 2013 at 6:30 PM, Szilárd Páll <szilard.pall at cbr.su.se> wrote:

> mdrun is not blind; it's just that the current design does not report
> the hardware of all compute nodes used. Whatever CPU/GPU hardware mdrun
> reports in the log/std output is *only* what rank 0, i.e. the first MPI
> process, detects. If you have a heterogeneous hardware configuration,
> in most cases you should be able to run just fine, but only the
> hardware the first rank sits on will be reported.
>
> Hence, if you want to run on 5 of the nodes you mention, you just do:
> mpirun -np 10 mdrun_mpi [-gpu_id 01]
>
> You may want to try both -ntomp 8 and -ntomp 16 (using HyperThreading
> does not always help).
>
> Also note that if you share a GPU among ranks (in order to use <8
> threads/rank), disabling dynamic load balancing may help, for some
> technical reasons - especially if you have a homogeneous simulation
> system (and hardware setup).
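>
> To be concrete, I mean something along these lines for 5 of your nodes
> (only a sketch; whether -ntomp 8 or 16 and whether -dlb no actually
> helps depends on the system):
>
>   mpirun -np 10 mdrun_mpi -gpu_id 01 -ntomp 8 -dlb no -deffnm md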
>
>
> Cheers,
> --
> Szilárd
>
>
> On Tue, Jun 4, 2013 at 3:31 PM, João Henriques
> <joao.henriques.32353 at gmail.com> wrote:
> > Dear all,
> >
> > Since gmx-4.6 came out, I've been particularly interested in taking
> > advantage of the native GPU acceleration for my simulations. Luckily, I
> > have access to a cluster with the following specs PER NODE:
> >
> > CPU
> > 2 E5-2650 (2.0 Ghz, 8-core)
> >
> > GPU
> > 2 Nvidia K20
> >
> > I've become quite familiar with the "heterogeneous parallelization" and
> > "multiple MPI ranks per GPU" schemes on a SINGLE NODE. Everything works
> > fine, no problems at all.
> >
> > Currently, I'm working with a nasty system comprising 608159 TIP3P
> > water molecules, and it would really help to speed things up a bit.
> > Therefore, I would really like to try to parallelize my system over
> > multiple nodes and keep the GPU acceleration.
> >
> > I've tried many different command combinations, but mdrun seems to be
> > blind to the GPUs existing on other nodes. It always finds GPUs #0 and
> > #1 on the first node and tries to fit everything into these, completely
> > disregarding the existence of the other GPUs on the remaining requested
> > nodes.
> >
> > Once again, note that all nodes have exactly the same specs.
> >
> > Literature on the official gmx website is not, well... you know...
> > in-depth, and I would really appreciate it if someone could shed some
> > light on this subject.
> >
> > Thank you,
> > Best regards,
> >
> > --
> > João Henriques



-- 
João Henriques


