[gmx-users] Mismatching number of PP MPI processes and GPUs per node

Szilárd Páll szilard.pall at cbr.su.se
Fri Mar 22 19:40:22 CET 2013


Hi,

Actually, if you don't want to run across the network, with those Westmere
processors you should be fine running OpenMP across the two sockets, i.e.
mdrun -ntomp 24
or, to run without HyperThreading (which can sometimes be faster), just use
mdrun -ntomp 12 -pin on
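
For example, with the test.tpr / test_out names carried over from the script
quoted further down (just an assumption, use your actual files), the two
variants would simply be:

mdrun -ntomp 24 -s test.tpr -deffnm test_out            # 24 threads, i.e. with HyperThreading
mdrun -ntomp 12 -pin on -s test.tpr -deffnm test_out    # 12 threads, one per physical core, pinned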

Now, when it comes to GPU runs, your CPUs are so much faster than that
rather slow Quadro card that I doubt you will see any benefit from using
it, but you should try.
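
A quick way to check (just a sketch, again reusing your file names) is to run
the same short .tpr once with the non-bonded kernels forced onto the CPU and
once offloaded to the Quadro, and compare the ns/day at the end of the two logs:

mdrun -ntomp 12 -pin on -nb cpu -s test.tpr -deffnm test_nbcpu    # non-bonded kernels on the CPU only
mdrun -ntomp 12 -pin on -nb gpu -gpu_id 0 -s test.tpr -deffnm test_nbgpu    # non-bonded offloaded to the GPU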

One other thing to consider is an even more complicated way of sharing
work between the CPU and GPU, but for this you need domain decomposition,
hence MPI, hence multiple ranks per GPU, hence thread-MPI won't work:
mpirun -np 2 mdrun_mpi -ntomp 6 -gpu_id 00 -nb gpu_cpu
(that is without HT).
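
Note that this does require a real MPI build (-DGMX_MPI=ON), which also answers
your question below: the thread-MPI binary cannot share one GPU between several
ranks. As a rough sketch only, reusing the SGE header and paths from the script
quoted further down (point GMXRC at whichever MPI-enabled install you end up
with; the job and file names are just placeholders), the job script would look
something like:

#!/bin/sh
#$ -V
#$ -S /bin/sh
#$ -N test-gpu-cpu
#$ -l h="xgrid-node02"
#$ -pe mpi_fill_up 12
#$ -cwd

source /opt/NetUsers/pgkeka/gromacs-4.6_gpu_mpi/bin/GMXRC
export DYLD_LIBRARY_PATH=/Developer/NVIDIA/CUDA-5.0/lib:$DYLD_LIBRARY_PATH

# 2 MPI ranks (one per socket), 6 OpenMP threads each, both ranks sharing GPU 0,
# with the non-bonded work split between GPU and CPU (-nb gpu_cpu)
mpirun -np 2 mdrun_mpi -ntomp 6 -gpu_id 00 -nb gpu_cpu -s test.tpr -deffnm test_out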

Cheers,


--
Szilárd


On Fri, Mar 22, 2013 at 3:20 AM, George Patargias <gpat at bioacademy.gr> wrote:

> Hello Szilard
>
> Many thanks for these very useful comments!
>
> We run jobs on a single node of a small Apple cluster (no Infiniband
> unfortunately). One node has two Intel(R) Xeon(R) X5650 Processors each
> with 6 cores and 12 threads, so in total 12 cores and 24 threads.
>
> I have compiled GROMACS 4.6.1 with OpenMP and CUDA support (-DGMX_MPI=OFF).
>
> Do I understand correctly that this build will work efficiently without a
> GPU but not with a GPU, as you mention?
>
> In order to run gromacs according to your suggestion
>
> mpirun -np 2 mdrun_mpi -ntomp 6 -s test.tpr -deffnm test_out -gpu_id 00
>
> Do I need to compile with MPI (i.e. -DGMX_MPI=ON)?
>
> Would this then be the most efficient way to run GROMACS in the given
> hardware set-up (dual socket, total 12 cores and GPU)?
>
> Thanks again!
>
> Best
> George
>
>
>
>
> > FYI: On your machine running OpenMP across two sockets will probably not
> > be very efficient. Depending on the input and on the degree of
> > parallelization you are running at, you could be better off running
> > multiple MPI ranks per GPU. This is a bit of an unexplained feature due to
> > it being complicated to use and not fully supported (it does not work with
> > thread-MPI), but you can essentially make multiple MPI ranks use the same
> > GPU by passing the ID of the GPU you want to "overload" multiple times
> > (and launching the correct number of MPI ranks).
> > For instance, in your case you can try putting one MPI rank per socket,
> > both using GPU 0, by:
> > mpirun -np 2 mdrun_mpi -ntomp 6 -s test.tpr -deffnm test_out -nb gpu -gpu_id 00
> > This is briefly explained on the wiki as well:
> > http://www.gromacs.org/Documentation/Acceleration_and_parallelization#Multiple_MPI_ranks_per_GPU
> > Let us know whether you are able to get useful speedup from GPUs!
> > Cheers,
> > --
> > Szilárd
> > On Tue, Mar 12, 2013 at 10:06 AM, George Patargias
> > <gpat at bioacademy.gr> wrote:
> >> Hi Carsten
> >>
> >> Thanks a lot for this tip. It worked!
> >>
> >> George
> >>
> >> > Hi,
> >> >
> >> > On Mar 11, 2013, at 10:50 AM, George Patargias
> >> > <gpat at bioacademy.gr> wrote:
> >> >> Hello
> >> >>
> >> >> Sorry for posting this again.
> >> >>
> >> >> I am trying to run GROMACS 4.6 compiled with MPI and GPU acceleration
> >> >> (CUDA 5.0 lib) using the following SGE batch script.
> >> >>
> >> >> #!/bin/sh
> >> >> #$ -V
> >> >> #$ -S /bin/sh
> >> >> #$ -N test-gpus
> >> >> #$ -l h="xgrid-node02"
> >> >> #$ -pe mpi_fill_up 12
> >> >> #$ -cwd
> >> >>
> >> >> source /opt/NetUsers/pgkeka/gromacs-4.6_gpu_mpi/bin/GMXRC
> >> >> export DYLD_LIBRARY_PATH=/Developer/NVIDIA/CUDA-5.0/lib:$DYLD_LIBRARY_PATH
> >> >>
> >> >> mpirun -np 12 mdrun_mpi -s test.tpr -deffnm test_out -nb gpu
> >> >>
> >> >> After detection of the installed GPU card
> >> >>
> >> >> 1 GPU detected on host xgrid-node02.xgrid:
> >> >>   #0: NVIDIA Quadro 4000, compute cap.: 2.0, ECC: no, stat: compatible
> >> >>
> >> >> GROMACS issues the following error
> >> >>
> >> >> Incorrect launch configuration: mismatching number of PP MPI processes
> >> >> and GPUs per node. mdrun_mpi was started with 12 PP MPI processes per
> >> >> node, but only 1 GPU were detected.
> >> >>
> >> >> It can't be that we need to run GROMACS only on a single core so that
> >> >> it matches the single GPU card.
> >> > Have you compiled mdrun_mpi with OpenMP threads support? Then, if you do
> >> > mpirun -np 1 mdrun_mpi ...
> >> > it should start one MPI process with 12 OpenMP threads, which should
> >> > give you what you want. You can also manually specify the number of
> >> > OpenMP threads by adding
> >> > -ntomp 12
> >> >
> >> > Carsten
> >> >> Do you have any idea what has to be done?
> >> >>
> >> >> Many thanks.
> >> > --
> >> > Dr. Carsten Kutzner
> >> > Max Planck Institute for Biophysical Chemistry
> >> > Theoretical and Computational Biophysics
> >> > Am Fassberg 11, 37077 Goettingen, Germany
> >> > Tel. +49-551-2012313, Fax: +49-551-2012302
> >> > http://www.mpibpc.mpg.de/grubmueller/kutzner
> >> > http://www.mpibpc.mpg.de/grubmueller/sppexa
>
>
> Dr. George Patargias
> Postdoctoral Researcher
> Biomedical Research Foundation
> Academy of Athens
> 4, Soranou Ephessiou
> 115 27
> Athens
> Greece
>
> Office: +302106597568
>


