[gmx-users] GPU and MPI

Da-Wei Li lidawei at gmail.com
Tue Sep 2 15:24:23 CEST 2014


I did a few more tests. Unexpectedly, the mixed use of MPI and OpenMP
across 2 nodes causes a dramatic efficiency loss. That is, my previous
slowdown was not caused by the GPUs.

BTW, each node on our cluster has two Intel Xeon X5650 CPUs (6 cores each)
and two NVIDIA Tesla M2070 GPUs (not K40s as I thought before).


Test 1:   12 cores on one node,  12 MPI ranks:                         50 ns/day
Test 2:   12 cores on one node,   2 MPI ranks, 6 OpenMP threads/rank:  41 ns/day
Test 3:   24 cores on two nodes, 24 MPI ranks:                         80 ns/day
Test 4:   24 cores on two nodes,  4 MPI ranks, 6 OpenMP threads/rank:  15 ns/day
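
For reference, the launch commands for the four tests were along these lines
(launcher options and binary name are my assumptions based on the command
quoted later in this thread; adjust to your MPI environment):

```shell
# Test 1: 12 MPI ranks, 1 OpenMP thread each, one node
mpiexec -np 12 mdrun_mpi -ntomp 1
# Test 2: 2 MPI ranks x 6 OpenMP threads, one node
mpiexec -np 2 mdrun_mpi -ntomp 6
# Test 3: 24 MPI ranks, 1 OpenMP thread each, two nodes
mpiexec -npernode 12 -np 24 mdrun_mpi -ntomp 1
# Test 4: 4 MPI ranks x 6 OpenMP threads, two nodes
mpiexec -npernode 2 -np 4 mdrun_mpi -ntomp 6
```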



dawei
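
As an aside on Carsten's -gpu_id suggestion quoted below: in a string like
"0000011111", the i-th character gives the GPU id used by the i-th PP rank
on each node. A short sketch to sanity-check such a mapping (the helper
name is mine, not part of GROMACS):

```python
def gpu_map(gpu_id: str) -> dict:
    # The i-th character of the -gpu_id string is the GPU id
    # used by the i-th PP rank on each node.
    return {rank: int(gpu) for rank, gpu in enumerate(gpu_id)}

# "0000011111": PP ranks 0-4 share GPU 0, ranks 5-9 share GPU 1
mapping = gpu_map("0000011111")
print(mapping[0], mapping[9])  # -> 0 1
```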


On Tue, Sep 2, 2014 at 6:20 AM, Szilárd Páll <pall.szilard at gmail.com> wrote:

> You may want to try other settings between 4x6 and 24x1 too, e.g. 12x2
> or 6x4 - especially if you have a dual-socket 6-core machine with
> HyperThreading. In my experience, using as many ranks as there are
> hardware threads with HT in GPU runs results in a big slowdown compared
> to either not using HT (i.e. 12x1) or using 2 threads/rank (12x2).
>
> Cheers,
> --
> Szilárd
>
>
> On Mon, Sep 1, 2014 at 5:13 PM, Carsten Kutzner <ckutzne at gwdg.de> wrote:
> >
> > On 01 Sep 2014, at 15:58, Da-Wei Li <lidawei at gmail.com> wrote:
> >
> >> No. With GPUs, both the domain and the PME decomposition are 4x1x1,
> >> because I use 4 MPI ranks, matching the 4 GPUs. Without GPUs, the
> >> domain decomposition is 20x1x1 and PME is 4x1x1.
> > So the difference in performance could be due to the different DD and
> > PME/PP settings. I would try to use exactly the same settings with and
> > without GPU. With GPUs, you then would need to specify something like
> >
> > mpirun -n 24 mdrun -dd 20 1 1 -npme 4 -gpu_id 0000011111
> >
> > So you get 10 DD plus 2 PME ranks per node and map the first 5 DD ranks
> > to GPU id 0, and the last 5 DD ranks to GPU id 1.
> >
> > Carsten
> >
> >
> >>
> >> dawei
> >>
> >>
> >> On Mon, Sep 1, 2014 at 8:39 AM, Carsten Kutzner <ckutzne at gwdg.de>
> wrote:
> >>
> >>> Hi Dawei,
> >>>
> >>> on two nodes, regarding the cases with and without GPUs,
> >>> do you use the same domain decomposition in both cases?
> >>>
> >>> Carsten
> >>>
> >>>
> >>> On 01 Sep 2014, at 14:30, Da-Wei Li <lidawei at gmail.com> wrote:
> >>>
> >>>> I have added "-resethway" but I still get the same result. Using two
> >>>> GPUs and 12 cores distributed over 2 nodes gives 33 ns/day, i.e.
> >>>> about 3 times slower than the run on one node (2 GPUs + 12 cores).
> >>>>
> >>>> I have no idea what is wrong.
> >>>>
> >>>>
> >>>> dawei
> >>>>
> >>>>
> >>>> On Mon, Sep 1, 2014 at 5:34 AM, Carsten Kutzner <ckutzne at gwdg.de>
> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> take a look at mdrun’s hidden but sometimes useful options:
> >>>>>
> >>>>> mdrun -h -hidden
> >>>>>
> >>>>> Carsten
> >>>>>
> >>>>>
> >>>>> On 01 Sep 2014, at 11:07, Oliver Schillinger <
> >>> o.schillinger at fz-juelich.de>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>> I did not know about the -resethway command line switch to mdrun.
> >>>>>> Why is it not documented anywhere?
> >>>>>> Or am I blind/stupid?
> >>>>>> Cheers,
> >>>>>> Oliver
> >>>>>>
> >>>>>> On 08/29/2014 05:15 PM, Carsten Kutzner wrote:
> >>>>>>> Hi Dawei,
> >>>>>>>
> >>>>>>> On 29 Aug 2014, at 16:52, Da-Wei Li <lidawei at gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Dear Carsten
> >>>>>>>>
> >>>>>>>> Thanks for the clarification. Here it is my benchmark for a small
> >>>>> protein
> >>>>>>>> system (18k atoms).
> >>>>>>>>
> >>>>>>>> (1) 1 node (12 cores/node, no GPU):   50 ns/day
> >>>>>>>> (2) 2 nodes (12 cores/node, no GPU): 80 ns/day
> >>>>>>>> (3) 1 node (12 cores/node, 2 K40 GPUs/node): 100 ns/day
> >>>>>>>> (4) 2 nodes (12 cores/node, 2 K40 GPUs/node): 40 ns/day
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I sent out this question because benchmark 4 above looks very
> >>>>>>>> suspicious.
> >>>>>>> Indeed, if you get 80 ns/day without GPUs, it should not be less
> >>>>>>> with GPUs. For how many time steps do you run each of the
> >>>>>>> benchmarks? Do you use the -resethway command line switch to mdrun
> >>>>>>> to disregard the first half of the run (where initialization and
> >>>>>>> balancing are done; you don't want to count that in a benchmark)?
> >>>>>>>
> >>>>>>> Carsten
> >>>>>>>
> >>>>>>>> But I agree the size of my system may play a role.
> >>>>>>>>
> >>>>>>>> best,
> >>>>>>>>
> >>>>>>>> dawei
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Fri, Aug 29, 2014 at 10:36 AM, Carsten Kutzner <
> ckutzne at gwdg.de>
> >>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Dawei,
> >>>>>>>>>
> >>>>>>>>> the mapping of GPUs to PP ranks is printed for the Master node
> only,
> >>>>>>>>> but if this node reports two GPUs, then all other PP ranks will
> also
> >>>>>>>>> use two GPUs (or an error is reported).
> >>>>>>>>>
> >>>>>>>>> The scaling will also depend on your system size; if it is too
> >>>>>>>>> small, you might be better off using a single node.
> >>>>>>>>>
> >>>>>>>>> Carsten
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On 29 Aug 2014, at 16:24, Da-Wei Li <lidawei at gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> Dear users,
> >>>>>>>>>>
> >>>>>>>>>> I recently tried to run GROMACS on two nodes, each of which has
> >>>>>>>>>> 12 cores and 2 GPUs. The nodes are connected with InfiniBand,
> >>>>>>>>>> and scaling is pretty good when no GPU is involved.
> >>>>>>>>>>
> >>>>>>>>>> My command is like this:
> >>>>>>>>>>
> >>>>>>>>>> mpiexec  -npernode 2 -np 4 mdrun_mpi -ntomp 6
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> However, it looks like GROMACS only detected the 2 GPUs on
> >>>>>>>>>> node 0 and skipped node 1. Part of the output looks like:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> ************************
> >>>>>>>>>>
> >>>>>>>>>> Using 4 MPI processes
> >>>>>>>>>>
> >>>>>>>>>> Using 6 OpenMP threads per MPI process
> >>>>>>>>>>
> >>>>>>>>>> 2 GPUs detected on host n0316.ten:
> >>>>>>>>>>
> >>>>>>>>>> #0: NVIDIA Tesla M2070, compute cap.: 2.0, ECC: yes, stat:
> >>> compatible
> >>>>>>>>>>
> >>>>>>>>>> #1: NVIDIA Tesla M2070, compute cap.: 2.0, ECC: yes, stat:
> >>> compatible
> >>>>>>>>>>
> >>>>>>>>>> 2 GPUs user-selected for this run.
> >>>>>>>>>>
> >>>>>>>>>> Mapping of GPUs to the 2 PP ranks in this node: #0, #1
> >>>>>>>>>>
> >>>>>>>>>> ****************************
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> The performance is only about 40% of that of a run on a
> >>>>>>>>>> single node (12 cores + 2 GPUs).
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Did I miss something?
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> thanks.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> dawei
> >>>>>>>>>> --
> >>>>>>>>>> Gromacs Users mailing list
> >>>>>>>>>>
> >>>>>>>>>> * Please search the archive at
> >>>>>>>>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List
> before
> >>>>>>>>> posting!
> >>>>>>>>>>
> >>>>>>>>>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >>>>>>>>>>
> >>>>>>>>>> * For (un)subscribe requests visit
> >>>>>>>>>>
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users
> >>>>> or
> >>>>>>>>> send a mail to gmx-users-request at gromacs.org.
> >>>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Dr. Carsten Kutzner
> >>>>>>> Max Planck Institute for Biophysical Chemistry
> >>>>>>> Theoretical and Computational Biophysics
> >>>>>>> Am Fassberg 11, 37077 Goettingen, Germany
> >>>>>>> Tel. +49-551-2012313, Fax: +49-551-2012302
> >>>>>>> http://www.mpibpc.mpg.de/grubmueller/kutzner
> >>>>>>> http://www.mpibpc.mpg.de/grubmueller/sppexa
> >>>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Oliver Schillinger
> >>>>>> PhD student
> >>>>>>
> >>>>>> ICS-6 - Structural Biochemistry
> >>>>>> Building 5.8v, Room 3010
> >>>>>> Phone:  +49 2461-61-9532
> >>>>>> Mobile: +49 172 53 27 914
> >>>>>>
> >>>>>> Forschungszentrum Juelich GmbH
> >>>>>> 52425 Juelich
> >>>>>> Sitz der Gesellschaft: Juelich
> >>>>>> Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
> >>>>>> Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher
> >>>>>> Geschaeftsfuehrung: Prof. Dr. Achim Bachem (Vorsitzender),
> >>>>>> Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,
> >>>>>> Prof. Dr. Sebastian M. Schmidt
> >>>>>
> >>>>>
> >>>>>
> >>>
> >>>
> >>>
> >
> >
>

