[gmx-users] simulation on 2 gpus

Stefano Guglielmo stefano.guglielmo at unito.it
Mon Aug 5 16:58:51 CEST 2019


Dear Paul,
thanks for the suggestions. Following them I managed to reach 91 ns/day for
the system I referred to in my previous post with the configuration:
gmx mdrun -deffnm run -nb gpu -pme gpu -ntomp 4 -ntmpi 7 -npme 1 -gputasks
0000111 -pin on (28 threads still seems to be the best choice)

and 56 ns/day for two independent runs:
gmx mdrun -deffnm run -nb gpu -pme gpu -ntomp 4 -ntmpi 7 -npme 1 -gputasks
0000000 -pin on -pinoffset 0 -pinstride 1
gmx mdrun -deffnm run2 -nb gpu -pme gpu -ntomp 4 -ntmpi 7 -npme 1 -gputasks
1111111 -pin on -pinoffset 28 -pinstride 1
which is a fairly good result.
I am still wondering whether I should pin the threads in some different way
in order to reflect the CPU topology, and whether this can influence
performance (if I remember correctly, NAMD allows the user to indicate
explicitly which CPU cores/threads to use in a computation).
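
For example, to check how the logical CPUs map onto physical cores (on
Linux; the numbering differs between machines, so this is only a sketch):

lscpu -e=CPU,CORE,SOCKET

If CPUs 0 and 1 turn out to be the two hardware threads of core 0, then
-pinstride 2 gives each mdrun thread a whole core; if instead CPUs 0-31 are
distinct cores with their SMT siblings at 32-63, -pinstride 1 already does
that and the second run would start at -pinoffset 32.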

When I tried to run two simulations with the following configuration:
gmx mdrun -deffnm run -nb gpu -pme gpu -ntomp 4 -ntmpi 8 -npme 1 -gputasks
00001111 -pin on -pinoffset 0 -pinstride 1
gmx mdrun -deffnm run2 -nb gpu -pme gpu -ntomp 4 -ntmpi 8 -npme 1 -gputasks
00001111 -pin on -pinoffset 0 -pinstride 32
the system crashed. Probably I am missing something quite obvious.
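
My guess is that the two runs overlapped: both used -pinoffset 0, so they
were pinned to the same cores, and a stride of 32 with 32 threads per run
points past the 64 hardware threads of the 2990WX. A non-overlapping
variant (assuming 64 logical CPUs) would be something like:

gmx mdrun -deffnm run -nb gpu -pme gpu -ntomp 4 -ntmpi 8 -npme 1 -gputasks
00001111 -pin on -pinoffset 0 -pinstride 1
gmx mdrun -deffnm run2 -nb gpu -pme gpu -ntomp 4 -ntmpi 8 -npme 1 -gputasks
00001111 -pin on -pinoffset 32 -pinstride 1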

Thanks again for the valuable advice
Stefano



On Sun, Aug 4, 2019 at 1:40 AM paul buscemi <pbuscemi at q.com> wrote:

> Stefano,
>
> A recent run with 140000 atoms, including 10000 isopropanol molecules on
> top of an end-restrained PDMS surface of 74000 atoms in a 20 x 20 x 30 nm
> box, ran at 67 ns/day NVT with the mdrun conditions I posted. It took 120
> ns for 100 molecules of an adsorbate to go from solution to the surface. I
> don't think this will set the world ablaze with any benchmarks, but it is
> acceptable for getting some work done.
>
> Linux Mint MATE 18, AMD Threadripper 2990WX (32 cores, 4.2 GHz), 32 GB
> DDR4, 2x RTX 2080 Ti, GROMACS 2019 in the simplest gmx configuration for
> GPUs, CUDA version 10, NVIDIA driver 410.7p loaded from the repository.
>
> Paul
>
> > On Aug 3, 2019, at 12:58 PM, paul buscemi <pbuscemi at q.com> wrote:
> >
> > Stefano,
> >
> > Here is a typical run
> >
> > for minimization: gmx mdrun -deffnm grofile -nb gpu
> >
> > and for other runs, on a 32-core machine:
> >
> > gmx mdrun -deffnm grofile.nvt -nb gpu -pme gpu -ntomp 8 -ntmpi 8 -npme 1
> > -gputasks 00001111 -pin on
> >
> > Depending on the molecular system/model, -ntomp 4 -ntmpi 16 may be
> > faster - of course adjusting -gputasks accordingly.
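> >
> > For instance (one possible mapping, with the first eight of the sixteen
> > ranks on GPU 0 and the rest on GPU 1):
> >
> > gmx mdrun -deffnm grofile.nvt -nb gpu -pme gpu -ntomp 4 -ntmpi 16 -npme 1
> > -gputasks 0000000011111111 -pin on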
> >
> > Rarely do I find that not setting -ntomp and -ntmpi is faster, but it is
> > never bad.
> >
> > Let me know how it goes.
> >
> > Paul
> >
> >> On Aug 3, 2019, at 4:41 AM, Stefano Guglielmo
> >> <stefano.guglielmo at unito.it> wrote:
> >>
> >> Hi Paul,
> >> thanks for the reply. Would you mind posting the command you used, or
> >> telling me how you balanced the work between CPU and GPU?
> >>
> >> What about pinning? Does anyone know how to deal with a CPU topology
> >> like the one reported in my previous post, and whether it is relevant
> >> for performance?
> >> Thanks
> >> Stefano
> >>
> >> On Saturday, August 3, 2019, Paul Buscemi <pbuscemi at q.com> wrote:
> >>
> >>> I run the same system and setup, but no NVLink. Maestro runs both
> >>> GPUs at 100 percent; GROMACS typically 50-60 percent, and can do 600
> >>> ns/day on 20000 atoms.
> >>>
> >>> PB
> >>>
> >>>> On Jul 25, 2019, at 9:30 PM, Kevin Boyd <kevin.boyd at uconn.edu> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> I've done a lot of research/experimentation on this, so I can maybe
> >>>> get you started - if anyone has any questions about the essay to
> >>>> follow, feel free to email me personally, and I'll link it to the
> >>>> email thread if it ends up being pertinent.
> >>>>
> >>>> First, there are some more internet resources to check out. See
> >>>> Mark's talk at
> >>>> https://bioexcel.eu/webinar-performance-tuning-and-optimization-of-gromacs/
> >>>> Gromacs development moves fast, but a lot of it is still relevant.
> >>>>
> >>>> I'll expand a bit here, with the caveat that Gromacs GPU development
> >>>> is moving very fast and so the correct commands for optimal
> >>>> performance are both system-dependent and a moving target between
> >>>> versions. This is a good thing - GPUs have revolutionized the field,
> >>>> and with each iteration we make better use of them. The downside is
> >>>> that it's unclear exactly what sort of CPU-GPU balance you should
> >>>> look to purchase to take advantage of future developments, though the
> >>>> trend is certainly that more and more computation is being offloaded
> >>>> to the GPUs.
> >>>>
> >>>> The most important consideration is that to get maximum total
> >>>> throughput performance, you should be running not one but multiple
> >>>> simulations simultaneously. You can do this through the -multidir
> >>>> option, but I don't recommend that in this case, as it requires
> >>>> compiling with MPI and limits some of your options. My run scripts
> >>>> usually use "gmx mdrun ... &" to initiate subprocesses, with
> >>>> combinations of -ntomp, -ntmpi, -pin, -pinoffset, and -gputasks. I
> >>>> can give specific examples if you're interested.
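> >>>>
> >>>> For example, something like this (a sketch only - the .tpr names are
> >>>> placeholders, and the thread counts and offsets assume a 16-core,
> >>>> 2-GPU box):
> >>>>
> >>>> gmx mdrun -deffnm sim1 -ntmpi 1 -ntomp 8 -nb gpu -pme gpu \
> >>>>     -gputasks 00 -pin on -pinoffset 0 -pinstride 1 &
> >>>> gmx mdrun -deffnm sim2 -ntmpi 1 -ntomp 8 -nb gpu -pme gpu \
> >>>>     -gputasks 11 -pin on -pinoffset 8 -pinstride 1 &
> >>>> wait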
> >>>>
> >>>> Another important point is that you can run more simulations than
> >>>> the number of GPUs you have. Depending on CPU-GPU balance and
> >>>> quality, you won't double your throughput by e.g. putting 4
> >>>> simulations on 2 GPUs, but you might increase it up to 1.5x. This
> >>>> would involve targeting the same GPU with -gputasks.
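> >>>>
> >>>> E.g., four runs on two GPUs could look like this (one rank per run,
> >>>> two runs sharing each GPU; offsets again assume a 16-core machine):
> >>>>
> >>>> gmx mdrun -deffnm sim1 -ntmpi 1 -ntomp 4 -nb gpu -pme gpu -gputasks 00 -pin on -pinoffset 0 &
> >>>> gmx mdrun -deffnm sim2 -ntmpi 1 -ntomp 4 -nb gpu -pme gpu -gputasks 00 -pin on -pinoffset 4 &
> >>>> gmx mdrun -deffnm sim3 -ntmpi 1 -ntomp 4 -nb gpu -pme gpu -gputasks 11 -pin on -pinoffset 8 &
> >>>> gmx mdrun -deffnm sim4 -ntmpi 1 -ntomp 4 -nb gpu -pme gpu -gputasks 11 -pin on -pinoffset 12 &
> >>>> wait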
> >>>>
> >>>> Within a simulation, you should set up a benchmarking script to
> >>>> figure out the best combination of thread-MPI ranks and OpenMP
> >>>> threads - this can have pretty drastic effects on performance. For
> >>>> example, if you want to use your entire machine for one simulation
> >>>> (not recommended for maximal
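> >>>>
> >>>> A minimal sketch of such a benchmarking loop (the .tpr name and step
> >>>> count are placeholders; -resethway resets mdrun's timers halfway so
> >>>> startup costs pollute the ns/day numbers less):
> >>>>
> >>>> for ntmpi in 2 4 8; do
> >>>>   for ntomp in 2 4 8; do
> >>>>     gmx mdrun -s topol.tpr -deffnm bench_${ntmpi}x${ntomp} \
> >>>>         -ntmpi $ntmpi -ntomp $ntomp -nb gpu -pme gpu -npme 1 \
> >>>>         -nsteps 20000 -resethway -pin on
> >>>>   done
> >>>> done
> >>>> grep Performance bench_*.log   # ns/day for each combination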
> >>
> >>
> >> --
> >> Stefano GUGLIELMO PhD
> >> Assistant Professor of Medicinal Chemistry
> >> Department of Drug Science and Technology
> >> Via P. Giuria 9
> >> 10125 Turin, ITALY
> >> ph. +39 (0)11 6707178


-- 
Stefano GUGLIELMO PhD
Assistant Professor of Medicinal Chemistry
Department of Drug Science and Technology
Via P. Giuria 9
10125 Turin, ITALY
ph. +39 (0)11 6707178

