[gmx-users] simulation on 2 gpus

Stefano Guglielmo stefano.guglielmo at unito.it
Sat Aug 3 11:41:44 CEST 2019


Hi Paul,
thanks for the reply. Would you mind posting the command you used, or
explaining how you balanced the work between CPU and GPU?

What about pinning? Does anyone know how to deal with a CPU topology like
the one reported in my previous post, and whether it is relevant for
performance?
Thanks
Stefano

On Saturday, 3 August 2019, Paul Buscemi <pbuscemi at q.com> wrote:

> I run the same system and setup but without NVLink. Maestro runs both GPUs
> at 100 percent. Gromacs typically runs them at 50-60 percent and can do
> 600 ns/day on 20000 atoms.
>
> PB
>
> > On Jul 25, 2019, at 9:30 PM, Kevin Boyd <kevin.boyd at uconn.edu> wrote:
> >
> > Hi,
> >
> > I've done a lot of research/experimentation on this, so I can maybe get you
> > started - if anyone has any questions about the essay to follow, feel free
> > to email me personally, and I'll link it to the email thread if it ends up
> > being pertinent.
> >
> > First, there are some more internet resources to check out. See Mark's
> > talk at
> > https://bioexcel.eu/webinar-performance-tuning-and-optimization-of-gromacs/
> > Gromacs development moves fast, but a lot of it is still relevant.
> >
> > I'll expand a bit here, with the caveat that Gromacs GPU development is
> > moving very fast and so the correct commands for optimal performance are
> > both system-dependent and a moving target between versions. This is a good
> > thing - GPUs have revolutionized the field, and with each iteration we make
> > better use of them. The downside is that it's unclear exactly what sort of
> > CPU-GPU balance you should look to purchase to take advantage of future
> > developments, though the trend is certainly that more and more computation
> > is being offloaded to the GPUs.
> >
> > The most important consideration is that to get maximum total throughput
> > performance, you should be running not one but multiple simulations
> > simultaneously. You can do this through the -multidir option, but I don't
> > recommend that in this case, as it requires compiling with MPI and limits
> > some of your options. My run scripts usually use "gmx mdrun ... &" to
> > initiate subprocesses, with combinations of -ntomp, -ntmpi, -pin,
> > -pinoffset, and -gputasks. I can give specific examples if you're
> > interested.
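
A minimal sketch of the launch pattern described above: one backgrounded
"gmx mdrun ... &" per simulation, each pinned to its own block of cores.
The run directory names, the "md" file prefix and the 32-core machine are
hypothetical, -deffnm and -pinstride are mdrun options not named in the
message, and the sketch assumes logical cores are numbered so that
consecutive offsets give each run a disjoint block, which is worth
checking against your own CPU topology. It only prints the command lines;
it does not run them.

```python
import shlex


def mdrun_commands(run_dirs, total_logical_cores):
    """Build one backgrounded 'gmx mdrun' line per run directory.

    Each run gets an equal share of the logical cores and is pinned to
    its own block through -pinoffset (assumption: consecutive numbering).
    """
    if not run_dirs:
        raise ValueError("need at least one run directory")
    ntomp = total_logical_cores // len(run_dirs)
    if ntomp < 1:
        raise ValueError("more runs than logical cores")

    commands = []
    for i, run_dir in enumerate(run_dirs):
        args = [
            "gmx", "mdrun",
            "-deffnm", "md",              # hypothetical file prefix (md.tpr etc.)
            "-ntmpi", "1",                # one thread-MPI rank per simulation
            "-ntomp", str(ntomp),         # OpenMP threads for this run
            "-pin", "on",
            "-pinoffset", str(i * ntomp), # first logical core of this run's block
            "-pinstride", "1",
        ]
        commands.append(f"(cd {shlex.quote(run_dir)} && {' '.join(args)}) &")
    return commands


if __name__ == "__main__":
    # Hypothetical layout: four runs sharing 32 logical cores.
    for line in mdrun_commands(["run1", "run2", "run3", "run4"], 32):
        print(line)
    print("wait")  # keep the shell script alive until all runs finish
```

The printed lines can be pasted into a shell script; the thread count and
offsets are only a starting point and should be benchmarked.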
> >
> > Another important point is that you can run more simulations than the
> > number of GPUs you have. Depending on CPU-GPU balance and quality, you
> > won't double your throughput by e.g. putting 4 simulations on 2 GPUs, but
> > you might increase it up to 1.5x. This would involve targeting the same GPU
> > with -gputasks.
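
A minimal sketch of how several simulations could be mapped onto fewer
GPUs with -gputasks, here 4 runs on 2 GPUs. It assumes each run uses a
single thread-MPI rank with two GPU tasks (nonbonded and PME), so each
-gputasks string has two digits; the round-robin assignment is an
illustrative choice, not something stated in the message. As far as I
know, mdrun expects -nb, -pme and -ntmpi to be set explicitly when
-gputasks is given, but check this against your GROMACS version.

```python
def gputasks_per_run(n_runs, n_gpus, tasks_per_run=2):
    """Return one -gputasks string per run, assigning GPUs round-robin.

    Run i places all of its GPU tasks on GPU (i % n_gpus).
    """
    if n_runs < 1 or n_gpus < 1:
        raise ValueError("need at least one run and one GPU")
    if n_gpus > 10:
        raise ValueError("this sketch only handles single-digit GPU ids")
    return [str(i % n_gpus) * tasks_per_run for i in range(n_runs)]


def gpu_args(n_runs, n_gpus):
    """Return the GPU-related mdrun arguments for each run."""
    return [
        ["-ntmpi", "1", "-nb", "gpu", "-pme", "gpu", "-gputasks", tasks]
        for tasks in gputasks_per_run(n_runs, n_gpus)
    ]


if __name__ == "__main__":
    # Four simulations sharing two GPUs: runs 0 and 2 target GPU 0,
    # runs 1 and 3 target GPU 1.
    for i, args in enumerate(gpu_args(4, 2)):
        print(f"run {i}: gmx mdrun {' '.join(args)}")
```

Whether sharing a GPU pays off depends on the CPU-GPU balance, so the
total ns/day of all runs should be compared against one run per GPU.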
> >
> > Within a simulation, you should set up a benchmarking script to figure out
> > the best combination of thread-MPI ranks and OpenMP threads - this can
> > have pretty drastic effects on performance. For example, if you want to use
> > your entire machine for one simulation (not recommended for maximal
>


-- 
Stefano GUGLIELMO PhD
Assistant Professor of Medicinal Chemistry
Department of Drug Science and Technology
Via P. Giuria 9
10125 Turin, ITALY
ph. +39 (0)11 6707178