[gmx-users] Question about GPU acceleration in GROMACS 5

Mark Abraham mark.j.abraham at gmail.com
Fri Dec 12 17:13:49 CET 2014


On Fri, Dec 12, 2014 at 3:47 PM, Tomy van Batis <tomyvanbatis at gmail.com>
wrote:
>
> Hi Mark
>
> Thanks for your detailed response.
>
> I still don't see why the GPU load is only around 50%, nor why this
> number increases with the number of CPU cores.
>
> For example, when using 1 CPU core (-ntomp 1 in mdrun), the GPU load is
> only about 25-30%, while with 4 CPU cores the GPU load is 55%.
>

Your system runs like this:

1. Compute forces: short-ranged non-bonded on the GPU (~50% of the step
time) and angles on the CPU (~5% of the time, then ~45% idle)
2. Do constraints, updates, neighbour search and housekeeping on the CPU
(~50% of the time, including data-transfer costs) with the GPU idle (~50%)
3. Repeat

so adding more CPU cores makes step 2 take less time. You can see this by
diffing the cycle-accounting tables at the ends of the log files. A PME
simulation looks rather different.
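As a sketch of that comparison (the -deffnm names are hypothetical
placeholders, and the awk pattern assumes the spaced-out banner GROMACS
prints above its cycle accounting table):

```shell
# Two otherwise-identical runs, varying only the OpenMP thread count
# (hypothetical file names):
#   gmx mdrun -deffnm run_1t -ntomp 1 -nb gpu
#   gmx mdrun -deffnm run_4t -ntomp 4 -nb gpu

# Each .log ends with a cycle/time accounting table; extract it from
# the banner line onwards and diff the two:
for f in run_1t.log run_4t.log; do
  awk '/C Y C L E/,0' "$f" > "$f.timing"
done
diff run_1t.log.timing run_4t.log.timing
```

If step 2 above is the bottleneck, you would expect the CPU-side rows
(update, constraints, neighbour search) to shrink with more threads while
the GPU-side force time stays roughly constant.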


> Considering that the work done on the GPU takes a lot longer than the
> work done on the CPU, I believe the GPU load should not change with the
> number of OpenMP threads. Is this correct, or am I missing something here?
>

True for step 1, but not for step 2.


> Additionally, I don't really see why the GPU is not loaded to 100%.
> Is this because of the system size?
>

As Carsten said, we optimize for throughput, not utilization. On a single
node, you could do everything on the GPU (as e.g. AMBER 14 does), and then
utilization would approach peak (and throughput would go up too, if someone
wrote a big pile of code to make it happen). But that implementation would
struggle to scale to more nodes with current hardware technology, and is
tough to make work well with multiple GPUs per node (there is some work in
progress, but it is focused on a single GPU per node).
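For what it's worth, a rough way to quantify utilization over a whole run,
rather than eyeballing nvidia-smi (the query flags below are standard
nvidia-smi options; the sampling-and-averaging loop is just a sketch):

```shell
# Sample GPU utilization once per second in the background:
nvidia-smi --query-gpu=utilization.gpu \
           --format=csv,noheader,nounits -l 1 > util.csv &
SMI_PID=$!

# ... launch mdrun here ...

kill "$SMI_PID"

# Average the samples:
awk '{ s += $1; n++ } END { if (n) printf "avg GPU util: %.1f%%\n", s/n }' util.csv
```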

Mark


> Tommy
>
>
>
> > Hi,
> >
> > Only the short-ranged non-bonded work is offloaded to the GPU, but that's
> > almost all the force-based work you are doing. So it is entirely
> > unsurprising that the work done on the GPU takes a lot longer than it does
> > on the CPU. That warning is aimed at the more typical PME-based simulation
> > where the long-ranged part is done on the CPU, and now there is load to
> > balance. Running constraints+update happens only on the CPU, which is
> > always a bottleneck, and worse in your case.
> >
> > Ideally, we'd share some load that your simulations are doing solely on
> > the GPU with the CPU, and/or do the update on the GPU, but none of the
> > infrastructure is there for that.
> >
> > Mark
>
>
> On Fri, Dec 12, 2014 at 2:00 PM, Tomy van Batis <tomyvanbatis at gmail.com>
> wrote:
> >
> > Dear all
> >
> > I am working with a system of about 200,000 particles. All the non-bonded
> > interactions in the system are Lennard-Jones type (no Coulomb). I
> > constrain the bond lengths with LINCS. No torsion or bending interactions
> > are taken into account.
> >
> >
> > I am running the simulations on a 4-core Xeon® E5-1620 vs @ 3.70GHz
> > together with an NVIDIA Tesla K20Xm. I observe a strange behavior when
> > looking at the performance of the simulations:
> >
> >
> > 1. Running on 4 cores + GPU:
> >
> > GPU/CPU force evaluation time = 9.5 and GPU usage = 58% (measured with
> > the nvidia-smi command)
> >
> >
> > [image: Inline image 1]
> >
> >
> >
> > 2. Running on 2 cores + GPU:
> >
> > GPU/CPU force evaluation time = 9.9 and GPU usage = 45-50% (image not
> > included due to size restrictions)
> >
> >
> >
> > The situation doesn't change if I include the option -nb gpu (or -nb
> > gpu_cpu) in the mdrun command.
> >
> >
> > I can see in the mailing list that the GPU/CPU force evaluation time
> > should be about 1, which means that I am far from optimal performance.
> >
> >
> > Does anybody have any suggestions about how to improve the computational
> > speed?
> >
> >
> > Thanks in advance,
> >
> > Tommy
> >
> >
> >
> >
>
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-request at gromacs.org.
>
>

