[gmx-users] Gromacs 2018b1: Computing PME on GPUs

Szilárd Páll pall.szilard at gmail.com
Tue Dec 12 01:17:57 CET 2017

If you had a previously balanced CPU-GPU setup, expect at most ~25%
improvement, and that only with a decently fast (Maxwell or newer) GPU and
at least a mid-sized simulation system. GPUs are not great at doing small
PME grid work and the GROMACS PME code is really _fast_, so with small
inputs (<20-40k atoms, depends on the GPU too!) and many/fast CPU cores you
will often get better performance with nonbonded-only offload (or with only
the FFTs kept on the CPU, see the docs for the mixed mode).
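
To make the three offload choices above concrete, here is a minimal Python
sketch that assembles the corresponding mdrun command lines. It assumes the
-nb, -pme and -pmefft options as I understand them from the 2018 mdrun help;
the helper name, the mode labels and the "topol" file prefix are made up for
illustration, so check "gmx mdrun -h" on your build before relying on it.

```python
# Hypothetical helper: maps an offload choice to gmx mdrun task flags.
# The mode names are labels invented for this sketch, not GROMACS terms.
OFFLOAD_FLAGS = {
    # nonbondeds on the GPU, all of PME on the CPU
    "nonbonded_only": ["-nb", "gpu", "-pme", "cpu"],
    # nonbondeds and PME both on the GPU
    "pme_gpu": ["-nb", "gpu", "-pme", "gpu"],
    # mixed mode: PME on the GPU except the FFTs, which stay on the CPU
    "mixed": ["-nb", "gpu", "-pme", "gpu", "-pmefft", "cpu"],
}


def mdrun_command(mode, deffnm="topol"):
    """Return the argument list for one mdrun invocation in the given mode."""
    if mode not in OFFLOAD_FLAGS:
        raise ValueError("unknown offload mode: %r" % (mode,))
    return ["gmx", "mdrun", "-deffnm", deffnm] + OFFLOAD_FLAGS[mode]


if __name__ == "__main__":
    for mode in OFFLOAD_FLAGS:
        print(" ".join(mdrun_command(mode)))
```

Running each of the three variants on a short benchmark of your own system
is the simplest way to see which side of the trade-off you are on.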

Where the new PME offload shows a large benefit is in cases where you'd
otherwise be strongly CPU-bound when offloading only the nonbondeds. In
these cases, you can get up to a 1.5-3x improvement. The new offload mode
aims to deliver, in _most_ cases, >80-90% of the performance you previously
got with a decently fast single-socket CPU + GPU while using only 2-4 cores
per GPU (i.e. allowing you to run in throughput mode with >=2-3 GPUs per
socket).
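
As a rough illustration of the throughput mode mentioned above, the sketch
below splits the cores of one socket evenly among its GPUs and builds one
independent single-rank mdrun command per GPU. The function name and the
"run" file prefix are hypothetical. It assumes no hyperthreading (so that
pin offsets equal core indices), GPU ids numbered from 0, and the -ntmpi,
-ntomp, -gpu_id, -pin, -pinoffset and -pinstride options of mdrun; treat it
as a starting point, not a recommended layout.

```python
def throughput_layout(cores_per_socket, gpus_per_socket, deffnm_prefix="run"):
    """Build one pinned, single-rank mdrun command per GPU on one socket.

    Assumes one hardware thread per core, so -pinoffset counts cores.
    """
    cores_per_gpu = cores_per_socket // gpus_per_socket
    if cores_per_gpu < 1:
        raise ValueError("fewer cores than GPUs on this socket")
    commands = []
    for gpu in range(gpus_per_socket):
        commands.append([
            "gmx", "mdrun", "-deffnm", "%s%d" % (deffnm_prefix, gpu),
            "-ntmpi", "1", "-ntomp", str(cores_per_gpu),
            "-nb", "gpu", "-pme", "gpu",       # both tasks on the same GPU
            "-gpu_id", str(gpu),
            "-pin", "on",
            "-pinoffset", str(gpu * cores_per_gpu),  # first core of this run
            "-pinstride", "1",
        ])
    return commands


if __name__ == "__main__":
    # Example: a 10-core socket shared by 3 GPUs gives 3 cores per run.
    for cmd in throughput_layout(10, 3):
        print(" ".join(cmd))
```

Any cores left over after the integer division stay idle in this sketch;
whether to hand them to one of the runs is something to benchmark.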

More concretely, I'll throw in a few examples below (off the top of my
head, so not written in stone! The official final release will come with
some performance recommendations, so please check those out):

- Given 2x Xeon 2640v4 with, say, 2x 1080 Ti, you can now plug in two more
GPUs and about double the aggregate ns/day per node.
- Given a (slightly older) 6-core Haswell i7 5930K with a GTX 980 Ti,
depending on your sim system you can plug in 1-2 more cards (say two more
1070s) and get ~2.5-3x the aggregate ns/day.
- If you have a 4-core desktop machine, say with an i7 7600K + 1080, you can
plug in a second GPU.

- However, if you had a server with, say, 2x 2630v3 + 2x Tesla K80, even
though that's only ~4 low-end Haswell cores per GPU, Kepler and earlier
cards aren't too fast, so you're better off running PME on the CPUs.
- The same goes for small simulation systems on balanced machines with a
decent number of CPU cores: keeping PME on the CPU will occasionally be
beneficial.

Remember to always read the docs, understand how to use the new features,
and test performance on your system!
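
For the testing part, a small Python sketch like the one below can pull the
ns/day figure out of the log from each benchmark run and sum it over
concurrent runs to get the aggregate throughput. It assumes the log ends
with the usual "Performance:" line whose first number is ns/day; the
function names are made up, and the numbers in the sample text are
placeholders, not measurements.

```python
import re

# Matches the final summary line of an mdrun log, e.g.
#   Performance:       50.000        0.480
# where the first number is ns/day and the second is hour/ns.
PERF_RE = re.compile(r"^Performance:\s+([0-9.]+)\s+([0-9.]+)", re.MULTILINE)


def ns_per_day(log_text):
    """Return ns/day from the text of one mdrun log, or None if absent."""
    match = PERF_RE.search(log_text)
    return float(match.group(1)) if match else None


def aggregate_ns_per_day(log_texts):
    """Sum ns/day over several runs that shared a node at the same time."""
    values = [ns_per_day(text) for text in log_texts]
    if any(value is None for value in values):
        raise ValueError("at least one log has no Performance line")
    return sum(values)


if __name__ == "__main__":
    # Placeholder log tail with invented numbers, for illustration only.
    sample = (
        "                 (ns/day)    (hour/ns)\n"
        "Performance:       50.000        0.480\n"
    )
    print(ns_per_day(sample))
    print(aggregate_ns_per_day([sample, sample]))
```

Compare the aggregate figure for a PME-on-GPU layout against your previous
nonbonded-only setup on the same inputs before buying or moving hardware.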


On Mon, Dec 11, 2017 at 7:30 AM, Alex <nedomacho at gmail.com> wrote:

> Sorry, I am unable to respond to your question, but could you please
> comment on performance difference with and without PME on GPU?
> Thanks,
> Alex
> On 12/10/2017 10:28 PM, Jernej Zidar wrote:
>> Hi,
>> I’ve been testing the first beta of the upcoming Gromacs 2018, where one
>> of the main features is the ability to compute PME on GPUs. I tried it
>> and it runs well.
>> My question is: Would it make sense to get a bitcoin mining rig (4-5
>> Nvidia cards, slow CPU) to run simulations in place of the typical dual
>> Xeon workstation with CUDA cards?
>> Thanks!
>> Jernej