[gmx-users] Gromacs 2018 and GPU PME

Szilárd Páll pall.szilard at gmail.com
Fri Feb 9 16:25:52 CET 2018


Hi,

First of all, have you read the docs (admittedly somewhat brief)?
http://manual.gromacs.org/documentation/2018/user-guide/mdrun-performance.html#types-of-gpu-tasks

The current PME GPU offload was optimized for single-GPU runs. Using multiple GPUs
with PME offloaded works, but this mode hasn't been an optimization target
and it will often not give very good performance. Using multiple GPUs
requires a separate PME rank (as you have realized), only one such rank can be
used (as we don't support PME decomposition on GPUs), and it comes with some
inherent scaling drawbacks. For this reason, unless you _need_ your single run
to be as fast as possible, you'll be better off running multiple simulations
side by side.
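
For the side-by-side case, one way to do it is to launch two independent
single-GPU runs, one per GPU, each pinned to its own half of the CPU. A rough
sketch (the md1/md2 names and the pinning offsets are only illustrative and
depend on how your hardware threads are numbered):

# run 1 on GPU 0, pinned to the first 12 hardware threads
gmx mdrun -deffnm md1 -nb gpu -pme gpu -ntmpi 1 -ntomp 12 -gpu_id 0 -pin on -pinoffset 0 -pinstride 1 &
# run 2 on GPU 1, pinned to the next 12 hardware threads
gmx mdrun -deffnm md2 -nb gpu -pme gpu -ntmpi 1 -ntomp 12 -gpu_id 1 -pin on -pinoffset 12 -pinstride 1 &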

A few tips for tuning the performance of a multi-GPU run with PME offload:
* expect to get at best ~1.5x scaling to 2 GPUs (rarely to 3, if the tasks allow)
* generally it's best to use about the same decomposition that you'd use
with nonbonded-only offload, e.g. in your case 6-8 ranks
* map the PME GPU task to a GPU either alone or sharing with at most 1 PP rank,
i.e. use the new -gputasks option
e.g. for your case I'd expect the following to work ~best:
gmx mdrun -v -deffnm md -pme gpu -nb gpu -ntmpi 8 -ntomp 6 -npme 1 -gputasks 00000001
or
gmx mdrun -v -deffnm md -pme gpu -nb gpu -ntmpi 8 -ntomp 6 -npme 1 -gputasks 00000011
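
(Roughly how the -gputasks string maps: one digit per GPU task, in rank order,
with the separate PME task last. So 00000001 keeps the seven PP tasks on GPU 0
and gives the PME task GPU 1 to itself, while 00000011 additionally moves one
PP task over to GPU 1.)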


Let me know if that gives some improvement.

Cheers,

--
Szilárd

On Fri, Feb 9, 2018 at 8:51 AM, Gmx QA <gmxquestions at gmail.com> wrote:

> Hi list,
>
> I am trying out the new GROMACS 2018 (really nice so far), but have a few
> questions about what command-line options I should specify, specifically
> with the new GPU PME implementation.
>
> My computer has two CPUs (12 cores each, 24 threads each with hyperthreading)
> and two GPUs, and I currently (with 2018) start simulations like this:
>
> $ gmx mdrun -v -deffnm md -pme gpu -nb gpu -ntmpi 2 -npme 1 -ntomp 24
> -gpu_id 01
>
> This works, but GROMACS prints a message that 24 OpenMP threads per MPI rank
> is likely inefficient. However, when trying to reduce the number of OpenMP
> threads I see a reduction in performance. Is this message no longer relevant
> with GPU PME, or am I overlooking something?
>
> Thanks
> /PK

