[gmx-users] Gromacs 2018 and GPU PME

Fri Feb 9 18:04:38 CET 2018

Just to quickly jump in: Mark suggested taking a look at the latest doc, 
and unfortunately I must admit that I didn't understand what I read. 
I appear to be especially struggling with the idea of gputasks.

Can you please explain what is happening in this line?

> -pme gpu -nb gpu -ntmpi 8 -ntomp 6 -npme 1 -gputasks 00000001

I am seriously confused here. Also, the number of ranks is 8, while the number of threads is 6? Is -ntomp now specifying the _per-rank_ number of threads, i.e. the actual number of threads for this job would be 48?

Thank you,

Alex

On 2/9/2018 8:25 AM, Szilárd Páll wrote:
> Hi,
>
> First of all, have you read the docs (admittedly somewhat brief):
> http://manual.gromacs.org/documentation/2018/user-guide/mdrun-performance.html#types-of-gpu-tasks
>
> The current PME GPU implementation was optimized for single-GPU runs. Using
> multiple GPUs with PME offloaded works, but this mode hasn't been an
> optimization target and it will often not give very good performance. Using
> multiple GPUs requires a separate PME rank (as you have realized), only one
> such rank can be used (as we don't support PME decomposition on GPUs), and it
> comes with some inherent scaling drawbacks. For this reason, unless you
> _need_ your single run to be as fast as possible, you'll be better off
> running multiple simulations side by side.
>
> A few tips for tuning the performance of a multi-GPU run with PME offload:
> * expect to get at best 1.5x scaling to 2 GPUs (rarely 3 if the tasks allow)
> * generally it's best to use about the same decomposition that you'd use
> with nonbonded-only offload, e.g. in your case 6-8 ranks
> * map the PME GPU task alone or at most together with 1 PP rank to a GPU,
> i.e. use the new -gputasks option;
> e.g. for your case I'd expect the following to work ~best:
> gmx mdrun -v -deffnm md -pme gpu -nb gpu -ntmpi 8 -ntomp 6 -npme 1
> -gputasks 00000001
> or
> gmx mdrun -v -deffnm md -pme gpu -nb gpu -ntmpi 8 -ntomp 6 -npme 1
> -gputasks 00000011
>
>
> Let me know if that gave some improvement.
>
> Cheers,
>
> --
> Szilárd
>
> On Fri, Feb 9, 2018 at 8:51 AM, Gmx QA <gmxquestions at gmail.com> wrote:
>
>> Hi list,
>>
>> I am trying out the new gromacs 2018 (really nice so far), but have a few
>> questions about what command line options I should specify, specifically
>> with the new gpu pme implementation.
>>
>> My computer has two CPUs (with 12 cores each, 24 with hyper threading) and
>> two GPUs, and I currently (with 2018) start simulations like this:
>>
>> $ gmx mdrun -v -deffnm md -pme gpu -nb gpu -ntmpi 2 -npme 1 -ntomp 24
>> -gpu_id 01
>>
>> this works, but gromacs prints the message that 24 omp threads per mpi rank
>> is likely inefficient. However, trying to reduce the number of omp threads
>> I see a reduction in performance. Is this message no longer relevant with
>> gpu pme or am I overlooking something?
>>
>> Thanks
>> /PK
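
A note on the -gputasks strings quoted in this thread. The following is a minimal Python sketch, not GROMACS code, of how such a string can be read. It assumes that -nb gpu and -pme gpu give exactly one GPU task per rank, that each character of the string is the GPU ID for one task in rank order, and that the single separate PME rank comes last (as in the 2018 user guide examples). The function names map_gpu_tasks, tasks_per_gpu and total_threads are made up for illustration. It also works out the thread count asked about above: -ntomp is a per-rank value, so the job runs ntmpi * ntomp threads in total.

```python
from collections import Counter


def map_gpu_tasks(gputasks, ntmpi, npme):
    """Return one (rank, task kind, GPU id) tuple per GPU task.

    Assumptions (not taken from GROMACS source): one GPU task per rank,
    one digit per task, and the separate PME rank(s) placed last.
    """
    if len(gputasks) != ntmpi:
        raise ValueError("expected one GPU id per GPU task")
    n_pp = ntmpi - npme  # ranks doing particle-particle (nonbonded) work
    mapping = []
    for rank, digit in enumerate(gputasks):
        kind = "PP" if rank < n_pp else "PME"
        mapping.append((rank, kind, int(digit)))
    return mapping


def tasks_per_gpu(mapping):
    """Count how many tasks of each kind land on each GPU."""
    counts = Counter()
    for _rank, kind, gpu in mapping:
        counts[(gpu, kind)] += 1
    return dict(counts)


def total_threads(ntmpi, ntomp):
    """-ntomp is per rank, so the total is ranks times threads per rank."""
    return ntmpi * ntomp


if __name__ == "__main__":
    for tasks in ("00000001", "00000011"):
        print(tasks, tasks_per_gpu(map_gpu_tasks(tasks, ntmpi=8, npme=1)))
    print("total threads:", total_threads(ntmpi=8, ntomp=6))
```

Under these assumptions, 00000001 puts the seven PP tasks on GPU 0 and the PME task alone on GPU 1, while 00000011 moves one PP task over to share GPU 1 with PME. With -ntmpi 8 -ntomp 6 the run uses 48 threads, which matches the 48 hardware threads of the machine described in the original post (two CPUs, 24 hardware threads each with hyper threading).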