[gmx-users] 2018-beta1: PME/GPU performance question

Jochen Hub jhub at gwdg.de
Fri Dec 1 10:30:07 CET 2017


Hi Szilárd,

thank you for the quick reply.

Yes, but Urey-Bradley accounts for only 0.2% of the M-Flops. 99.2% comes
from "NxN Ewald Elec. + LJ [F]" or "NxN Ewald Elec. + LJ [V&F]".

Update: I tested TIP3P vs. CHARMM-modified TIP3P; that is not the problem.

But the cutoff has a big influence on this effect. It goes so far that,
with 4 CPU cores, one gets better performance with a 1.4 nm cutoff than
with a 1.0 nm cutoff (!). See the new runs below, now with Slipids, which
also use UB.

# 128 Slipids, 1.0 nm cutoff (poor at small nt)
#CPU  ns/day
   4   58.70  <- !
   6   99.79
   8  123.81
  10  142.46
  12  148.26

# 128 Slipids, 1.4 nm cutoff (seems ok)
#CPU  ns/day
   4   78.12  <- !
   6  106.48
   8  127.24
  10  130.26
  12  134.25
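
To make the crossover explicit, the ratio of the two columns can be computed per core count. A minimal Python sketch, using only the numbers from the two 128-Slipids tables above; the variable and function names are chosen here purely for illustration:

```python
# ns/day from the two 128-Slipids tables, keyed by number of CPU cores.
cutoff_1p0 = {4: 58.70, 6: 99.79, 8: 123.81, 10: 142.46, 12: 148.26}
cutoff_1p4 = {4: 78.12, 6: 106.48, 8: 127.24, 10: 130.26, 12: 134.25}


def ratio_by_cores(a, b):
    """Return {cores: b/a} for the core counts present in both tables."""
    return {n: b[n] / a[n] for n in sorted(a) if n in b}


ratios = ratio_by_cores(cutoff_1p0, cutoff_1p4)
for n, r in ratios.items():
    # A ratio above 1 means the 1.4 nm run is faster at that core count.
    print(f"{n:2d} cores: 1.4 nm / 1.0 nm = {r:.2f}")
```

With these numbers the 1.4 nm cutoff is faster up to 8 cores (about 1.33x at 4 cores), and the 1.0 nm cutoff only wins at 10 and 12 cores.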


Something similar happens with a 4x larger system, though it is less extreme.

# 512 Slipids, 1.0 nm cutoff (poor at small nt)
#CPU  ns/day
   4   21.10
   6   30.67
   8   40.06
  10   48.01
  12   51.66

# 512 Slipids, 1.4 nm cutoff (seems ok)
#CPU  ns/day
   4   20.98
   6   29.98
   8   32.99
  10   34.68
  12   36.03
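
The same data can be read as scaling relative to the 4-core run. A small Python sketch, again using only the numbers from the two 512-Slipids tables above; the names `runs` and `speedup_vs_baseline` are illustrative, not part of any GROMACS tool:

```python
# ns/day from the two 512-Slipids tables, keyed by number of CPU cores.
runs = {
    "1.0 nm": {4: 21.10, 6: 30.67, 8: 40.06, 10: 48.01, 12: 51.66},
    "1.4 nm": {4: 20.98, 6: 29.98, 8: 32.99, 10: 34.68, 12: 36.03},
}


def speedup_vs_baseline(perf, baseline=4):
    """Return {cores: ns/day divided by ns/day at the baseline core count}."""
    return {n: perf[n] / perf[baseline] for n in sorted(perf)}


speedups = {label: speedup_vs_baseline(perf) for label, perf in runs.items()}
for label, s in speedups.items():
    line = ", ".join(f"{n}: {x:.2f}x" for n, x in s.items())
    print(f"{label} cutoff, speedup vs. 4 cores -> {line}")
```

Going from 4 to 12 cores gives roughly 2.4x with the 1.0 nm cutoff but only about 1.7x with the 1.4 nm cutoff, so the short-cutoff run depends much more strongly on the CPU core count.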

Do you still think this is due to bonded work?

Thank you,
Jochen


On 01.12.17 at 02:26, Szilárd Páll wrote:
> Hi Jochen,
> 
> Short answer: (most likely) it is due to the large difference in the
> amount of bonded work (relative to the total step time). Does CHARMM36
> use UB?
> 
> Cheers,
> --
> Szilárd
> 
> 
> On Thu, Nov 30, 2017 at 5:33 PM, Jochen Hub <jhub at gwdg.de> wrote:
>> Dear all,
>>
>> I have a question about the performance of the new PME-on-GPU code
>> (2018-beta1) on a Xeon 12-core / GTX 1080 node (CUDA 8, gcc 4.85).
>>
>> With an 84 kAtom system, I find that the simulations no longer benefit
>> from a strong CPU: using 6 Xeon cores with a GTX 1080 is sufficient.
>>
>> #CPU  ns/day
>>    2   92.88
>>    4  113.18
>>    6  123.36
>>    8  122.62
>>   10  125.76
>>   12  128.84
>>
>> (This is nice, as we can buy cheap CPUs).
>>
>> (with pinning, pinstride 1, one GPU, -ntmpi 1)
>>
>> On a small system (Charmm36 lipid patch, 30 kAtoms), in contrast, the
>> simulations strongly benefit from more CPU cores.
>>
>> #CPU  ns/day
>>    4   84.11
>>    6  119.24
>>    8  150.84
>>   10  159.63
>>   12  171.30
>>
>> Is this the expected behaviour? Do you know why?
>>
>> Thank you for any hints,
>> Jochen
>>
>> --
>> ---------------------------------------------------
>> Dr. Jochen Hub
>> Computational Molecular Biophysics Group
>> Institute for Microbiology and Genetics
>> Georg-August-University of Göttingen
>> Justus-von-Liebig-Weg 11, 37077 Göttingen, Germany.
>> Phone: +49-551-39-14189
>> http://cmb.bio.uni-goettingen.de/
>> ---------------------------------------------------


