[gmx-users] Best performance with 0 cores for PME calculation
Nicolas
nsapay at ucalgary.ca
Sat Jan 10 20:42:04 CET 2009
Mark Abraham wrote:
> Nicolas wrote:
>> Hello,
>>
>> I'm trying to do a benchmark with Gromacs 4 on our cluster, but I
>> don't completely understand the results I obtain. The system I used
>> is a 128-lipid DOPC bilayer hydrated by ~18800 SPC waters, for a total of ~70200
>> atoms. The size of the system is 9.6x9.6x10.1 nm^3. I'm using the
>> following parameters:
>>
>> * nstlist = 10
>> * rlist = 1
>> * Coulombtype = PME
>> * rcoulomb = 1
>> * fourier spacing = 0.12
>> * vdwtype = Cutoff
>> * rvdw = 1
>>
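For completeness, the relevant part of the .mdp looks roughly like this (mdp keywords as I have them, indentation only for readability):

  nstlist         = 10
  rlist           = 1.0
  coulombtype     = PME
  rcoulomb        = 1.0
  fourierspacing  = 0.12
  vdwtype         = Cut-off
  rvdw            = 1.0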
>> The cluster itself has 2 processors per node connected by 100 Mbit/s Ethernet.
>
> Ethernet and Gigabit ethernet are not fast enough to get reasonable
> scaling. There've been quite a few posts on this topic in the last six
> months.
>
> Hmm I see you've corrected your post to refer to Infiniband with four
> cores/node. That should be reasonable, I understand (but search the
> archive).
>
> You should also check that your benchmark calculation is long enough
> that you are measuring a simulation time that isn't being dominated by
> setup costs. Some years ago a non-MD sysadmin complained of poor
> scaling when he was testing over 10 or so MD steps!
My computations last at least 10 min (20000 steps). I think that's enough. By
the way, could the message-passing interface significantly influence the
performance? I'm using MPICH-1.2. Should I consider using LAM or MPICH2?
Nicolas
>
>> I'm using mpiexec to run Gromacs. When I use -npme 2 -ddorder
>> interleave, I get:
>> ncores   Perf (ns/day)   PME load (%)
>>    1        0.00              0
>>    2        0.00              0
>>    3        0.00              0
>>    4        1.35             28
>>    5        1.84             31
>>    6        2.08             27
>>    8        2.09             21
>>   10        2.25             17
>>   12        2.02             15
>>   14        2.20             13
>>   16        2.04             11
>>   18        2.18             10
>>   20        2.29              9
>>
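For reference, each run is launched with something along these lines (the
binary and file names are just placeholders for whatever the installation
uses):

  mpiexec -n 8 mdrun_mpi -npme 2 -ddorder interleave -deffnm dopc_bench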
>> So, above 6-8 cores, the PP nodes spend too much time waiting for the
>> PME nodes and the performance reaches a plateau.
>
> That's not surprising - the heuristic is that about a third to a
> quarter of the cores want to be PME-only nodes. Of course, that
> depends on the relative size of the real- and reciprocal-space parts
> of the calculation.
>
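If I follow that heuristic, on 16 cores I should be dedicating roughly 4-5
cores to PME rather than 2, i.e. something like:

  mpiexec -n 16 mdrun_mpi -npme 4 -ddorder interleave -deffnm dopc_bench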
>> When I use -npme 0, I get:
>>
>> ncores   Perf (ns/day)   PME load (%)
>>    1        0.43             33
>>    2        0.92             34
>>    3        1.34             35
>>    4        1.69             36
>>    5        2.17             33
>>    6        2.56             32
>>    8        3.24             33
>>   10        3.84             34
>>   12        4.34             35
>>   14        5.05             32
>>   16        5.47             34
>>   18        5.54             37
>>   20        6.13             36
>>
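As a rough check on how well the shared-duty runs scale: taking parallel
efficiency as perf(N) / (N * perf(1)), the 20-core run gives
6.13 / (20 * 0.43) ≈ 0.71, i.e. roughly 70% efficiency relative to a single
core.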
>> I obtain much better performance when there are no PME nodes, whereas I
>> was expecting the opposite. Does someone have an explanation for
>> that? Does that mean domain decomposition is useless below a certain
>> real-space cutoff? I'm quite confused.
>
> The relevant observations are for 4,5,6 and 8, for which shared-duty
> is out-performing -npme 2. I think your observations support the
> conclusion that your network hardware is more limiting for PME-only
> nodes than shared-duty nodes. They don't support the conclusion that
> DD is useless, since you haven't compared with PD.
>
> You can play with the PME parameters to shift more load into the
> real-space part - IIRC Carsten suggested a heuristic a few months back.
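If I understand the idea correctly, that means scaling rcoulomb (and rlist)
and the PME grid spacing by the same factor, which should keep the overall
Ewald accuracy roughly constant while moving work from the PME nodes to the
PP nodes. A sketch of what I could try (I don't know the exact heuristic
Carsten suggested, so the factor 1.2 is purely illustrative):

  rlist           = 1.2
  rcoulomb        = 1.2
  fourierspacing  = 0.144   ; 0.12 * 1.2, scaled together with rcoulomb

Whether rvdw then also needs adjusting depends on the cut-off setup, so I
would check the grompp output.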
>
> Mark