[gmx-users] results produced by auto-tuning of Coulomb cut-off/grid for PME can not be reproduced by manually setting the Coulomb cut-off and grid spacing

Jiaqi Lin jqlin at mit.edu
Tue Jan 20 23:48:20 CET 2015


Hi Szilard,

- I've tried 5.0.1 and it gives the same result. So 4.6.7 or 5.0.4 would 
be better, but in what way?

- I've tried the Verlet scheme and it gives only a small change in the 
cutoff and grid. But what I am really interested in is manually 
reproducing the result that tune_pme gave me in the first case with the 
Group scheme.

- I've also tried lincs-order=4 and it makes no difference.

- My Fourier spacing is 25% finer, but that shouldn't affect the results, 
right? If it does affect the results, then I want to find out how.

- I happen to use the PP/PME rank split (100+28) and it gives me 
interesting results (the performance is actually not bad). So I am very 
interested in how these cutoff and grid settings affect my simulation 
results, and I tried to control the parameters manually (turning off the 
PME tuning with -notunepme); see the sketch after this list for what I 
mean. But no matter what I tried, I could not reproduce the result given 
by tune_pme. So my biggest question is: how is tune_pme implemented in 
the code? Which parameters does it actually tune?

- When the PME tuning increases the cutoff to such a large value, the 
speed does not go down noticeably. So what I suspect is that tune_pme 
does the direct-space calculation with the larger cutoff without 
changing the neighbor-list search distance.
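
To be concrete, here is roughly what I mean by setting the tuned values 
by hand (an illustrative sketch, not my exact input files; the cut-off 
and grid numbers are taken from the tuning log quoted below):

    ; .mdp fragment, Group scheme, values from the tuned run
    cutoff-scheme  = Group
    coulombtype    = PME
    rcoulomb       = 3.253    ; the cut-off mdrun tuned to
    rlist          = 3.253    ; I also tried leaving this at 1.4
    rlistlong      = 3.253
    rvdw           = 1.2
    ; fix the grid explicitly instead of using fourierspacing
    fourier-nx     = 112
    fourier-ny     = 112
    fourier-nz     = 160

and then running with the tuning disabled, e.g.

    mdrun -notunepme -npme 28 ...

If the tuning really only changes rcoulomb and the grid, I would expect 
this to reproduce its result, but it does not.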

Thank you

Best
Jiaqi

On 1/20/2015 3:54 PM, Szilárd Páll wrote:
> Not (all) directly related, but a few comments/questions:
>
> - Have you tried 4.6.7 or 5.0.4?
> - Have you considered using the Verlet scheme instead of doing manual buffering?
> - lincs-order=8 is very large for 2fs production runs - typically 4 is used.
> - Your fourier spacing is a lot (~25%) finer than it needs to be.
>
> - The PP/PME rank split of 100+28 is _very_ inconvenient and it is the
> main cause of the horrible PME performance together with the overly
> coarse grid. That's why you get such a huge cut-off after the PP-PME
> load balancing. Even if you want to stick to these parameters, you
> should tune the rank split (manually or with tune_pme).
> - The above contributes to the high neighbor search cost too.
>
> --
> Szilárd
>
>
> On Tue, Jan 20, 2015 at 9:18 PM, Jiaqi Lin <jqlin at mit.edu> wrote:
>> Hi Mark,
>>
>> Thanks for the reply. I put the md.log files at the following link:
>>
>> https://www.dropbox.com/sh/d1d2fbwreizr974/AABYhSRU03nmijbTIXKKr-rra?dl=0
>>
>> There are four log files:
>>   1. GMX 4.6.5 -tunepme (the Coulomb cutoff is tuned to 3.253)
>>   2. GMX 4.6.5 -notunepme  rcoulomb = 3.3, fourierspacing = 0.33
>>   3. GMX 4.6.5 -notunepme  rcoulomb = 3.3, fourierspacing = 0.14
>>   4. GMX 4.6.5 -notunepme  rcoulomb = 1.4, fourierspacing = 0.14
>>
>> Note that the LR Coulombic energy in the first one is almost twice the
>> value of that in the second one, whereas the grid spacings in the two
>> cases are nearly the same.
>>
>> Only the first one gives a strong electrostatic interaction of a
>> nanoparticle with a lipid bilayer under ionic imbalance. In the other
>> cases I do not observe such a strong interaction.
>>
>> GMX 5.0.1 gives the same results as GMX 4.6.5 with the Group cutoff
>> scheme. Thanks.
>>
>>
>> Regards
>> Jiaqi
>>
>>
>>
>>
>> On 1/19/2015 3:22 PM, Mark Abraham wrote:
>>> On Thu, Jan 15, 2015 at 3:21 AM, Jiaqi Lin <jqlin at mit.edu> wrote:
>>>
>>>> Dear GMX developers,
>>>>
>>>> I've encountered a problem in GROMACS concerning the auto-tuning
>>>> feature of PME that has bugged me for months. As stated in the title,
>>>> the auto-tuning feature of mdrun changed my Coulomb cutoff from 1.4 nm
>>>> to ~3.3 nm (stated in md.log) when I set -npme to 28 (128 total CPU
>>>> cores), and this gives me interesting simulation results. When I use
>>>> -notunepme, I find that Coulomb (SR) and recip. give me the same energy
>>>> but the actual simulation result is different. This I can understand:
>>>> scaling the Coulomb cut-off and the grid spacing together theoretically
>>>> gives the same accuracy for the electrostatics (according to the GMX
>>>> manual and the PME papers), but there is actually some numerical error
>>>> due to grid mapping, and even if the energy is the same, that does not
>>>> mean the system configuration has to be the same (NVE ensemble:
>>>> constant energy, different configurations).
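>>>>
>>>> (As a rough check of that scaling, using the numbers from the tuning
>>>> log below: the cut-off grows from 1.400 to 3.253 nm, a factor of about
>>>> 2.3, while the grid shrinks from 280x280x384 to 112x112x160, i.e. the
>>>> grid spacing grows by a factor of about 2.4-2.5, so the cut-off and
>>>> the spacing are scaled by roughly the same factor, limited only by the
>>>> allowed grid sizes.)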
>>>>
>>> Total electrostatic energy should be approximately the same with different
>>> PME partitions.
>>>
>>>
>>>> However, the thing I don't understand is the following. I am
>>>> interested in the result under the large Coulomb cut-off, so I tried
>>>> to set the cut-off and grid spacing manually with -notunepme, using
>>>> the values tuned by mdrun previously. This gives me a completely
>>>> different simulation result, and the energy is also different. Setting
>>>> rlist, rlistlong, or both equal to rcoulomb (~3.3) still does not give
>>>> me the result produced by the PME auto-tuning.
>>>
>>> In what sense is the result different?
>>>
>>>
>>>> In addition, the simulation speed drops dramatically when I set
>>>> rcoulomb to ~3.3 (with -tunepme the speed remains nearly the same no
>>>> matter how large the cutoff is tuned to). I've tested this in both GMX
>>>> 4.6.5 and 5.0.1 and the same thing happens, so clearly it is not a
>>>> version issue. Thus the question is: what exactly happens in the PME
>>>> calculation with the auto-tuning feature of mdrun, and why do I get
>>>> different results when I manually set the Coulomb cutoff and grid
>>>> spacing to the values tuned by mdrun, with the auto-tuning disabled
>>>> (-notunepme)? Thank you for the help.
>>>>
>>> For the group scheme, these should all lead to essentially the same result
>>> and (if tuned) performance. If you can share your various log files on a
>>> file-sharing service (rc 1.4, rc 3.3, various -tunepme settings, 4.6.5 and
>>> 5.0.1) then we can be in a position to comment further.
>>>
>>> Mark
>>>
>>>
>>>> Additional info: I use the Group cutoff-scheme, and rvdw is 1.2.
>>>>
>>>>
>>>>    md.log file:
>>>> DD  step 9 load imb.: force 29.4%  pme mesh/force 3.627
>>>>
>>>> step   30: timed with pme grid 280 280 384, coulomb cutoff 1.400: 1026.4 M-cycles
>>>> step   50: timed with pme grid 256 256 324, coulomb cutoff 1.464: 850.3 M-cycles
>>>> step   70: timed with pme grid 224 224 300, coulomb cutoff 1.626: 603.6 M-cycles
>>>> step   90: timed with pme grid 200 200 280, coulomb cutoff 1.822: 555.2 M-cycles
>>>> step  110: timed with pme grid 160 160 208, coulomb cutoff 2.280: 397.0 M-cycles
>>>> step  130: timed with pme grid 144 144 192, coulomb cutoff 2.530: 376.0 M-cycles
>>>> step  150: timed with pme grid 128 128 160, coulomb cutoff 2.964: 343.7 M-cycles
>>>> step  170: timed with pme grid 112 112 144, coulomb cutoff 3.294: 334.8 M-cycles
>>>> Grid: 12 x 14 x 14 cells
>>>> step  190: timed with pme grid 84 84 108, coulomb cutoff 4.392: 346.2 M-cycles
>>>> step  190: the PME grid restriction limits the PME load balancing to a coulomb cut-off of 4.392
>>>> step  210: timed with pme grid 128 128 192, coulomb cutoff 2.846: 360.6 M-cycles
>>>> step  230: timed with pme grid 128 128 160, coulomb cutoff 2.964: 343.6 M-cycles
>>>> step  250: timed with pme grid 120 120 160, coulomb cutoff 3.036: 340.4 M-cycles
>>>> step  270: timed with pme grid 112 112 160, coulomb cutoff 3.253: 334.3 M-cycles
>>>> step  290: timed with pme grid 112 112 144, coulomb cutoff 3.294: 334.7 M-cycles
>>>> step  310: timed with pme grid 84 84 108, coulomb cutoff 4.392: 348.0 M-cycles
>>>>                 optimal pme grid 112 112 160, coulomb cutoff 3.253
>>>> DD  step 999 load imb.: force 18.4%  pme mesh/force 0.918
>>>>
>>>> At step 1000 the performance loss due to force load imbalance is 6.3 %
>>>>
>>>> NOTE: Turning on dynamic load balancing
>>>>
>>>>              Step           Time         Lambda
>>>>              1000       20.00000        0.00000
>>>>
>>>>    Energies (kJ/mol)
>>>>            Bond       G96Angle        LJ (SR)   Coulomb (SR)   Coul. recip.
>>>>     1.98359e+05    1.79181e+06   -1.08927e+07   -7.04736e+06   -2.32682e+05
>>>>  Position Rest.      Potential    Kinetic En.   Total Energy    Temperature
>>>>     6.20627e+04   -1.61205e+07    4.34624e+06   -1.17743e+07    3.00659e+02
>>>>  Pressure (bar)   Constr. rmsd
>>>>     2.13582e+00    1.74243e-04
>>>>
>>>>
>>>> Best
>>>> Jiaqi
>>>>
>>>>
>>>>
>>>> --
>>>> Jiaqi Lin
>>>> postdoc fellow
>>>> The Langer Lab
>>>>
>> --
>> Jiaqi Lin
>> postdoc fellow
>> The Langer Lab
>>

-- 
Jiaqi Lin
postdoc fellow
The Langer Lab
David H. Koch Institute for Integrative Cancer Research
Massachusetts Institute of Technology


