[gmx-users] results produced by auto-tuning of Coulomb cut-off/grid for PME can not be reproduced by manually setting the Coulomb cut-off and grid spacing

Jiaqi Lin jqlin at mit.edu
Thu Jan 22 20:09:39 CET 2015


Hi Mark

Thank you for your patient reply.

I've tried using the exact same setup (rcoulomb = 3.258, fourierspacing 
= 0.325, rlist = 1.4, rlistlong = 3.258, rvdw = 1.2) that the auto-tuning 
would have changed the parameters to, and it still does not reproduce the 
result obtained with auto-tuning, only the result I obtained previously 
with similar parameters. Therefore, I believe either something is going on 
in the auto-tuning code that is not what it appears to be, or the 
auto-tuning changes the parameters in a special way that in theory 
shouldn't affect the simulation result but that my system happens to be 
very sensitive to. As you pointed out, the code is not straightforward, 
so much can happen at this stage. I'd very much like to figure out what is 
going on, but spending two months on the code is not something I'm willing 
to do. So I think I will keep changing parameters and try to understand 
the long-range electrostatics in my system. Also, the result produced with 
auto-tuning is unlikely to be an outlier: I kept changing -npme, and 
whenever the auto-tuning gives me an rcoulomb larger than 3, I can always 
observe the "interesting" result.
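
For reference, the manual (no-tuning) run I describe above uses non-bonded 
settings along these lines; the option names and values are the ones from my 
.mdp, but the output file name and the exact mdrun command line are only 
illustrative:

    ; manual run, group scheme, cut-offs fixed by hand (no auto-tuning)
    cutoff-scheme   = group
    coulombtype     = PME
    rcoulomb        = 3.258
    rlist           = 1.4
    rlistlong       = 3.258
    rvdw            = 1.2
    fourierspacing  = 0.325

    # PME tuning disabled, same PP/PME split as before
    mdrun -deffnm manual_rc3p258 -npme 28 -notunepme

The auto-tuned run, in contrast, keeps rcoulomb = rlist = 1.4 in the .mdp and 
is started with the default -tunepme and the same -npme 28.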

I also tried LJ-PME for the long-range vdW, as you suggested; it compresses 
the bilayer to a large extent. The whole force field would have to be 
refined if I used LJ-PME, but that's another story. I think electrostatics 
is the main driving force in my simulation system, and I'll keep trying to 
figure out what really happened.
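
For completeness, the LJ-PME test was set up roughly as follows (sketched 
here with the Verlet scheme; the cut-off values are placeholders rather than 
the exact numbers from my run, and are only meant to show which options 
change):

    ; LJ-PME test sketch (GROMACS 5.0, Verlet scheme)
    cutoff-scheme     = Verlet
    coulombtype       = PME
    vdwtype           = PME           ; treat dispersion with PME (LJ-PME)
    lj-pme-comb-rule  = Geometric     ; combination rule for the reciprocal part
    rcoulomb          = 1.2
    rvdw              = 1.2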

Best
Jiaqi

On 1/20/2015 7:18 PM, Mark Abraham wrote:
> On Tue, Jan 20, 2015 at 11:48 PM, Jiaqi Lin <jqlin at mit.edu> wrote:
>
>> Hi Szilard,
>>
>> - I've tried 5.0.1 and it gives the same result. So 4.6.7 or 5.0.4 is
>> better, but in what way?
>>
>> - I've tried the Verlet scheme and it gives a small change in cutoff and grid.
>> But what I'm really interested in is manually reproducing the result that
>> tune_pme gave me in the first case using the Group scheme.
>>
> If that's the only situation in which you can observe it, it could be an
> outlier or your method could be unsuitable...
>
>
>> - I've also tried lincs-order=4 and it makes no difference.
>>
>> - My Fourier spacing is 25% finer, but that shouldn't affect the results,
>> right? If it does affect the results, then I want to find out how.
>>
> It affects your results because you could do some more sampling with the
> computer time you are spending on PME at the moment. Where to choose your
> balance of model quality vs amount of sampling is poorly understood, and
> problem-dependent.
>
>> - I happen to use the PP/PME rank split (100+28) and it gives me
>> interesting results (the performance is actually not bad). So I'm
>> very interested in how these cutoff and grid settings can affect my
>> simulation results.
>
> If the implementation and model are right, then they have no effect. That's
> why we auto-tune with them. You're going to need a lot of replicates to
> show that -npme 28 gives a statistically different result from a different
> value, and you won't yet have established that it's somehow a more valid
> observation to pursue.
>
>> So I tried to manually control the parameters (turning off tune_PME). But
>> no matter what I tried, I cannot reproduce the result given by tune_PME.
>> So my biggest question is: how is tune_PME implemented in the code? What
>> parameters does it actually tune?
>>
> Like it says, it varies rcoulomb and the Fourier grid, keeping rvdw and
> beta fixed. Details of how rlist and rlistlong behave are a bit messy, but
> nothing you asked for is ever missed out.
>
>
>> - When PME tuned the cutoff to such a large value, the speed does not go
>> down noticeably. So what I suspect is that tune_PME
>
> Please be careful with names. gmx tune_pme is a different thing from the
> mdrun auto-tuning.
>
>
>> does the direct space calculation without changing the neighbor list
>> search distance.
>>
> Your auto-tuned run number 1 had rlist = rcoulomb at the start, so mdrun
> knows you wanted a PME model with an unbuffered list whose size equals
> rcoulomb, and a buffered VDW model with rlist 1.4 and rvdw 1.2. Thus,
> rlistlong will stay equal to rcoulomb as it changes. The details and code
> are horrible, though, and I am looking forward to nothing so much as
> ripping it all out in about 2 months!
>
> And like Szilard suggested, your runs are probably a long way from maximum
> throughput. Aim for lots of sampling, don't chase replicating rare events
> with brute-force simulation!
>
> Mark
>
>> Thank you
>> Best
>> Jiaqi
>>
>>
>> On 1/20/2015 3:54 PM, Szilárd Páll wrote:
>>
>>> Not (all) directly related, but a few comments/questions:
>>>
>>> - Have you tried 4.6.7 or 5.0.4?
>>> - Have you considered using the Verlet scheme instead of doing manual
>>> buffering?
>>> - lincs-order=8 is very large for 2fs production runs - typically 4 is
>>> used.
>>> - Your fourier spacing is a lot (~25%) finer than it needs to be.
>>>
>>> - The PP/PME rank split of 100+28 is _very_ inconvenient and it is the
>>> main cause of the horrible PME performance together with the overly
>>> coarse grid. That's why you get such a huge cut-off after the PP-PME
>>> load balancing. Even if you want to stick to these parameters, you
>>> should tune the rank split (manually or with tune_pme).
>>> - The above contributes to the high neighbor search cost too.
>>>
>>> --
>>> Szilárd
>>>
>>>
>>> On Tue, Jan 20, 2015 at 9:18 PM, Jiaqi Lin <jqlin at mit.edu> wrote:
>>>
>>>> Hi Mark,
>>>>
>>>> Thanks for reply. I put the md.log files in the following link
>>>>
>>>> https://www.dropbox.com/sh/d1d2fbwreizr974/AABYhSRU03nmijbTIXKKr-rra?dl=0
>>>>
>>>> There are four log files:
>>>>    1. GMX 4.6.5 -tunepme (the Coulomb cutoff is tuned to 3.253)
>>>>    2. GMX 4.6.5 -notunepme  rcoulomb = 3.3, fourierspacing = 0.33
>>>>    3. GMX 4.6.5 -notunepme  rcoulomb = 3.3, fourierspacing = 0.14
>>>>    4. GMX 4.6.5 -notunepme  rcoulomb = 1.4, fourierspacing = 0.14
>>>>
>>>> Note that the LR Coulombic energy in the first one is almost twice the
>>>> value of that in the second one, whereas the grid spacing in both cases
>>>> is nearly the same.
>>>>
>>>> Only the first one gives a strong electrostatic interaction of a
>>>> nanoparticle with a lipid bilayer under ionic imbalance. In other cases
>>>> I do not observe such a strong interaction.
>>>>
>>>> GMX 5.0.1 gives the same results as GMX 4.6.5 with the Group cutoff scheme. Thanks
>>>>
>>>>
>>>> Regards
>>>> Jiaqi
>>>>
>>>>
>>>>
>>>>
>>>> On 1/19/2015 3:22 PM, Mark Abraham wrote:
>>>>
>>>>> On Thu, Jan 15, 2015 at 3:21 AM, Jiaqi Lin <jqlin at mit.edu> wrote:
>>>>>
>>>>>> Dear GMX developers,
>>>>>>
>>>>>> I've encountered a problem in GROMACS concerning the auto-tuning
>>>>>> feature of PME that has bugged me for months. As stated in the title,
>>>>>> the auto-tuning feature of mdrun changed my Coulomb cutoff from 1.4 nm
>>>>>> to ~3.3 nm (stated in md.log) when I set -npme to 28 (128 total CPU
>>>>>> cores), and this gives me interesting simulation results. When I use
>>>>>> -notunepme, I find Coulomb (SR) and recip. giving me the same energy,
>>>>>> but the actual simulation result is different. This I can understand:
>>>>>> scaling between the Coulomb cut-off and grid size theoretically gives
>>>>>> the same accuracy for the electrostatics (according to the GMX manual
>>>>>> and PME papers), but there is actually some numerical error due to
>>>>>> grid mapping, and even if the energy is the same that does not mean
>>>>>> the system configuration has to be the same (NVE ensemble: constant
>>>>>> energy, different configuration).
>>>>>>
>>>>> Total electrostatic energy should be approximately the same with
>>>>> different PME partitions.
>>>>>
>>>>>
>>>>>> However, the thing I don't understand is the following. I am
>>>>>> interested in the result under the large Coulomb cut-off, so I tried
>>>>>> to manually set the cut-off and grid spacing with -notunepme, using
>>>>>> the values tuned by mdrun previously. This gives me a completely
>>>>>> different simulation result, and the energy is also different. Setting
>>>>>> rlist, rlistlong, or both equal to rcoulomb (~3.3) still does not give
>>>>>> me the result produced by the auto-tuning of PME.
>>>>>>
>>>>> In what sense is the result different?
>>>>>
>>>>>
>>>>>> In addition, the simulation speed drops dramatically when I set
>>>>>> rcoulomb to ~3.3 (with -tunepme the speed remains nearly the same no
>>>>>> matter how large a cutoff it is tuned to). I've tested this in both
>>>>>> GMX 4.6.5 and 5.0.1 and the same thing happens, so clearly it's not a
>>>>>> version issue. Thus the question is: what exactly happens in the PME
>>>>>> calculation with the auto-tuning feature in mdrun, and why does it
>>>>>> give different results when I manually set the Coulomb cutoff and grid
>>>>>> spacing to the tuned values without the auto-tuning feature (using
>>>>>> -notunepme)? Thank you for your help.
>>>>>>
>>>>> For the group scheme, these should all lead to essentially the same
>>>>> result and (if tuned) performance. If you can share your various log
>>>>> files on a file-sharing service (rc 1.4, rc 3.3, various -tunepme
>>>>> settings, 4.6.5 and 5.0.1) then we will be in a position to comment
>>>>> further.
>>>>>
>>>>> Mark
>>>>>
>>>>>
>>>>>> Additional info: I use the Group cutoff-scheme, rvdw is 1.2.
>>>>>>
>>>>>>     md.log file:
>>>>>> DD  step 9 load imb.: force 29.4%  pme mesh/force 3.627
>>>>>>
>>>>>> step   30: timed with pme grid 280 280 384, coulomb cutoff 1.400: 1026.4 M-cycles
>>>>>> step   50: timed with pme grid 256 256 324, coulomb cutoff 1.464: 850.3 M-cycles
>>>>>> step   70: timed with pme grid 224 224 300, coulomb cutoff 1.626: 603.6 M-cycles
>>>>>> step   90: timed with pme grid 200 200 280, coulomb cutoff 1.822: 555.2 M-cycles
>>>>>> step  110: timed with pme grid 160 160 208, coulomb cutoff 2.280: 397.0 M-cycles
>>>>>> step  130: timed with pme grid 144 144 192, coulomb cutoff 2.530: 376.0 M-cycles
>>>>>> step  150: timed with pme grid 128 128 160, coulomb cutoff 2.964: 343.7 M-cycles
>>>>>> step  170: timed with pme grid 112 112 144, coulomb cutoff 3.294: 334.8 M-cycles
>>>>>> Grid: 12 x 14 x 14 cells
>>>>>> step  190: timed with pme grid 84 84 108, coulomb cutoff 4.392: 346.2 M-cycles
>>>>>> step  190: the PME grid restriction limits the PME load balancing to a coulomb cut-off of 4.392
>>>>>> step  210: timed with pme grid 128 128 192, coulomb cutoff 2.846: 360.6 M-cycles
>>>>>> step  230: timed with pme grid 128 128 160, coulomb cutoff 2.964: 343.6 M-cycles
>>>>>> step  250: timed with pme grid 120 120 160, coulomb cutoff 3.036: 340.4 M-cycles
>>>>>> step  270: timed with pme grid 112 112 160, coulomb cutoff 3.253: 334.3 M-cycles
>>>>>> step  290: timed with pme grid 112 112 144, coulomb cutoff 3.294: 334.7 M-cycles
>>>>>> step  310: timed with pme grid 84 84 108, coulomb cutoff 4.392: 348.0 M-cycles
>>>>>>                  optimal pme grid 112 112 160, coulomb cutoff 3.253
>>>>>> DD  step 999 load imb.: force 18.4%  pme mesh/force 0.918
>>>>>>
>>>>>> At step 1000 the performance loss due to force load imbalance is 6.3 %
>>>>>>
>>>>>> NOTE: Turning on dynamic load balancing
>>>>>>
>>>>>>               Step           Time         Lambda
>>>>>>               1000       20.00000        0.00000
>>>>>>
>>>>>>       Energies (kJ/mol)
>>>>>>            Bond       G96Angle        LJ (SR)   Coulomb (SR)    Coul. recip.
>>>>>>     1.98359e+05    1.79181e+06   -1.08927e+07   -7.04736e+06    -2.32682e+05
>>>>>>  Position Rest.      Potential    Kinetic En.   Total Energy     Temperature
>>>>>>     6.20627e+04   -1.61205e+07    4.34624e+06   -1.17743e+07     3.00659e+02
>>>>>>     Pressure (bar)   Constr. rmsd
>>>>>>        2.13582e+00    1.74243e-04
>>>>>>
>>>>>>
>>>>>> Best
>>>>>> Jiaqi
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Jiaqi Lin
>>>>>> postdoc fellow
>>>>>> The Langer Lab
>>>>>>
>>>> --
>>>> Jiaqi Lin
>>>> postdoc fellow
>>>> The Langer Lab
>>>>
>> --
>> Jiaqi Lin
>> postdoc fellow
>> The Langer Lab
>> David H. Koch Institute for Integrative Cancer Research
>> Massachusetts Institute of Technology
>>
>>

-- 
Jiaqi Lin
postdoc fellow
The Langer Lab


