[gmx-users] Is there a way to omit particles with q=0 from Coulomb-/PME-calculations?
Thomas Schlesier
schlesi at uni-mainz.de
Tue Jan 17 10:29:49 CET 2012
But would there be a way to optimize it further?
In my real simulation I would have a charged solute and uncharged
solvent (both with nearly the same number of particles). If I could omit
the uncharged solvent from the long-range Coulomb calculation (PME), it
would save a lot of time.
Or is there a reason that some of the PME work is also done for
uncharged particles?
(OK, I know that this is a rather special system, insofar as in
most MD simulations the number of uncharged particles is negligible.)
Would it perhaps be better to move this question to the developers' list?
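
To make the question concrete, here is a toy cost model (my own sketch in
Python, not GROMACS code; the function `pme_cost` and its constants are
made up for illustration) of why omitting uncharged particles would help
the spread/gather stages, whose work scales with the number of charged
particles, but not the 3D-FFT, whose cost depends only on the grid size:

```python
# Toy per-step operation counts for the three main PME stages.
# Illustrative only -- NOT the real GROMACS cost model.
import math

def pme_cost(n_charged, grid_dim, order=4):
    """Rough operation counts: (spread, gather, fft)."""
    spread = n_charged * order**3              # B-spline charge spreading
    gather = n_charged * order**3              # interpolating forces back
    n_grid = grid_dim**3
    fft = 2 * 5 * n_grid * math.log2(n_grid)   # forward + backward 3D-FFT
    return spread, gather, fft

# Dropping uncharged solvent shrinks spread/gather tenfold here,
# but the FFT term is unchanged unless the grid itself shrinks.
s1, g1, f1 = pme_cost(n_charged=5000, grid_dim=48)
s2, g2, f2 = pme_cost(n_charged=500, grid_dim=48)
assert f1 == f2      # FFT cost is independent of the particle count
assert s2 < s1       # spreading cost drops with fewer charges
```

This matches the log output below, where the 3D-FFT dominates the PME
flop count even though the system contains no charges at all.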
Greetings
Thomas
> On 17/01/2012 7:32 PM, Thomas Schlesier wrote:
>> On 17/01/2012 4:55 AM, Thomas Schlesier wrote:
>>>> Dear all,
>>>> Is there a way to omit particles with zero charge from the calculation
>>>> of Coulomb interactions or PME?
>>>> In my calculations I want to coarse-grain my solvent, while the solute
>>>> is still represented atomistically. In doing so, the
>>>> solvent molecules have zero charge. I noticed that for a simulation
>>>> with only the CG solvent, significant time was spent in the PME part
>>>> of the simulation.
>>>> If I simulated the complete system (atomistic solute +
>>>> coarse-grained solvent), I would only save time through the reduced
>>>> number of particles (compared to atomistic solvent). But if I could
>>>> omit the zero-charge solvent from the Coulomb/PME part, it would save
>>>> much additional time.
>>>>
>>>> Is there an easy way to do this omission, or would one have to hack the
>>>> code? If the latter is true, how hard would it be, and where do I have
>>>> to look?
>>>> (My first idea would be to create an index-file group with all
>>>> non-zero-charge particles and then run the loops needed for
>>>> Coulomb/PME only over this subset of particles.)
>>>> I have experience only with Fortran, not with C++.
>>>>
>>>> The only other solution that comes to my mind would be to use plain
>>>> cut-offs for the Coulomb part. This would save the time required for
>>>> doing PME, but would in turn cost time for calculating zeros
>>>> (Coulomb interactions for the CG solvent). More importantly, it would
>>>> introduce artifacts from the plain cut-off :(
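
The index-group idea quoted above could look roughly like the following
filter-then-loop sketch. This is purely illustrative Python of my own,
not GROMACS internals (whose kernels are written in C), and the function
name and constant are made up:

```python
# Illustrative sketch of the index-group idea: build the "charged" index
# group once, then run the pairwise Coulomb loop only over that subset.
# Pedagogical Python only -- NOT GROMACS code.

def coulomb_energy(charges, positions, f=138.935):
    """Naive pairwise Coulomb sum restricted to charged particles."""
    charged = [i for i, q in enumerate(charges) if q != 0.0]  # the "group"
    e = 0.0
    for a in range(len(charged)):
        for b in range(a + 1, len(charged)):
            i, j = charged[a], charged[b]
            dx = [positions[i][k] - positions[j][k] for k in range(3)]
            r = sum(d * d for d in dx) ** 0.5
            e += f * charges[i] * charges[j] / r
    return e

# Two charged particles plus two CG solvent beads with q=0: the
# zero-charge beads never enter the double loop at all.
q = [1.0, -1.0, 0.0, 0.0]
x = [(0, 0, 0), (1, 0, 0), (5, 5, 5), (6, 6, 6)]
```

As Mark notes below, GROMACS already does the equivalent of this for the
short-range part via its neighbour lists.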
>>
>>> Particles with zero charge are not included in the neighbour lists used
>>> for calculating Coulomb interactions. The statistics in the "M E G A -
>>> F L O P S   A C C O U N T I N G" section of the .log file will show
>>> that there is significant use of loops that have no "Coul"
>>> component. So these particles already have no effect on half of the PME
>>> calculation. I don't know whether the grid part is similarly
>>> optimized, but you can test this yourself by comparing the timings of
>>> runs with and without charged solvent.
>>>
>>> Mark
>>
>> OK, I will test this.
>> But here is the data I obtained for two simulations, one with plain
>> cut-offs and the other with PME. As one can see, the simulation with
>> plain cut-offs is much faster (by a factor of about 7.6: 4382 s vs. 576 s).
>
> Yes. I think I have seen this before for PME when some grid cells
> lack many charged particles.
>
> You will see that the nonbonded loops are always "VdW(T)" for tabulated
> VdW - you have no charges at all in this system and GROMACS has already
> optimized its choice of nonbonded loops accordingly. You would see
> "Coul(T) + VdW(T)" if your solvent had charge.
>
> It's not a meaningful test of the performance of PME vs cut-off, either,
> because there are no charges.
>
> Mark
>
>>
>>
>> ---------------------------------------------------------------------------
>>
>> With PME:
>>
>> M E G A - F L O P S A C C O U N T I N G
>>
>> RF=Reaction-Field FE=Free Energy SCFE=Soft-Core/Free Energy
>> T=Tabulated W3=SPC/TIP3p W4=TIP4p (single or pairs)
>> NF=No Forces
>>
>> Computing: M-Number M-Flops % Flops
>> -----------------------------------------------------------------------
>> VdW(T) 1132.029152 61129.574 0.1
>> Outer nonbonded loop 1020.997718 10209.977 0.0
>> Calc Weights 16725.001338 602100.048 0.6
>> Spread Q Bspline 356800.028544 713600.057 0.7
>> Gather F Bspline 356800.028544 4281600.343 4.4
>> 3D-FFT 9936400.794912 79491206.359 81.6
>> Solve PME 180000.014400 11520000.922 11.8
>> NS-Pairs 2210.718786 46425.095 0.0
>> Reset In Box 1115.000000 3345.000 0.0
>> CG-CoM 1115.000446 3345.001 0.0
>> Virial 7825.000626 140850.011 0.1
>> Ext.ens. Update 5575.000446 301050.024 0.3
>> Stop-CM 5575.000446 55750.004 0.1
>> Calc-Ekin 5575.000892 150525.024 0.2
>> -----------------------------------------------------------------------
>> Total 97381137.440 100.0
>> -----------------------------------------------------------------------
>> D O M A I N D E C O M P O S I T I O N S T A T I S T I C S
>>
>> av. #atoms communicated per step for force: 2 x 94.1
>>
>> Average load imbalance: 10.7 %
>> Part of the total run time spent waiting due to load imbalance: 0.1 %
>>
>>
>> R E A L C Y C L E A N D T I M E A C C O U N T I N G
>>
>> Computing: Nodes Number G-Cycles Seconds %
>> -----------------------------------------------------------------------
>> Domain decomp. 4 2500000 903.835 308.1 1.8
>> Comm. coord. 4 12500001 321.930 109.7 0.6
>> Neighbor search 4 2500001 1955.330 666.5 3.8
>> Force 4 12500001 696.668 237.5 1.4
>> Wait + Comm. F 4 12500001 384.107 130.9 0.7
>> PME mesh 4 12500001 43854.818 14948.2 85.3
>> Write traj. 4 5001 1.489 0.5 0.0
>> Update 4 12500001 1137.630 387.8 2.2
>> Comm. energies 4 12500001 1074.541 366.3 2.1
>> Rest 4 1093.194 372.6 2.1
>> -----------------------------------------------------------------------
>> Total 4 51423.541 17528.0 100.0
>> -----------------------------------------------------------------------
>>
>> Parallel run - timing based on wallclock.
>>
>> NODE (s) Real (s) (%)
>> Time: 4382.000 4382.000 100.0
>> 1h13:02
>> (Mnbf/s) (GFlops) (ns/day) (hour/ns)
>> Performance: 0.258 22.223 492.926 0.049
>>
>> -----------------------------------------------------------------------------------
>>
>> -----------------------------------------------------------------------------------
>>
>>
>> With plain cut-offs
>>
>> M E G A - F L O P S A C C O U N T I N G
>>
>> RF=Reaction-Field FE=Free Energy SCFE=Soft-Core/Free Energy
>> T=Tabulated W3=SPC/TIP3p W4=TIP4p (single or pairs)
>> NF=No Forces
>>
>> Computing: M-Number M-Flops % Flops
>> -----------------------------------------------------------------------
>> VdW(T) 1137.009596 61398.518 7.9
>> Outer nonbonded loop 1020.973338 10209.733 1.3
>> NS-Pairs 2213.689975 46487.489 6.0
>> Reset In Box 1115.000000 3345.000 0.4
>> CG-CoM 1115.000446 3345.001 0.4
>> Virial 7825.000626 140850.011 18.2
>> Ext.ens. Update 5575.000446 301050.024 38.9
>> Stop-CM 5575.000446 55750.004 7.2
>> Calc-Ekin 5575.000892 150525.024 19.5
>> -----------------------------------------------------------------------
>> Total 772960.806 100.0
>> -----------------------------------------------------------------------
>> D O M A I N D E C O M P O S I T I O N S T A T I S T I C S
>>
>> av. #atoms communicated per step for force: 2 x 93.9
>>
>> Average load imbalance: 16.0 %
>> Part of the total run time spent waiting due to load imbalance: 0.9 %
>>
>>
>> R E A L C Y C L E A N D T I M E A C C O U N T I N G
>>
>> Computing: Nodes Number G-Cycles Seconds %
>> -----------------------------------------------------------------------
>> Domain decomp. 4 2500000 856.561 291.8 12.7
>> Comm. coord. 4 12500001 267.036 91.0 3.9
>> Neighbor search 4 2500001 2077.236 707.6 30.7
>> Force 4 12500001 377.606 128.6 5.6
>> Wait + Comm. F 4 12500001 347.270 118.3 5.1
>> Write traj. 4 5001 1.166 0.4 0.0
>> Update 4 12500001 1109.008 377.8 16.4
>> Comm. energies 4 12500001 841.530 286.7 12.4
>> Rest 4 886.195 301.9 13.1
>> -----------------------------------------------------------------------
>> Total 4 6763.608 2304.0 100.0
>> -----------------------------------------------------------------------
>>
>> NOTE: 12 % of the run time was spent communicating energies,
>> you might want to use the -nosum option of mdrun
>>
>>
>> Parallel run - timing based on wallclock.
>>
>> NODE (s) Real (s) (%)
>> Time: 576.000 576.000 100.0
>> 9:36
>> (Mnbf/s) (GFlops) (ns/day) (hour/ns)
>> Performance: 1.974 1.342 3750.001 0.006