[gmx-users] Is there a way to omit particles with q=0 from Coulomb/PME calculations?
Carsten Kutzner
ckutzne at gwdg.de
Tue Jan 17 10:59:03 CET 2012
Hi Thomas,
On Jan 17, 2012, at 10:29 AM, Thomas Schlesier wrote:
> But would there be a way to optimize it further?
> In my real simulation I would have a charged solute and uncharged solvent (both with nearly the same number of particles). If I could omit the uncharged solvent from the long-range Coulomb calculation (PME), it would save a lot of time.
> Or is there a reason that some of the PME stuff is also calculated for uncharged particles?
For PME you need the Fourier-transformed charge grid, and you get back the
potential grid from which the forces on the charged atoms are interpolated.
Each charge is spread onto typically 4x4x4 grid points (PME order 4 in each
dimension), and only charged atoms take part in this spreading. So the
spreading part (and also the force interpolation part) becomes faster with
fewer charges. However, the rest of PME (the Fourier transforms and the
calculations in reciprocal space) is unaffected by the number of charges;
for that part only the size of the whole PME grid matters. You could try to
lower the number of PME grid points (enlarge fourierspacing) and at the same
time increase the PME order (to 6, for example) to keep a comparable force
accuracy. You could also shift more load to real space, which likewise lowers
the number of PME grid points (g_tune_pme can do that for you). But I am not
sure that you can get large performance benefits from that.
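In .mdp terms such a change would look something like this (a sketch only;
the values are just for illustration, and you have to check yourself that
the resulting force accuracy is still acceptable for your system):

  ; illustrative values only - verify the accuracy for your own system
  coulombtype     = PME
  fourierspacing  = 0.16   ; coarser than the 0.12 nm default -> smaller PME grid
  pme_order       = 6      ; higher interpolation order to compensate

g_tune_pme then essentially benchmarks a few such real-space / PME-grid
splittings of your .tpr file and reports which settings run fastest on your
hardware.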
Best,
Carsten
> (Ok, I know that this is a rather special system, insofar as in most MD simulations the number of uncharged particles is negligible.)
> Would it perhaps be better to move the question to the developer list?
>
> Greetings
> Thomas
>
>
>> On 17/01/2012 7:32 PM, Thomas Schlesier wrote:
>>> On 17/01/2012 4:55 AM, Thomas Schlesier wrote:
>>>>> Dear all,
>>>>> Is there a way to omit particles with zero charge from the
>>>>> calculation of Coulomb interactions or PME?
>>>>> In my calculations I want to coarse-grain my solvent, while the
>>>>> solute is still represented by atoms. As a result, the solvent
>>>>> molecules carry zero charge. I noticed that in a simulation with
>>>>> only the CG solvent, significant time was still spent in the PME
>>>>> part of the calculation.
>>>>> If I simulated the complete system (atomistic solute +
>>>>> coarse-grained solvent), I would only save time due to the reduced
>>>>> number of particles (compared to atomistic solvent). But if I could
>>>>> omit the zero-charge solvent from the Coulomb/PME part, it would
>>>>> save a lot of additional time.
>>>>>
>>>>> Is there an easy way to achieve this omission, or would one have to
>>>>> hack the code? If the latter is true, how hard would it be, and
>>>>> where would I have to look?
>>>>> (A first idea would be to create an index group containing all
>>>>> particles with non-zero charge and then run the loops needed for
>>>>> Coulomb/PME only over this subset of particles.)
>>>>> I only have experience with Fortran, not with C++.
>>>>>
>>>>> The only other solution that comes to mind would be to use plain
>>>>> cut-offs for the Coulomb part. This would save the time required
>>>>> for doing PME, but would in turn cost time for calculating zeros
>>>>> (Coulomb interactions for the CG solvent). More importantly, it
>>>>> would introduce artifacts from the plain cut-off :(
>>>
>>>> Particles with zero charge are not included in the neighbour lists used
>>>> for calculating Coulomb interactions. The statistics in the
>>>> "M E G A - F L O P S  A C C O U N T I N G" section of the .log file will
>>>> show that there is significant use of loops that do not have a "Coul"
>>>> component. So these particles already have no effect on the real-space
>>>> half of the PME calculation. I don't know whether the grid part is
>>>> similarly optimized, but you can test this yourself by comparing the
>>>> timings of runs with and without charged solvent.
>>>>
>>>> Mark
>>>
>>> Ok, I will test this.
>>> But here is the data I obtained for two simulations, one with plain
>>> cut-offs and the other with PME. As one can see, the simulation with
>>> plain cut-offs is much faster (by a factor of about 7.5).
>>
>> Yes. I think I have seen this before for PME when (some grid cells) are
>> lacking (many) charged particles.
>>
>> You will see that the nonbonded loops are always "VdW(T)" for tabulated
>> VdW - you have no charges at all in this system and GROMACS has already
>> optimized its choice of nonbonded loops accordingly. You would see
>> "Coul(T) + VdW(T)" if your solvent had charge.
>>
>> It's not a meaningful test of the performance of PME vs cut-off, either,
>> because there are no charges.
>>
>> Mark
>>
>>>
>>>
>>> ---------------------------------------------------------------------------
>>>
>>> With PME:
>>>
>>> M E G A - F L O P S A C C O U N T I N G
>>>
>>> RF=Reaction-Field FE=Free Energy SCFE=Soft-Core/Free Energy
>>> T=Tabulated W3=SPC/TIP3p W4=TIP4p (single or pairs)
>>> NF=No Forces
>>>
>>> Computing: M-Number M-Flops % Flops
>>> -----------------------------------------------------------------------
>>> VdW(T) 1132.029152 61129.574 0.1
>>> Outer nonbonded loop 1020.997718 10209.977 0.0
>>> Calc Weights 16725.001338 602100.048 0.6
>>> Spread Q Bspline 356800.028544 713600.057 0.7
>>> Gather F Bspline 356800.028544 4281600.343 4.4
>>> 3D-FFT 9936400.794912 79491206.359 81.6
>>> Solve PME 180000.014400 11520000.922 11.8
>>> NS-Pairs 2210.718786 46425.095 0.0
>>> Reset In Box 1115.000000 3345.000 0.0
>>> CG-CoM 1115.000446 3345.001 0.0
>>> Virial 7825.000626 140850.011 0.1
>>> Ext.ens. Update 5575.000446 301050.024 0.3
>>> Stop-CM 5575.000446 55750.004 0.1
>>> Calc-Ekin 5575.000892 150525.024 0.2
>>> -----------------------------------------------------------------------
>>> Total 97381137.440 100.0
>>> -----------------------------------------------------------------------
>>> D O M A I N D E C O M P O S I T I O N S T A T I S T I C S
>>>
>>> av. #atoms communicated per step for force: 2 x 94.1
>>>
>>> Average load imbalance: 10.7 %
>>> Part of the total run time spent waiting due to load imbalance: 0.1 %
>>>
>>>
>>> R E A L C Y C L E A N D T I M E A C C O U N T I N G
>>>
>>> Computing: Nodes Number G-Cycles Seconds %
>>> -----------------------------------------------------------------------
>>> Domain decomp. 4 2500000 903.835 308.1 1.8
>>> Comm. coord. 4 12500001 321.930 109.7 0.6
>>> Neighbor search 4 2500001 1955.330 666.5 3.8
>>> Force 4 12500001 696.668 237.5 1.4
>>> Wait + Comm. F 4 12500001 384.107 130.9 0.7
>>> PME mesh 4 12500001 43854.818 14948.2 85.3
>>> Write traj. 4 5001 1.489 0.5 0.0
>>> Update 4 12500001 1137.630 387.8 2.2
>>> Comm. energies 4 12500001 1074.541 366.3 2.1
>>> Rest 4 1093.194 372.6 2.1
>>> -----------------------------------------------------------------------
>>> Total 4 51423.541 17528.0 100.0
>>> -----------------------------------------------------------------------
>>>
>>> Parallel run - timing based on wallclock.
>>>
>>> NODE (s) Real (s) (%)
>>> Time: 4382.000 4382.000 100.0
>>> 1h13:02
>>> (Mnbf/s) (GFlops) (ns/day) (hour/ns)
>>> Performance: 0.258 22.223 492.926 0.049
>>>
>>> -----------------------------------------------------------------------------------
>>>
>>> -----------------------------------------------------------------------------------
>>>
>>>
>>> With plain cut-offs
>>>
>>> M E G A - F L O P S A C C O U N T I N G
>>>
>>> RF=Reaction-Field FE=Free Energy SCFE=Soft-Core/Free Energy
>>> T=Tabulated W3=SPC/TIP3p W4=TIP4p (single or pairs)
>>> NF=No Forces
>>>
>>> Computing: M-Number M-Flops % Flops
>>> -----------------------------------------------------------------------
>>> VdW(T) 1137.009596 61398.518 7.9
>>> Outer nonbonded loop 1020.973338 10209.733 1.3
>>> NS-Pairs 2213.689975 46487.489 6.0
>>> Reset In Box 1115.000000 3345.000 0.4
>>> CG-CoM 1115.000446 3345.001 0.4
>>> Virial 7825.000626 140850.011 18.2
>>> Ext.ens. Update 5575.000446 301050.024 38.9
>>> Stop-CM 5575.000446 55750.004 7.2
>>> Calc-Ekin 5575.000892 150525.024 19.5
>>> -----------------------------------------------------------------------
>>> Total 772960.806 100.0
>>> -----------------------------------------------------------------------
>>> D O M A I N D E C O M P O S I T I O N S T A T I S T I C S
>>>
>>> av. #atoms communicated per step for force: 2 x 93.9
>>>
>>> Average load imbalance: 16.0 %
>>> Part of the total run time spent waiting due to load imbalance: 0.9 %
>>>
>>>
>>> R E A L C Y C L E A N D T I M E A C C O U N T I N G
>>>
>>> Computing: Nodes Number G-Cycles Seconds %
>>> -----------------------------------------------------------------------
>>> Domain decomp. 4 2500000 856.561 291.8 12.7
>>> Comm. coord. 4 12500001 267.036 91.0 3.9
>>> Neighbor search 4 2500001 2077.236 707.6 30.7
>>> Force 4 12500001 377.606 128.6 5.6
>>> Wait + Comm. F 4 12500001 347.270 118.3 5.1
>>> Write traj. 4 5001 1.166 0.4 0.0
>>> Update 4 12500001 1109.008 377.8 16.4
>>> Comm. energies 4 12500001 841.530 286.7 12.4
>>> Rest 4 886.195 301.9 13.1
>>> -----------------------------------------------------------------------
>>> Total 4 6763.608 2304.0 100.0
>>> -----------------------------------------------------------------------
>>>
>>> NOTE: 12 % of the run time was spent communicating energies,
>>> you might want to use the -nosum option of mdrun
>>>
>>>
>>> Parallel run - timing based on wallclock.
>>>
>>> NODE (s) Real (s) (%)
>>> Time: 576.000 576.000 100.0
>>> 9:36
>>> (Mnbf/s) (GFlops) (ns/day) (hour/ns)
>>> Performance: 1.974 1.342 3750.001 0.006