[gmx-users] Is there a way to omit particles with q=0 from Coulomb-/PME-calculations?

Thomas Schlesier schlesi at uni-mainz.de
Tue Jan 17 10:29:49 CET 2012


But would there be a way to optimize it further?
In my real simulation I would have a charged solute and uncharged
solvent (both with nearly the same number of particles). If I could omit
the uncharged solvent from the long-range Coulomb calculation (PME), it
would save much time.
Or is there a reason that some of the PME work is also done for
uncharged particles?
(OK, I know that this is a rather special system, insofar as in
most MD simulations the number of uncharged particles is negligible.)
Would it perhaps be better to move the question to the developer list?
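To make the idea concrete, here is a minimal Python sketch (hypothetical names, not GROMACS code; `charged_subset` and `spread_charges` are purely illustrative) of restricting a charge-spreading loop to the subset of particles with non-zero charge, as suggested with the index-group idea quoted below:

```python
def charged_subset(charges, tol=0.0):
    """Indices of particles whose charge is non-zero."""
    return [i for i, q in enumerate(charges) if abs(q) > tol]

def spread_charges(positions, charges, grid, subset=None):
    """Toy nearest-grid-point charge assignment onto a 1D grid.
    Real PME spreads charges with B-splines onto a 3D grid, but the
    loop structure (iterate over particles, deposit charge) is analogous."""
    indices = subset if subset is not None else range(len(charges))
    for i in indices:
        cell = int(positions[i] * len(grid)) % len(grid)
        grid[cell] += charges[i]
    return grid

# Half the particles are uncharged CG solvent and get skipped entirely.
positions = [0.1, 0.3, 0.5, 0.7]
charges = [1.0, 0.0, -1.0, 0.0]
subset = charged_subset(charges)            # only particles 0 and 2
grid = spread_charges(positions, charges, [0.0] * 8, subset)
```

Note that skipping uncharged particles could only save the spreading/gathering work; the 3D FFT over the grid (81.6% of the flops in the PME log below) depends on the grid size, not on how many particles carry charge.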

Greetings
Thomas


> On 17/01/2012 7:32 PM, Thomas Schlesier wrote:
>> On 17/01/2012 4:55 AM, Thomas Schlesier wrote:
>>>> Dear all,
>>>> Is there a way to omit particles with zero charge from the calculation
>>>> of Coulomb interactions or PME?
>>>> In my calculations I want to coarse-grain my solvent, while the solute
>>>> should still be represented by atoms. As a result, the solvent
>>>> molecules have zero charge. I noticed that for a simulation with only
>>>> the CG solvent, significant time was spent on the PME part of the
>>>> simulation.
>>>> If I simulated the complete system (atomistic solute + coarse-grained
>>>> solvent), I would save time only through the reduced number of
>>>> particles (compared to atomistic solvent). But if I could omit the
>>>> zero-charge solvent from the Coulomb/PME part, it would save much
>>>> additional time.
>>>>
>>>> Is there an easy way to do this omission, or would one have to hack
>>>> the code? If the latter, how hard would it be and where do I have to
>>>> look?
>>>> (My first idea would be to create an index-file group containing all
>>>> particles with non-zero charge and then run the loops needed for
>>>> Coulomb/PME only over this subset of particles.)
>>>> I only have experience with Fortran, not with C++.
>>>>
>>>> The only other solution that comes to my mind would be to use plain
>>>> cut-offs for the Coulomb part. This would save the time required for
>>>> doing PME, but would in turn cost time for calculating zeros
>>>> (Coulomb interactions for the CG solvent). More importantly, it would
>>>> introduce artifacts from the plain cut-off :(
>>
>>> Particles with zero charge are not included in the neighbour lists
>>> used for calculating Coulomb interactions. The statistics in the
>>> "M E G A - F L O P S   A C C O U N T I N G" section of the .log file
>>> will show that there is significant use of loops that have no "Coul"
>>> component. So these particles already have no effect on half of the
>>> PME calculation. I don't know whether the grid part is similarly
>>> optimized, but you can test this yourself by comparing the timing of
>>> runs with and without charged solvent.
>>>
>>> Mark
>>
>> OK, I will test this.
>> But here is the data I obtained for two simulations, one with plain
>> cut-offs and the other with PME. As one can see, the simulation with
>> plain cut-offs is much faster (by a factor of about 7.6 in wallclock
>> time).
>
> Yes. I think I have seen this before for PME when (some) grid cells
> lack (many) charged particles.
>
> You will see that the nonbonded loops are always "VdW(T)" for tabulated
> VdW - you have no charges at all in this system and GROMACS has already
> optimized its choice of nonbonded loops accordingly. You would see
> "Coul(T) + VdW(T)" if your solvent had charge.
>
> It's not a meaningful test of the performance of PME vs cut-off, either,
> because there are no charges.
>
> Mark
>
>>
>>
>> ---------------------------------------------------------------------------
>>
>> With PME:
>>
>>          M E G A - F L O P S   A C C O U N T I N G
>>
>>     RF=Reaction-Field  FE=Free Energy  SCFE=Soft-Core/Free Energy
>>     T=Tabulated        W3=SPC/TIP3p    W4=TIP4p (single or pairs)
>>     NF=No Forces
>>
>>   Computing:                         M-Number         M-Flops  % Flops
>> -----------------------------------------------------------------------
>>   VdW(T)                          1132.029152       61129.574     0.1
>>   Outer nonbonded loop            1020.997718       10209.977     0.0
>>   Calc Weights                   16725.001338      602100.048     0.6
>>   Spread Q Bspline              356800.028544      713600.057     0.7
>>   Gather F Bspline              356800.028544     4281600.343     4.4
>>   3D-FFT                       9936400.794912    79491206.359    81.6
>>   Solve PME                     180000.014400    11520000.922    11.8
>>   NS-Pairs                        2210.718786       46425.095     0.0
>>   Reset In Box                    1115.000000        3345.000     0.0
>>   CG-CoM                          1115.000446        3345.001     0.0
>>   Virial                          7825.000626      140850.011     0.1
>>   Ext.ens. Update                 5575.000446      301050.024     0.3
>>   Stop-CM                         5575.000446       55750.004     0.1
>>   Calc-Ekin                       5575.000892      150525.024     0.2
>> -----------------------------------------------------------------------
>>   Total                                          97381137.440   100.0
>> -----------------------------------------------------------------------
>>      D O M A I N   D E C O M P O S I T I O N   S T A T I S T I C S
>>
>>   av. #atoms communicated per step for force:  2 x 94.1
>>
>>   Average load imbalance: 10.7 %
>>   Part of the total run time spent waiting due to load imbalance: 0.1 %
>>
>>
>>       R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>>
>>   Computing:         Nodes     Number     G-Cycles    Seconds     %
>> -----------------------------------------------------------------------
>>   Domain decomp.         4    2500000      903.835      308.1     1.8
>>   Comm. coord.           4   12500001      321.930      109.7     0.6
>>   Neighbor search        4    2500001     1955.330      666.5     3.8
>>   Force                  4   12500001      696.668      237.5     1.4
>>   Wait + Comm. F         4   12500001      384.107      130.9     0.7
>>   PME mesh               4   12500001    43854.818    14948.2    85.3
>>   Write traj.            4       5001        1.489        0.5     0.0
>>   Update                 4   12500001     1137.630      387.8     2.2
>>   Comm. energies         4   12500001     1074.541      366.3     2.1
>>   Rest                   4                1093.194      372.6     2.1
>> -----------------------------------------------------------------------
>>   Total                  4               51423.541    17528.0   100.0
>> -----------------------------------------------------------------------
>>
>>          Parallel run - timing based on wallclock.
>>
>>                 NODE (s)   Real (s)      (%)
>>         Time:   4382.000   4382.000    100.0
>>                         1h13:02
>>                 (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
>> Performance:      0.258     22.223    492.926      0.049
>>
>> -----------------------------------------------------------------------------------
>>
>> -----------------------------------------------------------------------------------
>>
>>
>> With plain cut-offs
>>
>>          M E G A - F L O P S   A C C O U N T I N G
>>
>>     RF=Reaction-Field  FE=Free Energy  SCFE=Soft-Core/Free Energy
>>     T=Tabulated        W3=SPC/TIP3p    W4=TIP4p (single or pairs)
>>     NF=No Forces
>>
>>   Computing:                         M-Number         M-Flops  % Flops
>> -----------------------------------------------------------------------
>>   VdW(T)                          1137.009596       61398.518     7.9
>>   Outer nonbonded loop            1020.973338       10209.733     1.3
>>   NS-Pairs                        2213.689975       46487.489     6.0
>>   Reset In Box                    1115.000000        3345.000     0.4
>>   CG-CoM                          1115.000446        3345.001     0.4
>>   Virial                          7825.000626      140850.011    18.2
>>   Ext.ens. Update                 5575.000446      301050.024    38.9
>>   Stop-CM                         5575.000446       55750.004     7.2
>>   Calc-Ekin                       5575.000892      150525.024    19.5
>> -----------------------------------------------------------------------
>>   Total                                            772960.806   100.0
>> -----------------------------------------------------------------------
>>      D O M A I N   D E C O M P O S I T I O N   S T A T I S T I C S
>>
>>   av. #atoms communicated per step for force:  2 x 93.9
>>
>>   Average load imbalance: 16.0 %
>>   Part of the total run time spent waiting due to load imbalance: 0.9 %
>>
>>
>>       R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>>
>>   Computing:         Nodes     Number     G-Cycles    Seconds     %
>> -----------------------------------------------------------------------
>>   Domain decomp.         4    2500000      856.561      291.8    12.7
>>   Comm. coord.           4   12500001      267.036       91.0     3.9
>>   Neighbor search        4    2500001     2077.236      707.6    30.7
>>   Force                  4   12500001      377.606      128.6     5.6
>>   Wait + Comm. F         4   12500001      347.270      118.3     5.1
>>   Write traj.            4       5001        1.166        0.4     0.0
>>   Update                 4   12500001     1109.008      377.8    16.4
>>   Comm. energies         4   12500001      841.530      286.7    12.4
>>   Rest                   4                 886.195      301.9    13.1
>> -----------------------------------------------------------------------
>>   Total                  4                6763.608     2304.0   100.0
>> -----------------------------------------------------------------------
>>
>> NOTE: 12 % of the run time was spent communicating energies,
>>        you might want to use the -nosum option of mdrun
>>
>>
>>          Parallel run - timing based on wallclock.
>>
>>                 NODE (s)   Real (s)      (%)
>>         Time:    576.000    576.000    100.0
>>                         9:36
>>                 (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
>> Performance:      1.974      1.342   3750.001      0.006
