[gmx-users] Re: Is there a way to omit particles with q=0, from Coulomb-/PME-calculations?

Mark Abraham Mark.Abraham at anu.edu.au
Tue Jan 17 09:55:07 CET 2012

On 17/01/2012 7:32 PM, Thomas Schlesier wrote:
> On 17/01/2012 4:55 AM, Thomas Schlesier wrote:
> > > Dear all,
> > > Is there a way to omit particles with zero charge from calculations
> > > for Coulomb-interactions or PME?
> > > In my calculations i want to coarse-grain my solvent, but the solute
> > > should be still represented by atoms. In doing so the
> > > solvent-molecules have a zero charge. I noticed that for a simulation
> > > with only the CG-solvent significant time was spent for the PME-part
> > > of the simulation.
> > > If i would simulate the complete system (atomic solute +
> > > coarse-grained solvent), i would save only time for the reduced number
> > > of particles (compared to atomistic solvent). But if i could omit the
> > > zero-charge solvent from the Coulomb-/PME-part, it would save much
> > > additional time.
> > >
> > > Is there an easy way for the omission, or would one have to hack the
> > > code? If the latter is true, how hard would it be and where do i have
> > > to look?
> > > (First idea would be to create an index-file group with all
> > > non-zero-charged particles and then run in the loops needed for
> > > Coulomb/PME only over this subset of particles.)
> > > I have only experience with Fortran and not with C++.
> > >
> > > Only other solution which comes to my mind would be to use plain
> > > cut-offs for the Coulomb-part. This would save time required for doing
> > > PME but will in turn cost time for the calculations of zeros
> > > (Coulomb-interaction for the CG-solvent). But more importantly would
> > > introduce artifacts from the plain cut-off :(
> > Particles with zero charge are not included in neighbour lists used
> > for calculating Coulomb interactions. The statistics in the
> > "M E G A - F L O P S   A C C O U N T I N G" section of the .log file
> > will show that there is significant use of loops that do not have a
> > "Coul" component. So already these have no effect on half of the PME
> > calculation. I don't know whether the grid part is similarly
> > optimized, but you can test this yourself by comparing timing of runs
> > with and without charged solvent.
> >
> > Mark
> Ok, i will test this.
> But here is the data i obtained for two simulations, one with plain 
> cut-off and the other with PME. As one sees the simulation with plain 
> cut-offs is much faster (by a factor of 6).

Yes. I think I have seen this before with PME when some (or many) grid
cells lack charged particles.

You will see that the nonbonded loops are always "VdW(T)" for tabulated 
VdW - you have no charges at all in this system and GROMACS has already 
optimized its choice of nonbonded loops accordingly. You would see 
"Coul(T) + VdW(T)" if your solvent had charge.

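If you want to check this for a log without reading the table by eye,
something like the following Python sketch may help. It is only an
illustration: the excerpt is copied from the PME run quoted below, and the
regular expression and the helper name parse_flop_rows are my own
assumptions about the table layout, not part of GROMACS.

```python
import re

# Excerpt of the flop accounting table from the PME run quoted in this thread
# (the leading "> " quote markers have been stripped).
LOG_EXCERPT = """\
 Computing:                         M-Number         M-Flops  % Flops
-----------------------------------------------------------------------
 VdW(T)                          1132.029152       61129.574     0.1
 Outer nonbonded loop            1020.997718       10209.977     0.0
 Calc Weights                   16725.001338      602100.048     0.6
 3D-FFT                       9936400.794912    79491206.359    81.6
 Solve PME                     180000.014400    11520000.922    11.8
-----------------------------------------------------------------------
 Total                                          97381137.440   100.0
"""

# Assumed row layout: a name followed by three numeric columns
# (M-Number, M-Flops, % Flops). The "Total" row has only two numbers
# and is therefore skipped, as are the header and the dashed rules.
ROW = re.compile(
    r"^\s*(?P<name>\S.*?)\s+(?P<mnumber>[\d.]+)\s+(?P<mflops>[\d.]+)\s+(?P<pct>[\d.]+)\s*$"
)


def parse_flop_rows(text):
    """Return {row name: M-Flops} for every three-column row in the table."""
    rows = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows[match.group("name")] = float(match.group("mflops"))
    return rows


rows = parse_flop_rows(LOG_EXCERPT)

# Nonbonded kernels that include a Coulomb term carry "Coul" in their name.
coulomb_kernels = [name for name in rows if "Coul" in name]
print("Coulomb kernels found:", coulomb_kernels)
```

For this excerpt the list comes out empty, which matches the point above:
only "VdW(T)" loops were used.
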
It's not a meaningful test of the performance of PME vs cut-off, either, 
because there are no charges.
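
To put numbers on where the time goes, here is a small Python sketch using
only figures copied from the two logs quoted below. The variable names are
mine, and the arithmetic is just ratios of the reported values, so treat it
as a rough reading of the logs rather than a benchmark.

```python
# Figures copied from the two logs quoted in this thread.
pme_run = {
    "wall_s": 4382.0,              # "Time:" line, wallclock seconds
    "total_mflops": 97381137.440,  # "Total" row of the flop accounting
    "pme_mesh_s": 14948.2,         # "PME mesh" row, summed over the 4 nodes
    "all_rows_s": 17528.0,         # "Total" row of the cycle accounting
}
cutoff_run = {
    "wall_s": 576.0,
    "total_mflops": 772960.806,
}

# M-Flops of the rows that only appear in the PME run (the mesh part).
mesh_mflops = {
    "Calc Weights": 602100.048,
    "Spread Q Bspline": 713600.057,
    "Gather F Bspline": 4281600.343,
    "3D-FFT": 79491206.359,
    "Solve PME": 11520000.922,
}

# Share of the accounted time spent in the PME mesh.
pme_mesh_time_share = pme_run["pme_mesh_s"] / pme_run["all_rows_s"]

# Share of the counted flops that belong to the mesh rows.
mesh_flop_share = sum(mesh_mflops.values()) / pme_run["total_mflops"]

# How much faster the plain cut-off run finished, by wallclock time.
wall_speedup = pme_run["wall_s"] / cutoff_run["wall_s"]

# How many more flops the PME run counted in total.
flop_ratio = pme_run["total_mflops"] / cutoff_run["total_mflops"]

print(f"PME mesh share of time:  {pme_mesh_time_share:.1%}")
print(f"Mesh share of flops:     {mesh_flop_share:.1%}")
print(f"Cut-off speedup (wall):  {wall_speedup:.1f}x")
print(f"Flop ratio PME/cut-off:  {flop_ratio:.0f}x")
```

The time share reproduces the 85.3 % printed in the "PME mesh" row, and the
wallclock ratio comes out at about 7.6. Almost all of the extra work is the
mesh part, which has to process the whole grid whether or not any charges
are present.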


> --------------------------------------------------------------------------- 
> With PME:
>         M E G A - F L O P S   A C C O U N T I N G
>    RF=Reaction-Field  FE=Free Energy  SCFE=Soft-Core/Free Energy
>    T=Tabulated        W3=SPC/TIP3p    W4=TIP4p (single or pairs)
>    NF=No Forces
>  Computing:                         M-Number         M-Flops  % Flops
> -----------------------------------------------------------------------
>  VdW(T)                          1132.029152       61129.574     0.1
>  Outer nonbonded loop            1020.997718       10209.977     0.0
>  Calc Weights                   16725.001338      602100.048     0.6
>  Spread Q Bspline              356800.028544      713600.057     0.7
>  Gather F Bspline              356800.028544     4281600.343     4.4
>  3D-FFT                       9936400.794912    79491206.359    81.6
>  Solve PME                     180000.014400    11520000.922    11.8
>  NS-Pairs                        2210.718786       46425.095     0.0
>  Reset In Box                    1115.000000        3345.000     0.0
>  CG-CoM                          1115.000446        3345.001     0.0
>  Virial                          7825.000626      140850.011     0.1
>  Ext.ens. Update                 5575.000446      301050.024     0.3
>  Stop-CM                         5575.000446       55750.004     0.1
>  Calc-Ekin                       5575.000892      150525.024     0.2
> -----------------------------------------------------------------------
>  Total                                          97381137.440   100.0
> -----------------------------------------------------------------------
>     D O M A I N   D E C O M P O S I T I O N   S T A T I S T I C S
>  av. #atoms communicated per step for force:  2 x 94.1
>  Average load imbalance: 10.7 %
>  Part of the total run time spent waiting due to load imbalance: 0.1 %
>      R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>  Computing:         Nodes     Number     G-Cycles    Seconds     %
> -----------------------------------------------------------------------
>  Domain decomp.         4    2500000      903.835      308.1     1.8
>  Comm. coord.           4   12500001      321.930      109.7     0.6
>  Neighbor search        4    2500001     1955.330      666.5     3.8
>  Force                  4   12500001      696.668      237.5     1.4
>  Wait + Comm. F         4   12500001      384.107      130.9     0.7
>  PME mesh               4   12500001    43854.818    14948.2    85.3
>  Write traj.            4       5001        1.489        0.5     0.0
>  Update                 4   12500001     1137.630      387.8     2.2
>  Comm. energies         4   12500001     1074.541      366.3     2.1
>  Rest                   4                1093.194      372.6     2.1
> -----------------------------------------------------------------------
>  Total                  4               51423.541    17528.0   100.0
> -----------------------------------------------------------------------
>         Parallel run - timing based on wallclock.
>                NODE (s)   Real (s)      (%)
>        Time:   4382.000   4382.000    100.0
>                        1h13:02
>                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
> Performance:      0.258     22.223    492.926      0.049
> ----------------------------------------------------------------------------------- 
> ----------------------------------------------------------------------------------- 
> With plain cut-offs
>         M E G A - F L O P S   A C C O U N T I N G
>    RF=Reaction-Field  FE=Free Energy  SCFE=Soft-Core/Free Energy
>    T=Tabulated        W3=SPC/TIP3p    W4=TIP4p (single or pairs)
>    NF=No Forces
>  Computing:                         M-Number         M-Flops  % Flops
> -----------------------------------------------------------------------
>  VdW(T)                          1137.009596       61398.518     7.9
>  Outer nonbonded loop            1020.973338       10209.733     1.3
>  NS-Pairs                        2213.689975       46487.489     6.0
>  Reset In Box                    1115.000000        3345.000     0.4
>  CG-CoM                          1115.000446        3345.001     0.4
>  Virial                          7825.000626      140850.011    18.2
>  Ext.ens. Update                 5575.000446      301050.024    38.9
>  Stop-CM                         5575.000446       55750.004     7.2
>  Calc-Ekin                       5575.000892      150525.024    19.5
> -----------------------------------------------------------------------
>  Total                                            772960.806   100.0
> -----------------------------------------------------------------------
>     D O M A I N   D E C O M P O S I T I O N   S T A T I S T I C S
>  av. #atoms communicated per step for force:  2 x 93.9
>  Average load imbalance: 16.0 %
>  Part of the total run time spent waiting due to load imbalance: 0.9 %
>      R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>  Computing:         Nodes     Number     G-Cycles    Seconds     %
> -----------------------------------------------------------------------
>  Domain decomp.         4    2500000      856.561      291.8    12.7
>  Comm. coord.           4   12500001      267.036       91.0     3.9
>  Neighbor search        4    2500001     2077.236      707.6    30.7
>  Force                  4   12500001      377.606      128.6     5.6
>  Wait + Comm. F         4   12500001      347.270      118.3     5.1
>  Write traj.            4       5001        1.166        0.4     0.0
>  Update                 4   12500001     1109.008      377.8    16.4
>  Comm. energies         4   12500001      841.530      286.7    12.4
>  Rest                   4                 886.195      301.9    13.1
> -----------------------------------------------------------------------
>  Total                  4                6763.608     2304.0   100.0
> -----------------------------------------------------------------------
> NOTE: 12 % of the run time was spent communicating energies,
>       you might want to use the -nosum option of mdrun
>         Parallel run - timing based on wallclock.
>                NODE (s)   Real (s)      (%)
>        Time:    576.000    576.000    100.0
>                        9:36
>                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
> Performance:      1.974      1.342   3750.001      0.006
