[gmx-users] Re: Is there a way to omit particles with q=0, from Coulomb-/PME-calculations?
Thomas Schlesier
schlesi at uni-mainz.de
Tue Jan 17 09:32:39 CET 2012
On 17/01/2012 4:55 AM, Thomas Schlesier wrote:
> > Dear all,
> > Is there a way to omit particles with zero charge from the calculation
> > of Coulomb interactions or PME?
> > In my calculations I want to coarse-grain my solvent, while the solute
> > should still be represented by atoms. As a result, the solvent
> > molecules have zero charge. I noticed that in a simulation with only
> > the CG solvent, a significant amount of time was spent on the PME part
> > of the calculation.
> > If I simulated the complete system (atomistic solute + coarse-grained
> > solvent), I would save time only through the reduced number of
> > particles (compared to an atomistic solvent). But if I could also omit
> > the zero-charge solvent from the Coulomb/PME part, it would save much
> > additional time.
> >
> > Is there an easy way to achieve this omission, or would one have to
> > hack the code? If the latter, how hard would it be, and where would I
> > have to look?
> > (My first idea would be to create an index-file group with all
> > particles of non-zero charge and then run the loops needed for
> > Coulomb/PME only over this subset of particles; see the sketch below.)
> > I have experience only with Fortran, not with C++.
> >
> > The only other solution that comes to my mind would be to use plain
> > cut-offs for the Coulomb part. This would save the time required for
> > PME but would in turn cost time for calculating zeros (Coulomb
> > interactions for the CG solvent). More importantly, it would
> > introduce artifacts from the plain cut-off :(
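
To illustrate the idea from the parenthesis above, here is a minimal
sketch in plain Python with made-up positions and charges; this is not
GROMACS code, just the shape of the loop I have in mind:

import math

# made-up example data: positions and per-particle charges
positions = [(0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (0.6, 0.0, 0.0),
             (0.9, 0.0, 0.0), (1.2, 0.0, 0.0)]
charges = [0.0, 0.42, -0.84, 0.0, 0.42]  # q = 0 entries are CG solvent

# build the "index group" once: only particles with non-zero charge
charged = [i for i, q in enumerate(charges) if q != 0.0]

# the Coulomb double loop then runs only over the charged subset,
# so the zero-charge CG solvent costs nothing here
e_coul = 0.0
for a, i in enumerate(charged):
    for j in charged[a + 1:]:
        r = math.dist(positions[i], positions[j])
        e_coul += charges[i] * charges[j] / r  # prefactor omitted

print(e_coul)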
>Particles with zero charge are not included in the neighbour lists used
>for calculating Coulomb interactions. The statistics in the
>"M E G A - F L O P S   A C C O U N T I N G" section of the .log file
>will show significant use of loops that do not have a "Coul" component.
>So these particles already have no effect on the real-space half of the
>PME calculation. I don't know whether the grid part is similarly
>optimized, but you can test this yourself by comparing the timings of
>runs with and without charged solvent.
>
>Mark
OK, I will test this.
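
For a quick check, something along these lines should show which
nonbonded kernels actually ran; a minimal sketch, and the file name
"md.log" is made up:

# print the kernel lines of the flops table; with zero-charge solvent
# only loops without a "Coul" part should appear
for line in open("md.log"):
    if "VdW" in line or "Coul" in line:
        print(line.rstrip())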
But here is the data I obtained for two simulations, one with plain
cut-offs and the other with PME. As one can see, the simulation with
plain cut-offs is much faster (4382 s vs. 576 s, a factor of about 7.6;
a quick check of this follows the tables).
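
The per-task seconds can be pulled out of two .log files for a
side-by-side comparison with a small script along these lines; a minimal
sketch that assumes the table layout shown below, and the file names are
made up:

def seconds_per_task(path):
    """Parse the 'R E A L   C Y C L E ...' table of a GROMACS .log
    file and return {task name: seconds}."""
    tasks, in_table = {}, False
    for line in open(path):
        text = line.strip()
        if "C Y C L E" in text:  # start of the timing table
            in_table = True
            continue
        if not in_table or not text or text.startswith("-"):
            continue
        parts = text.split()
        try:  # data rows end in: ... seconds percent
            secs = float(parts[-2])
            float(parts[-1])
        except (ValueError, IndexError):
            continue
        name = []  # task name = the leading non-numeric fields
        for p in parts:
            try:
                float(p)
                break
            except ValueError:
                name.append(p)
        tasks[" ".join(name)] = secs
        if parts[0] == "Total":
            break
    return tasks

pme = seconds_per_task("pme.log")  # made-up file names
cut = seconds_per_task("cutoff.log")
for task in pme:
    print(f"{task:20s} {pme[task]:9.1f} {cut.get(task, 0.0):9.1f}")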
---------------------------------------------------------------------------
With PME:
M E G A - F L O P S   A C C O U N T I N G

   RF=Reaction-Field  FE=Free Energy  SCFE=Soft-Core/Free Energy
   T=Tabulated        W3=SPC/TIP3p    W4=TIP4p (single or pairs)
   NF=No Forces

 Computing:                        M-Number         M-Flops  % Flops
-----------------------------------------------------------------------
 VdW(T)                         1132.029152       61129.574     0.1
 Outer nonbonded loop           1020.997718       10209.977     0.0
 Calc Weights                  16725.001338      602100.048     0.6
 Spread Q Bspline             356800.028544      713600.057     0.7
 Gather F Bspline             356800.028544     4281600.343     4.4
 3D-FFT                      9936400.794912    79491206.359    81.6
 Solve PME                    180000.014400    11520000.922    11.8
 NS-Pairs                       2210.718786       46425.095     0.0
 Reset In Box                   1115.000000        3345.000     0.0
 CG-CoM                         1115.000446        3345.001     0.0
 Virial                         7825.000626      140850.011     0.1
 Ext.ens. Update                5575.000446      301050.024     0.3
 Stop-CM                        5575.000446       55750.004     0.1
 Calc-Ekin                      5575.000892      150525.024     0.2
-----------------------------------------------------------------------
 Total                                         97381137.440   100.0
-----------------------------------------------------------------------
D O M A I N   D E C O M P O S I T I O N   S T A T I S T I C S

 av. #atoms communicated per step for force: 2 x 94.1

 Average load imbalance: 10.7 %
 Part of the total run time spent waiting due to load imbalance: 0.1 %

R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G

 Computing:        Nodes     Number     G-Cycles    Seconds     %
-----------------------------------------------------------------------
 Domain decomp.        4    2500000      903.835      308.1     1.8
 Comm. coord.          4   12500001      321.930      109.7     0.6
 Neighbor search       4    2500001     1955.330      666.5     3.8
 Force                 4   12500001      696.668      237.5     1.4
 Wait + Comm. F        4   12500001      384.107      130.9     0.7
 PME mesh              4   12500001    43854.818    14948.2    85.3
 Write traj.           4       5001        1.489        0.5     0.0
 Update                4   12500001     1137.630      387.8     2.2
 Comm. energies        4   12500001     1074.541      366.3     2.1
 Rest                  4                1093.194      372.6     2.1
-----------------------------------------------------------------------
 Total                 4               51423.541    17528.0   100.0
-----------------------------------------------------------------------

 Parallel run - timing based on wallclock.

               NODE (s)   Real (s)      (%)
       Time:   4382.000   4382.000    100.0
                       1h13:02
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:      0.258     22.223    492.926      0.049
-----------------------------------------------------------------------------------
-----------------------------------------------------------------------------------
With plain cut-offs:

M E G A - F L O P S   A C C O U N T I N G

   RF=Reaction-Field  FE=Free Energy  SCFE=Soft-Core/Free Energy
   T=Tabulated        W3=SPC/TIP3p    W4=TIP4p (single or pairs)
   NF=No Forces

 Computing:                        M-Number         M-Flops  % Flops
-----------------------------------------------------------------------
 VdW(T)                         1137.009596       61398.518     7.9
 Outer nonbonded loop           1020.973338       10209.733     1.3
 NS-Pairs                       2213.689975       46487.489     6.0
 Reset In Box                   1115.000000        3345.000     0.4
 CG-CoM                         1115.000446        3345.001     0.4
 Virial                         7825.000626      140850.011    18.2
 Ext.ens. Update                5575.000446      301050.024    38.9
 Stop-CM                        5575.000446       55750.004     7.2
 Calc-Ekin                      5575.000892      150525.024    19.5
-----------------------------------------------------------------------
 Total                                           772960.806   100.0
-----------------------------------------------------------------------
D O M A I N   D E C O M P O S I T I O N   S T A T I S T I C S

 av. #atoms communicated per step for force: 2 x 93.9

 Average load imbalance: 16.0 %
 Part of the total run time spent waiting due to load imbalance: 0.9 %

R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G

 Computing:        Nodes     Number     G-Cycles    Seconds     %
-----------------------------------------------------------------------
 Domain decomp.        4    2500000      856.561      291.8    12.7
 Comm. coord.          4   12500001      267.036       91.0     3.9
 Neighbor search       4    2500001     2077.236      707.6    30.7
 Force                 4   12500001      377.606      128.6     5.6
 Wait + Comm. F        4   12500001      347.270      118.3     5.1
 Write traj.           4       5001        1.166        0.4     0.0
 Update                4   12500001     1109.008      377.8    16.4
 Comm. energies        4   12500001      841.530      286.7    12.4
 Rest                  4                 886.195      301.9    13.1
-----------------------------------------------------------------------
 Total                 4                6763.608     2304.0   100.0
-----------------------------------------------------------------------

NOTE: 12 % of the run time was spent communicating energies,
      you might want to use the -nosum option of mdrun

 Parallel run - timing based on wallclock.

               NODE (s)   Real (s)      (%)
       Time:    576.000    576.000    100.0
                         9:36
               (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
Performance:      1.974      1.342   3750.001      0.006
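
A quick sanity check on these numbers, with the values copied from the
tables above:

pme_total = 4382.0  # wall time with PME, seconds
cut_total = 576.0   # wall time with plain cut-offs, seconds
print(pme_total / cut_total)  # observed speedup: ~7.6x

# the PME mesh alone took 85.3 % of the PME run, so removing just that
# part would already predict roughly 1 / (1 - 0.853) ~ 6.8x
print(1.0 / (1.0 - 0.853))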