[gmx-users] PME grid parameters in large system run in parallel in Gromacs 4 CVS

Wed Feb 27 11:48:30 CET 2008

Ok, this will be a long one.

Currently I am simulating large systems (20-35 nm sides; rhombic  
dodecahedron or rectangular boxes) in the latest CVS version of  
Gromacs on 70-500 nodes using PME to calculate long range  
electrostatic interactions. Now I am looking for some advice  
regarding how to set up my system for good performance.

The short-range cutoff (rlist/rcoulomb/rvdw), the PME grid spacing
(fourierspacing, or fourier_nx/fourier_ny/fourier_nz) and the PME
interpolation order (pme_order) are interdependent. Grompp tells me
that for best performance there should be about 25-33% load on the
dedicated PME nodes compared to the regular particle-particle (PP)
nodes. I guess this value is due to the communication overhead of the
PME nodes.

Two of my setups are:

SYSTEM 1
Dimensions (nm): 24.12032  24.12032  17.05213   0.00000   0.00000   0.00000   0.00000  12.06016  12.06016
Fourier grid spacing (nm): 0.137 x 0.137 x 0.137
Cutoffs (nm): 1.1
Interpolation order: 4
Estimated PME load: 0.28

SYSTEM 2
Dimensions (nm): 23.53603  24.05188  35.23882
Fourier grid spacing (nm): 0.134 x 0.137 x 0.133
Cutoffs (nm): 1.1
Interpolation order: 4
Estimated PME load: 0.51
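
For a rectangular box, the per-axis spacing is simply the box length divided by the number of grid points along that axis. The sketch below illustrates the relation using the System 2 box. It assumes that the grid count is the smallest integer with only small prime factors (2, 3, 5, 7) that keeps the spacing at or below the requested value; that selection rule, the function names and the 0.137 nm target are my assumptions for illustration, and the result need not match what grompp actually picks.

```python
import math

def is_smooth(n, radices=(2, 3, 5, 7)):
    """Return True if n has no prime factors outside `radices`."""
    for p in radices:
        while n % p == 0:
            n //= p
    return n == 1

def grid_points(box_length_nm, max_spacing_nm, radices=(2, 3, 5, 7)):
    """Smallest FFT-friendly grid count whose spacing does not exceed the target.

    Assumption: "FFT-friendly" means a product of the given small primes.
    This is not necessarily the rule grompp applies.
    """
    n = math.ceil(box_length_nm / max_spacing_nm)
    while not is_smooth(n, radices):
        n += 1
    return n

box = (23.53603, 24.05188, 35.23882)  # System 2 box lengths in nm
for axis, length in zip("xyz", box):
    n = grid_points(length, 0.137)    # 0.137 nm is an assumed target spacing
    print(f"fourier_n{axis} = {n}, spacing = {length / n:.3f} nm")
```

Because the grid count is rounded up to a convenient size, the effective spacing can differ between axes even when a single target value is requested.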

So I am interested to hear from you how far I can push the Fourier
grid spacing before I lose accuracy in terms of force and energy.
How much must I compensate with a higher interpolation order and a
longer cutoff? A longer cutoff will put more load on the PP nodes,
which is what I want in this case, but would a higher interpolation
order also be required/advantageous? I use the FFT optimization option.
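
To make the load-shifting argument concrete, here is a rough back-of-the-envelope sketch. It assumes a very crude cost model: the PP cost grows with the cube of the cutoff, the mesh cost grows with the inverse cube of the grid spacing, and the cutoff and grid spacing are scaled by the same factor. It also treats the estimated PME load as a fraction of the total cost. The model ignores the spreading and gathering cost (which depends on the interpolation order rather than the grid spacing), the FFT log factor and all communication, so the numbers are only a starting point for benchmarking and say nothing about accuracy. The function names are mine.

```python
def rescaled_pme_load(pme_load, scale):
    """Estimate the PME load fraction after scaling cutoff and spacing by `scale`.

    Crude model (assumption): PP cost ~ cutoff**3, mesh cost ~ 1 / spacing**3.
    """
    pme = pme_load / scale**3
    pp = (1.0 - pme_load) * scale**3
    return pme / (pme + pp)

def scale_for_target(pme_load, target):
    """Scale factor that moves `pme_load` to `target` under the same model."""
    ratio = (pme_load * (1.0 - target)) / (target * (1.0 - pme_load))
    return ratio ** (1.0 / 6.0)

# System 2: estimated PME load 0.51, aiming for the upper end of 25-33%
s = scale_for_target(0.51, 0.33)
print(f"scale factor: {s:.3f}")
print(f"cutoff: {1.1 * s:.2f} nm, grid spacing: {0.134 * s:.3f} nm")
print(f"model check, new PME load: {rescaled_pme_load(0.51, s):.2f}")
```

Since both costs change with the cube of the scale factor, the load ratio in this model depends on its sixth power, so a modest change in cutoff and spacing shifts the balance considerably.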

In the original paper on PME (Darden et al. 1993) there is a
comparison of PME orders 1-4 and grid spacings of 0.05-0.1 nm which
shows that a grid spacing of 0.075 nm and an interpolation order of 3
gives an rms force error of 2x10^-4, which they say is reasonable.
Does anyone know of a more recent equivalent test on larger systems?

A somewhat unrelated question is about the heuristics of mdrun when
it decides how many PME nodes versus PP nodes it should use and when
it does the PP domain decomposition. It would be quite useful to have
a tool that could suggest a desirable number of nodes within a given
interval. If I can understand in which order the decisions are made
and which constraints are imposed, I will gladly write such a tool
myself. (More than once my simulations have failed after a long time
in the queue due to an incompatible number of nodes. A useful feature
of mdrun would be to leave some of the nodes unused if that would
improve performance, rather than just die.)
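
The kind of tool I have in mind could look roughly like the sketch below. Since I do not know the actual rules mdrun applies, both constraints here are placeholders of my own: the fraction of PME nodes must fall in the 25-33% window, and the number of PP nodes must not contain a prime factor larger than 7 (so that it can be arranged as a 3D decomposition grid). The function names and default values are hypothetical as well.

```python
def largest_prime_factor(n):
    """Return the largest prime factor of n (returns 1 for n == 1)."""
    largest, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            largest, n = p, n // p
        p += 1
    return max(largest, n) if n > 1 else largest

def candidate_node_counts(lo, hi, min_frac=0.25, max_frac=0.33, max_prime=7):
    """Yield (total, n_pp, n_pme) splits that satisfy the placeholder rules."""
    for total in range(lo, hi + 1):
        for n_pme in range(1, total):
            n_pp = total - n_pme
            # Placeholder rule 1: PME nodes are 25-33% of all nodes
            if not (min_frac <= n_pme / total <= max_frac):
                continue
            # Placeholder rule 2: PP node count has only small prime factors
            if largest_prime_factor(n_pp) > max_prime:
                continue
            yield total, n_pp, n_pme

for total, n_pp, n_pme in candidate_node_counts(70, 80):
    print(f"{total} nodes: {n_pp} PP + {n_pme} PME")
```

Once the real constraints are known, they could replace the two placeholder checks without changing the rest of the loop.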

---

Daniel Larsson
Molecular Biophysics group
Department of Cell and Molecular Biology
Uppsala University

+46-18-471 4006 (phone)
+46-18-511 755  (fax)
http://xray.bmc.uu.se/~larsson
larsson at xray.bmc.uu.se