[gmx-users] Energies in simulation and rerun using different core counts

Richard Broadbent richard.broadbent09 at imperial.ac.uk
Fri Sep 7 18:55:32 CEST 2012


Dear All,

I've been having some issues with energies when running GROMACS on 
various core counts for a 7469 polymer in solvent system, constraining 
all bonds and running with a 2 fs time step. I use PME-shift (1.05 nm, 
1.10 nm) for the electrostatics and a shift with the same parameters 
for the VdW. I am using the OPLS-AA force field with 
fourierspacing = 0.10 and the md-vv integrator.

I am running GROMACS 4.5.5, compiled from the tarball on gromacs.org.

To try to track it down, I ran a 100 ps NVE simulation, outputting 
coordinates and velocities every 1 ps. I then took the resulting trr 
trajectory file and ran:

$ mdrun_d -s nve_short.tpr -rerun reference.trr -deffnm 8_cores -reprod -nt 8
$ mdrun_d -s nve_short.tpr -rerun reference.trr -deffnm 4_cores -reprod -nt 4
$ mdrun_d -s nve_short.tpr -rerun reference.trr -deffnm 1_cores -reprod -nt 1

then:

$  g_energy_d -f 8_cores.edr -o 8_cores.xvg << EOF
10
11
12
13
14
15
EOF

and likewise for the other runs. Here nve_short.tpr is the input file 
used for the original simulation and reference.trr is the trajectory it 
produced (I did not output forces into it, which was an oversight).

These were run on my machine, a quad-core hyper-threaded Intel Xeon. I 
also performed the rerun on our local cluster on 12, 24, and 36 cores 
(dual 6-core Intel Xeon nodes with InfiniBand interconnects).

The resulting energy files give significantly different energies for 
these snapshots, summarised in this table:

Number of cores    Potential Energy    Standard Deviation
---------------------------------------------------------
reference          7912.74479607       180.525445863
1_cores            9635.92644669       180.525445891
4_cores            1061.8467459        244.14154375
8_cores             776.470114871      208.368028756
12_cores            374.243502525      204.012539953
24_cores            667.44876102       502.041766722
36_cores            616.93105205       476.190500738

(reference is the energy extracted from the original simulation run on a 
12 core node)
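
For anyone who wants to reproduce averages like these from the .xvg 
files, here is a minimal sketch of one way to do it. It is not 
necessarily how the numbers in the table were obtained. The function 
name xvg_stats is made up for illustration, and it assumes that the 
quantity of interest sits in column 2 of the file (which column holds 
the potential energy depends on the terms selected in g_energy) and 
that the standard deviation wanted is the population one (divide by N).

```shell
# xvg_stats FILE [COLUMN]
# Hypothetical helper: prints the number of frames, the mean and the
# population standard deviation of one column of an .xvg file.
# Assumption: COLUMN defaults to 2 (the first data column after time).
xvg_stats() {
    awk -v col="${2:-2}" '
        # Skip xvg comment (#), directive (@) and separator (&) lines,
        # as well as empty lines.
        /^[#@&]/ || NF == 0 { next }
        # Running mean and sum of squared deviations (Welford update),
        # which avoids cancellation when the mean is large.
        { n++; d = $col - mean; mean += d / n; m2 += d * ($col - mean) }
        END {
            if (n == 0) { print "n=0"; exit 1 }
            printf "n=%d mean=%.6f std=%.6f\n", n, mean, sqrt(m2 / n)
        }
    ' "$1"
}

# Example call (file name taken from the commands above):
#   xvg_stats 8_cores.xvg 2
```

Running it over each of the .xvg files in turn gives one line per run 
that can be compared directly.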

The large variation in standard deviation means that these energies are 
not simply shifted but behaving differently, which is apparent from a 
plot of the potential energies.
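
One way to put a number on "shifted versus behaving differently" is to 
look at the frame-by-frame difference between two runs: a pure shift 
gives a difference with a standard deviation near zero. The following 
is only a sketch under stated assumptions. The function name xvg_diff 
is hypothetical, the energy is assumed to be in column 2, and frames 
are matched on the time string in column 1, so both files must have 
been written with the same time formatting.

```shell
# xvg_diff FILE_A FILE_B
# Hypothetical helper: for every time present in both files, computes
# (column 2 of FILE_B) - (column 2 of FILE_A) and prints the mean and
# the population standard deviation of that difference.
xvg_diff() {
    awk '
        # Skip xvg comment (#), directive (@) and separator (&) lines,
        # as well as empty lines.
        /^[#@&]/ || NF == 0 { next }
        # First file: remember the energy for each time stamp.
        FILENAME == ARGV[1] { a[$1] = $2; next }
        # Second file: accumulate statistics of the per-frame difference.
        ($1 in a) {
            n++; x = $2 - a[$1]
            d = x - mean; mean += d / n; m2 += d * (x - mean)
        }
        END {
            if (n == 0) { print "n=0"; exit 1 }
            printf "n=%d mean_diff=%.6f std_diff=%.6f\n", n, mean, sqrt(m2 / n)
        }
    ' "$1" "$2"
}

# Example call (file names taken from the commands above):
#   xvg_diff 1_cores.xvg 8_cores.xvg
```

A small std_diff together with a large mean_diff would point to a 
constant offset. A large std_diff would mean the two runs really 
disagree frame by frame.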

Has anyone else noticed any sort of inconsistency like this? Does anyone 
have any advice about what might cause it? Am I doing anything stupid?

Thanks,

Richard






