[gmx-users] Energy/temperature drifts in Gromacs 4.0 / inconsistencies with Gromacs 3.3.1
Pietro Amodeo
pamodeo at icmib.na.cnr.it
Fri Mar 13 17:07:33 CET 2009
Hi Gromacs users/developers,
we have two Gromacs installations on two different clusters with the
following sw versions:
1) Cluster: OLD(Myrinet)
Gromacs 3.3.1
(CentOS 4 / Rocks 4.1)
kernel 2.6.9-22.ELsmp
gcc 3.4.4
fftw 3.1.2
mpich-gm 1.2.7p1..18
2) Cluster: NEW(Infiniband)
Gromacs 4.0.4 / 4.0.3
(CentOS 5)
kernel 2.6.18-53.el5
gcc 4.1.2 20070626 (Red Hat 4.1.2-14) / icc 10.1 (Build 20070913
Pack.ID: l_cc_p_10.1.008)
fftw 3.2.1
ofed131 - openmpi 1.2.6
Both serial and parallel, single- and double-precision versions of
Gromacs 4.0.3 and 4.0.4 were compiled with gcc (4.1.2, which is deprecated;
the tests nevertheless either passed or failed with only minor
discrepancies) and with the Intel 10.1 compilers.
On cluster NEW we tried to reproduce simple MD equilibrations of two
different systems (proteins solvated in SPC water plus counterions) that
had run successfully on cluster OLD. As starting tpr files we used either
the original ones produced with 3.3.1 or new ones generated with 4.0.4.
Although the starting energies for both systems were substantially equal:
------------------------------------------------------------------------------------------------------
Cluster NEW system 2:
           Step           Time         Lambda
              0        0.00000        0.00000
   Energies (kJ/mol)
       G96Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    7.90343e+02    7.80369e+02    2.11086e+02    4.58020e+02    1.92904e+04
        LJ (SR)        LJ (LR)   Coulomb (SR)   Coul. recip. Position Rest.
    1.09286e+05   -1.43221e+03   -4.54033e+05   -5.15749e+04    2.74030e-01
      Potential    Kinetic En.   Total Energy    Temperature Pressure (bar)
   -3.76224e+05    7.83268e+04   -2.97897e+05    3.12792e+02    9.77797e+03
  Cons. rmsd ()
    2.19464e-05
------------------------------------------------------------------------------------------------------
Cluster OLD system 2:
   Rel. Constraint Deviation:     Max    between atoms       RMS
       Before LINCS          0.098014     1670   1671   0.006831
        After LINCS          0.000104      509    511   0.000022
   Energies (kJ/mol)
       G96Angle    Proper Dih.  Improper Dih.          LJ-14     Coulomb-14
    7.90348e+02    7.80369e+02    2.11085e+02    4.58017e+02    1.92904e+04
        LJ (SR)        LJ (LR)   Coulomb (SR)   Coul. recip. Position Rest.
    1.09286e+05   -1.43221e+03   -4.54033e+05   -5.15750e+04    2.74027e-01
      Potential    Kinetic En.   Total Energy    Temperature Pressure (bar)
   -3.76224e+05    7.83267e+04   -2.97897e+05    3.12791e+02    9.78296e+03
------------------------------------------------------------------------------------------------------
in both cases the simulations with Gromacs 3.3.1 ran without any problem
(and provided good starting points for very stable production runs), while
those performed with Gromacs 4.0.3 or 4.0.4 systematically started, after
2 ps or less, to exhibit wide oscillations in total energy and temperature,
with a net upward drift in energy on both systems. In system 1 the
temperature variations grew very rapidly, and all runs terminated
prematurely with errors from LINCS or from the routines that calculate 1-4
interactions. System 2 exhibited a smaller energy drift and rather steady,
but still significant, temperature oscillations, so the 100 ps run (8
cores, double-precision parallel version compiled with the Intel compiler,
starting from the original 3.3.1 tpr file) ended (apparently) regularly.
However, the average energy was higher than in the corresponding 3.3.1
simulation and the average temperature failed to settle at the targeted
300 K value. In particular, the protein suffered from poor thermal
relaxation (average T-Protein of about 593 K, against about 300 K in
3.3.1) under the same conditions that worked flawlessly in the 3.3.1
simulations.
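
One way to put a number on the net energy drift described above is a least-squares slope of total energy against time. The following is only a minimal sketch: it assumes the total energy has already been extracted as two plain lists (time in ps, energy in kJ/mol), and the function name `linear_drift` and the sample values are hypothetical, not taken from our runs.

```python
def linear_drift(times_ps, energies):
    """Least-squares slope of energy vs. time, in energy units per ps."""
    n = len(times_ps)
    if n < 2 or n != len(energies):
        raise ValueError("need at least two samples and matching lengths")
    t_mean = sum(times_ps) / n
    e_mean = sum(energies) / n
    # Covariance of (time, energy) divided by variance of time.
    num = sum((t - t_mean) * (e - e_mean) for t, e in zip(times_ps, energies))
    den = sum((t - t_mean) ** 2 for t in times_ps)
    if den == 0.0:
        raise ValueError("all time values are identical")
    return num / den


# Hypothetical samples (NOT from the runs in this post): an uneven
# but steadily rising total energy.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
energies = [-1000.0, -993.0, -990.0, -987.0, -980.0]
print(f"drift: {linear_drift(times, energies):.2f} kJ/mol per ps")
```

A slope close to zero relative to the r.m.s. fluctuation would indicate a stable run; a clearly positive slope corresponds to the kind of upward drift reported here.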
The final, average and r.m.s. fluctuation values from the log files of the
two corresponding runs on system 2 (4.0.4 on cluster NEW, 3.3.1 on cluster
OLD) are:
----------------------------------------------------------------------------
Cluster NEW system 2:
Step Time Lambda
50000 100.00000 0.00000
Writing checkpoint, step 50000 at Thu Mar 12 16:21:03 2009
Energies (kJ/mol)
G96Angle Proper Dih. Improper Dih. LJ-14 Coulomb-14
5.20200e+03 1.40303e+03 1.56905e+03 5.80966e+02 1.89786e+04
LJ (SR) LJ (LR) Coulomb (SR) Coul. recip. Position Rest.
7.93006e+04 -1.43727e+03 -4.96019e+05 -6.18250e+04 2.12927e+03
Potential Kinetic En. Total Energy Temperature Pressure (bar)
-4.50118e+05 8.53539e+04 -3.64764e+05 3.40853e+02 7.37194e+03
Cons. rmsd ()
6.33529e-05
<====== ############### ==>
<==== A V E R A G E S ====>
<== ############### ======>
Energies (kJ/mol)
G96Angle Proper Dih. Improper Dih. LJ-14 Coulomb-14
5.22068e+03 1.38036e+03 1.53871e+03 1.11820e+03 1.90723e+04
LJ (SR) LJ (LR) Coulomb (SR) Coul. recip. Position Rest.
7.38412e+04 -1.40961e+03 -4.82837e+05 -6.16567e+04 4.79992e+03
Potential Kinetic En. Total Energy Temperature Pressure (bar)
-4.38932e+05 8.23804e+04 -3.56552e+05 3.28979e+02 2.00692e+02
Cons. rmsd ()
0.00000e+00
Box-X Box-Y Box-Z Volume Density (SI)
7.45269e+00 7.02652e+00 6.08533e+00 3.18779e+02 9.90264e+02
pV
-6.99836e+03
Total Virial (kJ/mol)
3.09153e+04 3.94735e+01 -3.38472e+00
3.94745e+01 3.11413e+04 2.26993e+02
-3.38879e+00 2.26991e+02 3.08214e+04
Pressure (bar)
1.19126e+02 3.73568e+00 8.31997e+00
3.73558e+00 4.54768e+02 -1.26308e+01
8.32041e+00 -1.26307e+01 2.81831e+01
Total Dipole (Debye)
1.44566e+02 4.10437e+02 1.05756e+02
Epot (kJ/mol)                   Coul-SR          LJ-SR          LJ-LR        Coul-14          LJ-14
Protein-Protein            -6.23914e+03   -5.96547e+03   -1.93349e+02    1.90723e+04    1.11820e+03
Protein-Non-Protein        -5.33140e+03   -1.38636e+03   -1.87522e+02    0.00000e+00    0.00000e+00
Non-Protein-Non-Protein    -4.71266e+05    8.11930e+04   -1.02874e+03    0.00000e+00    0.00000e+00
T-Protein T-SOL T-CL-
5.93312e+02 3.12714e+02 3.24327e+02
<====== ############################### ==>
<==== R M S - F L U C T U A T I O N S ====>
<== ############################### ======>
Energies (kJ/mol)
G96Angle Proper Dih. Improper Dih. LJ-14 Coulomb-14
9.66238e+02 1.07074e+02 1.99039e+02 8.27177e+02 7.63636e+02
LJ (SR) LJ (LR) Coulomb (SR) Coul. recip. Position Rest.
2.05222e+04 4.49488e+01 2.39437e+04 2.70027e+02 3.11921e+03
Potential Kinetic En. Total Energy Temperature Pressure (bar)
7.21911e+03 4.02779e+03 6.92839e+03 1.60846e+01 1.78298e+04
Cons. rmsd ()
0.00000e+00
Box-X Box-Y Box-Z Volume Density (SI)
8.06489e-02 7.60371e-02 6.58520e-02 1.03515e+01 3.21181e+01
pV
3.41845e+05
Total Virial (kJ/mol)
1.45509e+05 1.37533e+03 1.65462e+03
1.37534e+03 2.49318e+05 1.79716e+03
1.65462e+03 1.79717e+03 1.14997e+05
Pressure (bar)
1.52957e+04 1.45547e+02 1.74844e+02
1.45548e+02 2.61231e+04 1.90420e+02
1.74843e+02 1.90421e+02 1.21129e+04
Total Dipole (Debye)
3.03524e+02 2.43084e+02 2.30363e+02
Epot (kJ/mol)                   Coul-SR          LJ-SR          LJ-LR        Coul-14          LJ-14
Protein-Protein             5.44055e+02    1.91174e+02    4.60987e+00    7.63636e+02    8.27177e+02
Protein-Non-Protein         2.99090e+02    2.19500e+02    7.77253e+00    0.00000e+00    0.00000e+00
Non-Protein-Non-Protein     2.32076e+04    2.01585e+04    3.30115e+01    0.00000e+00    0.00000e+00
T-Protein T-SOL T-CL-
6.32406e+01 1.61761e+01 6.73077e+01
----------------------------------------------------------------------------
Cluster OLD system 2:
Step Time Lambda
50000 100.00001 0.00000
Rel. Constraint Deviation: Max between atoms RMS
Before LINCS 0.062015 369 370 0.007971
After LINCS 0.000087 231 233 0.000021
Energies (kJ/mol)
G96Angle Proper Dih. Improper Dih. LJ-14 Coulomb-14
2.70775e+03 1.04157e+03 7.93145e+02 5.36208e+02 1.91215e+04
LJ (SR) LJ (LR) Coulomb (SR) Coul. recip. Position Rest.
7.67921e+04 -1.45280e+03 -4.98410e+05 -6.19374e+04 6.32109e+02
Potential Kinetic En. Total Energy Temperature Pressure (bar)
-4.60176e+05 7.54371e+04 -3.84739e+05 3.01252e+02 -1.61958e+02
Total NODE time on node 0: 2449.05
Average NODE time: 306.131
Load imbalance reduced performance to 800% of max
<====== ############### ==>
<==== A V E R A G E S ====>
<== ############### ======>
Energies (kJ/mol)
G96Angle Proper Dih. Improper Dih. LJ-14 Coulomb-14
2.65581e+03 1.10522e+03 8.58218e+02 5.56582e+02 1.91397e+04
LJ (SR) LJ (LR) Coulomb (SR) Coul. recip. Position Rest.
7.76235e+04 -1.45624e+03 -4.98446e+05 -6.19220e+04 6.48663e+02
Potential Kinetic En. Total Energy Temperature Pressure (bar)
-4.59237e+05 7.52388e+04 -3.83998e+05 3.00460e+02 2.82522e-01
Box-X Box-Y Box-Z Volume Density (SI)
7.36717e+00 6.94587e+00 6.01552e+00 3.07828e+02 1.02446e+03
pV
-2.19380e+01
Total Virial (kJ/mol)
2.50625e+04 3.74364e+01 -1.09807e+02
-1.21909e+02 2.52114e+04 -2.24475e+01
-1.09718e+02 6.28763e+01 2.49978e+04
Pressure (bar)
5.85471e+00 -9.46889e-01 1.42992e+01
1.55935e+01 -1.28647e+01 5.65603e+00
1.42304e+01 -3.13115e+00 7.85754e+00
Total Dipole (Debye)
-4.27471e+02 1.56256e+03 1.39198e+02
Epot (kJ/mol)                   Coul-SR          LJ-SR          LJ-LR        Coul-14          LJ-14
Protein-Protein            -6.43284e+03   -6.06852e+03   -1.93594e+02    1.91397e+04    5.56582e+02
Protein-Non-Protein        -6.07176e+03   -1.49620e+03   -1.99600e+02    0.00000e+00    0.00000e+00
Non-Protein-Non-Protein    -4.85942e+05    8.51882e+04   -1.06305e+03    0.00000e+00    0.00000e+00
T-Protein T-SOL T-CL-
2.99892e+02 3.00493e+02 3.02237e+02
<====== ############################### ==>
<==== R M S - F L U C T U A T I O N S ====>
<== ############################### ======>
Energies (kJ/mol)
G96Angle Proper Dih. Improper Dih. LJ-14 Coulomb-14
7.87191e+01 3.92754e+01 3.97988e+01 4.07835e+01 6.59014e+01
LJ (SR) LJ (LR) Coulomb (SR) Coul. recip. Position Rest.
1.42174e+03 9.85270e+00 3.66032e+03 1.38822e+02 5.13795e+01
Potential Kinetic En. Total Energy Temperature Pressure (bar)
2.79433e+03 1.14191e+03 3.71906e+03 4.56013e+00 4.15482e+02
Box-X Box-Y Box-Z Volume Density (SI)
1.73750e-02 1.63839e-02 1.41834e-02 2.20002e+00 7.10821e+00
pV
7.74026e+03
Total Virial (kJ/mol)
4.40353e+03 3.95656e+03 3.53923e+03
4.91163e+03 6.62942e+03 4.84215e+03
3.02705e+03 3.46706e+03 3.99411e+03
Pressure (bar)
4.80956e+02 4.27103e+02 3.82114e+02
5.31242e+02 7.15522e+02 5.21285e+02
3.27257e+02 3.74634e+02 4.39706e+02
Total Dipole (Debye)
2.65079e+02 3.01101e+02 2.66824e+02
Epot (kJ/mol)                   Coul-SR          LJ-SR          LJ-LR        Coul-14          LJ-14
Protein-Protein             6.55075e+01    4.63548e+01    6.28258e-01    6.59014e+01    4.07835e+01
Protein-Non-Protein         2.40165e+02    1.09021e+02    3.92781e+00    0.00000e+00    0.00000e+00
Non-Protein-Non-Protein     3.50328e+03    1.43654e+03    6.95241e+00    0.00000e+00    0.00000e+00
T-Protein T-SOL T-CL-
5.26064e+00 4.78751e+00 5.93233e+01
----------------------------------------------------------------------------
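
To make the size of the discrepancies easier to see, the averages of the two runs can be compared term by term. This is a small sketch using only values copied from the A V E R A G E S blocks above; the dictionary layout and the helper name `differences` are just illustrative choices, and the density unit assumes that "Density (SI)" means kg/m^3.

```python
# (cluster NEW / 4.0.4, cluster OLD / 3.3.1) averages for system 2,
# copied from the A V E R A G E S sections of the two logs above.
AVERAGES = {
    "Temperature (K)":       (3.28979e+02, 3.00460e+02),
    "T-Protein (K)":         (5.93312e+02, 2.99892e+02),
    "T-SOL (K)":             (3.12714e+02, 3.00493e+02),
    "T-CL- (K)":             (3.24327e+02, 3.02237e+02),
    "Total Energy (kJ/mol)": (-3.56552e+05, -3.83998e+05),
    "Density (kg/m^3)":      (9.90264e+02, 1.02446e+03),
}


def differences(pairs):
    """Return {name: (new - old, percent difference relative to |old|)}."""
    result = {}
    for name, (new, old) in pairs.items():
        diff = new - old
        result[name] = (diff, 100.0 * diff / abs(old))
    return result


for name, (diff, pct) in differences(AVERAGES).items():
    print(f"{name:24s} {diff:14.3f} {pct:9.2f} %")
```

The protein temperature group stands out: its average in the 4.0.4 run is almost twice the 3.3.1 value, while the solvent group differs by only about 12 K.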
What could be the origin of such discrepancies between 3.3.1 and 4.0.3/4?
Is any change in the MD protocol strongly recommended when converting
input/script files from 3.3 to 4.0?
I searched the Gromacs mailing lists and docs, but I could not find any
useful hint or other reports of the same problem, so I apologize in advance
if I have missed this information.
Best regards,
Pietro
--
Dr. Pietro Amodeo, PhD.
Istituto di Chimica Biomolecolare del CNR
Comprensorio "A. Olivetti", Edificio 70
Via Campi Flegrei 34
I-80078 Pozzuoli (Napoli) - Italy
Phone +39-0818675072
Fax +39-0818041770
Email pamodeo at icmib.na.cnr.it