[gmx-users] How exact are exact restarts?

Tue Jul 12 08:22:02 CEST 2005

Hi,

I found some curious behaviour while verifying that "exact restarts" are 
indeed exact.

Using version 3.3_beta_20050202 on Itanium2, if I run

grompp -v -f simulate_60 -p 1oeh.top -c 1oeh_0 -t 1oeh_0 -e 1oeh_0 -o 1oeh_simulate -np 1
mpirun mdrun -v -s 1oeh_simulate -o 1oeh_1 -c 1oeh_1 -g 1oeh_1 -e 1oeh_1

to do a 60-step simulation from the *_0 starting point, I get different 
results from step 40 onwards than if I run a 30-step simulation from the 
same *_0 starting point

grompp -v -f simulate_30 -p 1oeh.top -c 1oeh_0 -t 1oeh_0 -e 1oeh_0 -o 1oeh_simulate -np 1
mpirun mdrun -v -s 1oeh_simulate -o 1oeh_1 -c 1oeh_1 -g 1oeh_1 -e 1oeh_1

followed by another 30-step simulation from the *_1 starting point

grompp -v -f simulate_30 -p 1oeh.top -c 1oeh_1 -t 1oeh_1 -e 1oeh_1 -o 1oeh_simulate -np 1
mpirun mdrun -v -s 1oeh_simulate -o 1oeh_2 -c 1oeh_2 -g 1oeh_2 -e 1oeh_2

even though all the .mdp scripts have

constraints         =  all-bonds
unconstrained_start =  yes
gen_vel             =  no

which are supposed to permit exact restarts, and even though the 
information that has to come from the .edr energy files (for exact 
restarts) was available to grompp (see the -e option on the command lines).

The log file of the 60-step run includes

            Step           Time         Lambda
              30        0.06000        0.00000

    Rel. Constraint Deviation:  Max    between atoms     RMS
        Before LINCS         0.022406     38     39   0.004794
         After LINCS         0.000076     91     93   0.000023

    Energies (kJ/mol)
           Angle    Proper Dih. Ryckaert-Bell.          LJ-14     Coulomb-14
     2.66279e+02    8.93354e+00    4.45304e+01    1.37763e+02    1.03153e+03
         LJ (SR)   Coulomb (SR)   Coul. recip.      Potential    Kinetic En.
     7.32826e+04   -4.43550e+05   -3.45184e+04   -4.03297e+05    7.26535e+04
    Total Energy    Temperature Pressure (bar)
    -3.30644e+05    2.98271e+02   -1.54504e+01

which is identical to the output at the final step (30) of the first of 
the two 30-step runs. The start of the second 30-step run has

    Energies (kJ/mol)
           Angle    Proper Dih. Ryckaert-Bell.          LJ-14     Coulomb-14
     2.66279e+02    8.93354e+00    4.45304e+01    1.37763e+02    1.03153e+03
         LJ (SR)   Coulomb (SR)   Coul. recip.      Potential    Kinetic En.
     7.32826e+04   -4.43550e+05   -3.45184e+04   -4.03297e+05    7.26535e+04
    Total Energy    Temperature Pressure (bar)
    -3.30643e+05    2.98271e+02   -1.54774e+01

At output precision this differs only in Pressure and in the last printed 
digit of Total Energy. By the next output (step "40"), however, all the 
energy component values, as well as T and P, differ between the two 
ostensibly equivalent states.
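
The comparison of the two step-30 outputs can be sketched in a few lines of Python. The values are transcribed from the log excerpts above; the names run_60, run_30_restart and differing_terms are illustrative only and are not part of any GROMACS tool. Besides Pressure, the comparison also flags the last printed digit of Total Energy. The printed Potential and Kinetic En. sum to -330643.5, which sits on the rounding boundary between the two printed totals, so that particular difference may be no more than a print-rounding effect.

```python
# Energy terms at step 30, transcribed from the two log excerpts above.
# All names here are illustrative; nothing is read from real GROMACS files.
run_60 = {
    "Angle": 2.66279e+02,
    "Proper Dih.": 8.93354e+00,
    "Ryckaert-Bell.": 4.45304e+01,
    "LJ-14": 1.37763e+02,
    "Coulomb-14": 1.03153e+03,
    "LJ (SR)": 7.32826e+04,
    "Coulomb (SR)": -4.43550e+05,
    "Coul. recip.": -3.45184e+04,
    "Potential": -4.03297e+05,
    "Kinetic En.": 7.26535e+04,
    "Total Energy": -3.30644e+05,
    "Temperature": 2.98271e+02,
    "Pressure (bar)": -1.54504e+01,
}

# Start of the second 30-step run: same printed values except two terms.
run_30_restart = dict(run_60)
run_30_restart["Total Energy"] = -3.30643e+05
run_30_restart["Pressure (bar)"] = -1.54774e+01


def differing_terms(a, b):
    """Return the names of terms whose printed values differ, in log order."""
    return [name for name in a if a[name] != b[name]]


diff = differing_terms(run_60, run_30_restart)
for name in diff:
    print(f"{name:>15}: {run_60[name]:.5e} vs {run_30_restart[name]:.5e}")

# Sum of the printed Potential and Kinetic En. terms, for comparison with
# the two printed Total Energy values.
total_from_parts = run_60["Potential"] + run_60["Kinetic En."]
print(f"Potential + Kinetic En. = {total_from_parts}")
```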

I can understand the butterfly effect making exact reproduction 
impossible, unless complete machine precision is stored in the .trr and 
.edr files. The fact that all the energy components are correct (to 
output precision) suggests that the problem does not lie in getting the 
correct information from the .trr file. Is it possible that the relevant 
information in the .edr file necessary to reproduce the instantaneous 
pressure is not at full machine precision (either there, or after 
filtering through grompp)? The chip is an Itanium 2, so it has 64-bit 
words... I can see how an erroneous 32-bit assumption in (for example) 
grompp might cause this.
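
To make the size of such an effect concrete, here is a minimal Python sketch of what happens when a 64-bit value is pushed through 32-bit storage and read back. It does not claim that grompp or the .edr format actually does this; that is exactly the open question. The helper name through_float32 is hypothetical, and the pressure value is used only as a convenient sample number from the log.

```python
import struct


def through_float32(x):
    """Round-trip a Python float (a 64-bit double) through 32-bit storage."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Sample value taken from the step-30 log output; which quantities are
# really stored, and at what precision, is an assumption of this sketch.
pressure = -1.54504e+01
stored = through_float32(pressure)
rel_err = abs(stored - pressure) / abs(pressure)

print(f"original : {pressure!r}")
print(f"restored : {stored!r}")
print(f"rel. err.: {rel_err:.3e}")
```

The relative error is far below the six significant figures printed in the log, so a truncation of this kind would be invisible in the step-30 output yet would still change the state the restart begins from.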

Even if so, there is clearly a system perturbation such that the two 
step "40" configurations are in different regions of phase space. How 
has this occurred?
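
As a toy illustration of how fast a perturbation in the last stored digits can grow, the sketch below iterates the logistic map from two starting values that differ by 1e-8. The logistic map is a stand-in chosen only because it is chaotic; it is not a model of the MD integrator, and the function name, threshold and starting values are all assumptions made for the example.

```python
def steps_until_divergence(x0, perturbation, threshold=1e-3, max_steps=1000):
    """Count iterations of x -> 4x(1-x) until two nearby starts separate.

    Returns the first step at which the two trajectories differ by more
    than `threshold`, or None if that never happens within `max_steps`.
    """
    a, b = x0, x0 + perturbation
    for step in range(1, max_steps + 1):
        a = 4.0 * a * (1.0 - a)
        b = 4.0 * b * (1.0 - b)
        if abs(a - b) > threshold:
            return step
    return None


n = steps_until_divergence(0.3, 1e-8)
print(f"trajectories differ by more than 1e-3 after {n} steps")
```

With no perturbation the two trajectories stay bit-identical, which is the behaviour an exact restart is supposed to guarantee.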

Regards,

Mark