[gmx-users] RMSD truncation Restart simulation problems

Henri Mone henriMone at gmail.com
Tue Mar 8 11:41:00 CET 2011


Hi All, hi Mark,
Here are some more details. The outputs and error messages are
attached at the end of this e-mail. After truncation I get the error
message [1a]: GROMACS cannot compute the checksum of the trr files.
After truncation all the trajectories (xtc, trr) have the same length of
27752 frames [1b]. All the edr files end at the same time, t = 277518 ps
(138760 frames written) [1b]. The cpt files used after truncation have
step = 138762700 and t = 277525.400000 [1c].
Before truncation I got the error message [2]: GROMACS complains that
the 32 subsystems are not compatible.
Does anyone have an idea what is going wrong?

Thanks,
Henri



====1a: AFTER TRUNCATION: ERROR MESSAGE
Reading checkpoint file state1.cpt generated: Thu Jan 27 02:19:50 2011
  #PME-nodes mismatch,
    current program: -1
    checkpoint file: 0
Reading checkpoint file state2.cpt generated: Thu Jan 27 02:19:50 2011
  #PME-nodes mismatch,
    current program: -1
    checkpoint file: 0
Gromacs binary or parallel settings not identical to previous run.
Continuation is exact, but is not guaranteed to be binary identical.
...
-------------------------------------------------------
Program mdrun_mpi, VERSION 4.5.3
Source code file: checkpoint.c, line: 1767
Fatal error:
Can't read 1048576 bytes of 'traj1.trr' to compute checksum. The file
has been replaced or its contents has been modified.
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
-------------------------------------------------------
Program mdrun_mpi, VERSION 4.5.3
Source code file: checkpoint.c, line: 1767
Fatal error:
Can't read 1048576 bytes of 'traj2.trr' to compute checksum. The file
has been replaced or its contents has been modified.
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
Error on node 1, will try to stop all the nodes
Halting parallel program mdrun_mpi on CPU 1 out of 32
gcq#307: "Good Music Saves your Soul" (Lemmy)
[n030212:18418] MPI_ABORT invoked on rank 1 in communicator
MPI_COMM_WORLD with errorcode -1



====1b: AFTER TRUNCATION: XTC TRR
$ gmxcheck -f traj0.xtc
Checking file traj0.xtc
Reading frame       0 time    0.000
# Atoms  224
Precision 0.001 (nm)
Reading frame   27000 time 270000.000
Item        #frames Timestep (ps)
Step         27752    10
Time         27752    10
Lambda           0
Coords       27752    10
Velocities       0
Forces           0
Box          27752    10
...
$ gmxcheck -f traj31.xtc
Checking file traj31.xtc
Reading frame       0 time    0.000
# Atoms  224
Precision 0.001 (nm)
Reading frame   27000 time 270000.000
Item        #frames Timestep (ps)
Step         27752    10
Time         27752    10
Lambda           0
Coords       27752    10
Velocities       0
Forces           0
Box          27752    10

$ gmxcheck -f traj0.trr
Checking file traj0.trr
trn version: GMX_trn_file (single precision)
Reading frame       0 time    0.000
# Atoms  6647
Reading frame   27000 time 270000.000
Item        #frames Timestep (ps)
Step         27752    10
Time         27752    10
Lambda       27752    10
Coords       27752    10
Velocities   27752    10
Forces           0
Box          27752    10
$ gmxcheck -f traj1.trr
Checking file traj1.trr
trn version: GMX_trn_file (single precision)
Reading frame       0 time    0.000
# Atoms  6647
Reading frame   27000 time 270000.000
Item        #frames Timestep (ps)
Step         27752    10
Time         27752    10
Lambda       27752    10
Coords       27752    10
Velocities   27752    10
Forces           0
Box          27752    10
...
$ gmxcheck -f traj31.trr
Checking file traj31.trr
trn version: GMX_trn_file (single precision)
Reading frame       0 time    0.000
# Atoms  6647
Reading frame   27000 time 270000.000
Item        #frames Timestep (ps)
Step         27752    10
Time         27752    10
Lambda       27752    10
Coords       27752    10
Velocities   27752    10
Forces           0
Box          27752    10

$ eneconv -f ener0.edr
Reading energy frame      0 time    0.000
Continue writing frames from t=0, step=0
Last energy frame read 138759 time 277518.000
Last step written from ener0.edr: t 277518, step 138759000
Last frame written was at step 138759000, time 277518.000000
Wrote 138760 frames
...
$ eneconv -f ener31.edr
Reading energy frame      0 time    0.000
Continue writing frames from t=0, step=0
Last energy frame read 138759 time 277518.000
Last step written from ener31.edr: t 277518, step 138759000
Last frame written was at step 138759000, time 277518.000000
Wrote 138760 frames





====1c: AFTER TRUNCATION: CPT
state0.cpt:
generation time = Thu Jan 27 02:19:50 2011
step = 138762700
t = 277525.400000
...
state31.cpt:
generation time = Thu Jan 27 02:19:50 2011
step = 138762700
t = 277525.400000


$ gmxdump -cp state0.cpt|less
GROMACS version = 4.5.3
GROMACS build time = Fri Dec  3 03:20:53 CET 2010
GROMACS build user = user at cluster
GROMACS build machine = Linux 2.6.18-194.17.4.el5 x86_64
generating program = /opt/gromacs-4.5.3/bin/mdrun_mpi
generation time = Thu Jan 27 02:19:50 2011
checkpoint file version = 12
generating host = n040407
#atoms = 6647
#T-coupling groups = 1
#Nose-Hoover T-chains = 0
#Nose-Hoover T-chains for barostat  = 0
integrator = 0
simulation part # = 18
step = 138762700
t = 277525.400000
#PP-nodes = 1
dd_nc[x] = 1
dd_nc[y] = 1
dd_nc[z] = 1
#PME-only nodes = 0
state flags = 6594
ekin data flags = 0
energy history flags = 255
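
As a rough cross-check of the numbers in [1b] and [1c], here is a minimal
Python sketch of the arithmetic. It only uses values copied from the output
above. It assumes that the frames are evenly spaced and that frame 0 is at
t = 0, as gmxcheck reports; the variable names are my own and are not part
of GROMACS.

```python
# Values copied from the gmxcheck, eneconv and gmxdump output above.
cpt_step = 138762700        # "step" in state0.cpt
cpt_time_ps = 277525.4      # "t" in state0.cpt
traj_frames = 27752         # frames in every xtc/trr file
traj_interval_ps = 10       # "Timestep (ps)" reported by gmxcheck
edr_last_time_ps = 277518.0 # last energy frame reported by eneconv

# Integration time step implied by the checkpoint (ps per MD step).
dt_ps = cpt_time_ps / cpt_step

# Time of the last trajectory frame, assuming frame 0 is at t = 0
# and the frames are evenly spaced.
traj_last_time_ps = (traj_frames - 1) * traj_interval_ps

# How far each output file ends before the checkpoint time.
gap_traj_ps = cpt_time_ps - traj_last_time_ps
gap_edr_ps = cpt_time_ps - edr_last_time_ps

print(f"implied time step   : {dt_ps:.4f} ps")
print(f"last xtc/trr frame  : {traj_last_time_ps} ps")
print(f"last edr frame      : {edr_last_time_ps} ps")
print(f"checkpoint time     : {cpt_time_ps} ps")
print(f"cpt minus trajectory: {gap_traj_ps:.1f} ps")
print(f"cpt minus energy    : {gap_edr_ps:.1f} ps")
```

Under these assumptions the truncated trajectories and energy files all end
a few picoseconds before the time stored in the checkpoint files.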


====2: BEFORE TRUNCATION
$ less md2.log
Initializing Replica Exchange
Repl  There are 32 replicas:
Multi-checking the number of atoms ... OK
Multi-checking the integrator ... OK
Multi-checking init_step+nsteps ... OK
Multi-checking first exchange step: init_step/-replex ...
first exchange step: init_step/-replex is not equal for all subsystems
  subsystem 0: 70425
  subsystem 1: 70437
  subsystem 2: 70437
  subsystem 3: 70437
  subsystem 4: 70437
  subsystem 5: 70437
  subsystem 6: 70437
  subsystem 7: 70437
  subsystem 8: 70437
  subsystem 9: 70437
  subsystem 10: 70437
  subsystem 11: 70437
  subsystem 12: 70437
  subsystem 13: 70437
  subsystem 14: 70437
  subsystem 15: 70437
  subsystem 16: 70425
  subsystem 17: 70437
  subsystem 18: 70437
  subsystem 19: 70437
  subsystem 20: 70437
  subsystem 21: 70437
  subsystem 22: 70437
  subsystem 23: 70437
  subsystem 24: 70425
  subsystem 25: 70437
  subsystem 26: 70437
  subsystem 27: 70437
  subsystem 28: 70437
  subsystem 29: 70437
  subsystem 30: 70437
  subsystem 31: 70437
-------------------------------------------------------
Program mdrun_mpi, VERSION 4.5.3
Source code file: main.c, line: 189
Fatal error:
The 32 subsystems are not compatible
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
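
To see at a glance which replicas disagree, the "subsystem N: value" lines
of md2.log can be grouped by value. The following is a minimal Python
sketch; the helper name group_subsystems is hypothetical (it is not a
GROMACS tool), and the excerpt contains only a few of the 32 lines copied
from the log above.

```python
import re
from collections import defaultdict


def group_subsystems(log_text):
    """Map each first-exchange-step value to the subsystems reporting it."""
    groups = defaultdict(list)
    # Matches lines of the form "  subsystem 16: 70425".
    for match in re.finditer(r"subsystem\s+(\d+):\s+(\d+)", log_text):
        subsystem = int(match.group(1))
        value = int(match.group(2))
        groups[value].append(subsystem)
    return dict(groups)


# A few lines copied from md2.log above (not the full list).
excerpt = """
  subsystem 0: 70425
  subsystem 1: 70437
  subsystem 2: 70437
  subsystem 16: 70425
  subsystem 17: 70437
  subsystem 24: 70425
  subsystem 25: 70437
"""

groups = group_subsystems(excerpt)
for value, subsystems in sorted(groups.items()):
    print(f"{value}: subsystems {subsystems}")
```

Applied to the full log, this puts subsystems 0, 16 and 24 in one group
(70425) and the other 29 subsystems in a second group (70437).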



$ gmxdump -cp state0.cpt|less
GROMACS version = 4.5.3
GROMACS build time = Fri Dec  3 03:20:53 CET 2010
GROMACS build user = user at cluster
GROMACS build machine = Linux 2.6.18-194.17.4.el5 x86_64
generating program = /opt/gromacs-4.5.3/bin/mdrun_mpi
generation time = Thu Jan 27 15:08:32 2011
checkpoint file version = 12
generating host = n040407
#atoms = 6647
#T-coupling groups = 1
#Nose-Hoover T-chains = 0
#Nose-Hoover T-chains for barostat  = 0
integrator = 0
simulation part # = 19
step = 140849180
t = 281698.360000
#PP-nodes = 1
dd_nc[x] = 1
dd_nc[y] = 1
dd_nc[z] = 1
#PME-only nodes = 0
state flags = 6594
ekin data flags = 0
...


$ gmxcheck -f traj0.xtc
Reading frame       0 time    0.000
# Atoms  224
Precision 0.001 (nm)
Reading frame   28000 time 280000.000
Item        #frames Timestep (ps)
Step         28170    10
Time         28170    10
Lambda           0
Coords       28170    10
Velocities       0
Forces           0
Box          28170    10


$ gmxcheck -f traj1.xtc
Reading frame       0 time    0.000
# Atoms  224
Precision 0.001 (nm)
Reading frame   28000 time 280000.000
Item        #frames Timestep (ps)
Step         28175    10
Time         28175    10
Lambda           0
Coords       28175    10
Velocities       0
Forces           0
Box          28175    10


