[gmx-users] continue replica exchange MD
Kukol, Andreas
a.kukol at herts.ac.uk
Wed Mar 21 16:36:10 CET 2012
Hello,
Upon continuing a replica exchange MD simulation using the command
mdrun -cpi state.cpt -append -s tpr_remd20ns_.tpr -multi 48 -replex 10000 -cpt 60 -x xtcRemd_20ns.xtc -c afterRemd_20ns.gro -g logRemd_20ns.log -v -e edrRemd_20ns.edr -stepout 2000
I get the following output:
**************************************
...
...
5000000 steps, 10000.0 ps (continuing from step 49430, 98.9 ps).
5000000 steps, 10000.0 ps (continuing from step 49430, 98.9 ps).
step 49430, will finish Wed Sep 12 16:09:33 2012
step 50000, will finish Thu May 24 11:23:04 2012
Step 47546: resetting all time and cycle counters
=>> PBS: job killed: walltime 604823 exceeded limit 604800
Terminated
******************************************
The job runs for one week on a computer cluster (the maximum walltime allowed), but it barely progresses beyond step 49430: the last step reported is 50000.
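To make the numbers in that output easier to check, here is a minimal Python sketch of the arithmetic. All values are copied from the mdrun output and the command line above; the variable names are only illustrative, and nothing here is GROMACS code.

```python
# Numbers copied from the mdrun output and the command line above.
total_steps = 5_000_000      # "5000000 steps"
total_time_ps = 10_000.0     # "10000.0 ps"
restart_step = 49_430        # "continuing from step 49430"
last_logged_step = 50_000    # last step reported before the job was killed
replex_interval = 10_000     # value passed to -replex
walltime_limit_s = 604_800   # PBS walltime limit in seconds

dt_ps = total_time_ps / total_steps           # time step implied by the output
restart_time_ps = restart_step * dt_ps        # compare with the reported 98.9 ps
steps_done = last_logged_step - restart_step  # progress logged after the restart
walltime_days = walltime_limit_s / 86_400     # limit expressed in days
at_exchange = last_logged_step % replex_interval == 0

print(f"dt = {dt_ps} ps, restart time = {restart_time_ps:.2f} ps")
print(f"steps logged after restart: {steps_done}")
print(f"walltime limit: {walltime_days} days")
print(f"last logged step is an exchange step: {at_exchange}")
```

With these numbers, the restarted run logged only 570 steps in seven days. Step 50000 is also a multiple of the -replex interval, so the last reported step coincides with a replica exchange attempt. The output alone does not establish whether that is related to the stall.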
The log file also shows no steps beyond step 50000:
************************************************
Step Time Lambda
46455 92.91000 0.00000
Grid: 18 x 17 x 25 cells
Energies (kJ/mol)
G96Angle Proper Dih. Ryckaert-Bell. Improper Dih. LJ-14
5.83095e+04 3.70277e+04 2.14102e+03 8.83853e+03 -7.33070e+02
Coulomb-14 LJ (SR) LJ (LR) Disper. corr. Coulomb (SR)
2.29503e+05 3.04138e+05 -2.66781e+04 -8.51221e+03 -2.74692e+06
Coul. recip. Position Rest. Potential Kinetic En. Total Energy
-9.59421e+05 5.41532e+03 -3.09689e+06 5.18959e+05 -2.57793e+06
Temperature Pres. DC (bar) Pressure (bar) Constr. rmsd
2.97550e+02 -1.14933e+02 5.41944e+01 0.00000e+00
Writing checkpoint, step 49430 at Fri Jan 27 09:43:23 2012
-----------------------------------------------------------
Restarting from checkpoint, appending to previous log file.
...
...
Started mdrun on node 0 Tue Mar 6 16:40:10 2012
Step Time Lambda
49430 98.86000 0.00000
Grid: 18 x 17 x 25 cells
Energies (kJ/mol)
G96Angle Proper Dih. Ryckaert-Bell. Improper Dih. LJ-14
5.84241e+04 3.69121e+04 2.09533e+03 8.80916e+03 -4.67086e+02
Coulomb-14 LJ (SR) LJ (LR) Disper. corr. Coulomb (SR)
2.29528e+05 2.99825e+05 -2.67028e+04 -8.51334e+03 -2.74410e+06
Coul. recip. Position Rest. Potential Kinetic En. Total Energy
-9.59506e+05 5.47116e+03 -3.09823e+06 5.18993e+05 -2.57923e+06
Temperature Pres. DC (bar) Pressure (bar) Constr. rmsd
2.97570e+02 -1.14963e+02 2.67842e+00 0.00000e+00
Step Time Lambda
50000 100.00000 0.00000
Energies (kJ/mol)
G96Angle Proper Dih. Ryckaert-Bell. Improper Dih. LJ-14
5.86161e+04 3.71585e+04 2.15336e+03 8.92946e+03 -4.84684e+02
Coulomb-14 LJ (SR) LJ (LR) Disper. corr. Coulomb (SR)
2.29950e+05 3.01014e+05 -2.66724e+04 -8.51306e+03 -2.74349e+06
Coul. recip. Position Rest. Potential Kinetic En. Total Energy
-9.59537e+05 5.56712e+03 -3.09531e+06 5.19371e+05 -2.57594e+06
Temperature Pres. DC (bar) Pressure (bar) Constr. rmsd
2.97787e+02 -1.14956e+02 2.36460e+01 1.50068e-05
[End of log-file]
***********************************
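With 48 replicas there are 48 such log files, so it may help to extract the recorded steps automatically. Below is a minimal Python sketch, assuming the log layout shown above, where each "Step Time Lambda" header is followed by a line of values. The function name logged_steps is hypothetical, and the sample text is a shortened excerpt of the log with the energy blocks omitted.

```python
import re

# Excerpt in the same layout as the log above (energy blocks omitted).
log_text = """\
Step Time Lambda
46455 92.91000 0.00000
Writing checkpoint, step 49430 at Fri Jan 27 09:43:23 2012
Step Time Lambda
49430 98.86000 0.00000
Step Time Lambda
50000 100.00000 0.00000
"""

def logged_steps(text):
    """Return (step, time_ps) for each value line that follows a 'Step Time Lambda' header."""
    pattern = re.compile(
        r"^\s*Step\s+Time\s+Lambda\s*\n\s*(\d+)\s+([\d.]+)",
        re.MULTILINE,
    )
    return [(int(step), float(time)) for step, time in pattern.findall(text)]

frames = logged_steps(log_text)
for step, time_ps in frames:
    print(step, time_ps)
```

Running this on each replica's log would show whether every replica stops at the same step or only some of them do.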
I wonder whether this is my mistake (using mdrun incorrectly), a Gromacs problem, or perhaps a problem with the computer cluster (MPI, etc.). I would be grateful for any help.
Many thanks
Andreas