[gmx-users] try to restart a md run using cpt file

Thielges, Sabine Sabine.Thielges at cnrc-nrc.gc.ca
Tue Nov 3 17:53:40 CET 2009


Hi,

One of my trajectories stopped because I ran out of disk space, so I freed some space and tried to restart it using 

mpiexec -np 8 mdrun.MPI -s md_20_100_s_IP_beta.tpr -cpi md_20_100_s_IP_beta.cpt -append 

and also 
 
mpiexec -np 8 mdrun.MPI -s md_20_100_s_IP_beta.tpr -cpi md_20_100_s_IP_beta_prev.cpt -append 

but each time it gives me an error message saying that truncation of the .trr file failed.

Here is the error.log file:

nohup: appending output to `nohup.out'
NNODES=8, MYRANK=0, HOSTNAME=fnode21
NODEID=0 argc=6
NNODES=8, MYRANK=1, HOSTNAME=fnode21
NODEID=1 argc=6
NNODES=8, MYRANK=2, HOSTNAME=fnode21
NODEID=2 argc=6
NNODES=8, MYRANK=3, HOSTNAME=fnode21
NODEID=3 argc=6
NNODES=8, MYRANK=4, HOSTNAME=fnode21
NODEID=4 argc=6
NNODES=8, MYRANK=5, HOSTNAME=fnode21
NODEID=5 argc=6
NNODES=8, MYRANK=6, HOSTNAME=fnode21
NODEID=6 argc=6
NNODES=8, MYRANK=7, HOSTNAME=fnode21
NODEID=7 argc=6
                         :-)  G  R  O  M  A  C  S  (-:

             Gallium Rubidium Oxygen Manganese Argon Carbon Silicon

                            :-)  VERSION 4.0.3  (-:


      Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
             Copyright (c) 2001-2008, The GROMACS development team,
            check out http://www.gromacs.org for more information.

         This program is free software; you can redistribute it and/or
          modify it under the terms of the GNU General Public License
         as published by the Free Software Foundation; either version 2
             of the License, or (at your option) any later version.

                              :-)  mdrun.MPI  (-:

Option     Filename  Type         Description
------------------------------------------------------------
  -s md_20_100_s_IP_beta.tpr  Input        Run input file: tpr tpb tpa
  -o       traj.trr  Output       Full precision trajectory: trr trj cpt
  -x       traj.xtc  Output, Opt. Compressed trajectory (portable xdr format)
-cpi md_20_100_s_IP_beta_prev.cpt  Input, Opt!  Checkpoint file
-cpo      state.cpt  Output, Opt. Checkpoint file
  -c    confout.gro  Output       Structure file: gro g96 pdb
  -e       ener.edr  Output       Energy file: edr ene
  -g         md.log  Output       Log file
-dgdl      dgdl.xvg  Output, Opt. xvgr/xmgr file
-field    field.xvg  Output, Opt. xvgr/xmgr file
-table    table.xvg  Input, Opt.  xvgr/xmgr file
-tablep  tablep.xvg  Input, Opt.  xvgr/xmgr file
-tableb   table.xvg  Input, Opt.  xvgr/xmgr file
-rerun    rerun.xtc  Input, Opt.  Trajectory: xtc trr trj gro g96 pdb cpt
-tpi        tpi.xvg  Output, Opt. xvgr/xmgr file
-tpid   tpidist.xvg  Output, Opt. xvgr/xmgr file
 -ei        sam.edi  Input, Opt.  ED sampling input
 -eo        sam.edo  Output, Opt. ED sampling output
  -j       wham.gct  Input, Opt.  General coupling stuff
 -jo        bam.gct  Output, Opt. General coupling stuff
-ffout      gct.xvg  Output, Opt. xvgr/xmgr file
-devout   deviatie.xvg  Output, Opt. xvgr/xmgr file
-runav  runaver.xvg  Output, Opt. xvgr/xmgr file
 -px      pullx.xvg  Output, Opt. xvgr/xmgr file
 -pf      pullf.xvg  Output, Opt. xvgr/xmgr file
-mtx         nm.mtx  Output, Opt. Hessian matrix
 -dn     dipole.ndx  Output, Opt. Index file

Option       Type   Value   Description
------------------------------------------------------
-[no]h       bool   no      Print help info and quit
-nice        int    0       Set the nicelevel
-deffnm      string         Set the default filename for all file options
-[no]xvgr    bool   yes     Add specific codes (legends etc.) in the output
                            xvg files for the xmgrace program
-[no]pd      bool   no      Use particle decompostion
-dd          vector 0 0 0   Domain decomposition grid, 0 is optimize
-npme        int    -1      Number of separate nodes to be used for PME, -1
                            is guess
-ddorder     enum   interleave  DD node order: interleave, pp_pme or cartesian
-[no]ddcheck bool   yes     Check for all bonded interactions with DD
-rdd         real   0       The maximum distance for bonded interactions with
                            DD (nm), 0 is determine from initial coordinates
-rcon        real   0       Maximum distance for P-LINCS (nm), 0 is estimate
-dlb         enum   auto    Dynamic load balancing (with DD): auto, no or yes
-dds         real   0.8     Minimum allowed dlb scaling of the DD cell size
-[no]sum     bool   yes     Sum the energies at every step
-[no]v       bool   no      Be loud and noisy
-[no]compact bool   yes     Write a compact log file
-[no]seppot  bool   no      Write separate V and dVdl terms for each
                            interaction type and node to the log file(s)
-pforce      real   -1      Print all forces larger than this (kJ/mol nm)
-[no]reprod  bool   no      Try to avoid optimizations that affect binary
                            reproducibility
-cpt         real   15      Checkpoint interval (minutes)
-[no]append  bool   yes     Append to previous output files when restarting
                            from checkpoint
-maxh        real   -1      Terminate after 0.99 times this time (hours)
-multi       int    0       Do multiple simulations in parallel
-replex      int    0       Attempt replica exchange every # steps
-reseed      int    -1      Seed for replica exchange, -1 is generate a seed
-[no]glas    bool   no      Do glass simulation with special long range
                            corrections
-[no]ionize  bool   no      Do a simulation including the effect of an X-Ray
                            bombardment on your system

Reading file md_20_100_s_IP_beta.tpr, VERSION 4.0.3 (single precision)

Reading checkpoint file md_20_100_s_IP_beta_prev.cpt generated: Sat Oct 31 12:34:46 2009


-------------------------------------------------------
Program mdrun.MPI, VERSION 4.0.3
Source code file: checkpoint.c, line: 1238

Fatal error:
Truncation of file md_20_100_s_IP_beta.trr failed.
-------------------------------------------------------

"Do You Have a Mind of Your Own ?" (Garbage)

Error on node 0, will try to stop all the nodes
Halting parallel program mdrun.MPI on CPU 0 out of 8

gcq#280: "Do You Have a Mind of Your Own ?" (Garbage)

[fnode21:27263] MPI_ABORT invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode -1
mpiexec noticed that job rank 1 with PID 27267 on node fnode21 exited on signal 15 (Terminated). 


Can anyone help me understand what the problem is?

Thank you in advance.

Sabine
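
Editorial note: before retrying the restart, it may help to confirm that the output files mdrun wants to append to still exist and are writable. The sketch below is only a diagnostic aid, not a known fix for this error. The function name check_output_file is hypothetical, and the file names in the usage comments are taken from the commands above.

```shell
#!/bin/sh
# Hypothetical helper (not part of GROMACS): report whether an output file
# that mdrun would append to exists and is writable, and print its size.
check_output_file() {
    f="$1"
    if [ ! -e "$f" ]; then
        echo "missing: $f"
        return 1
    fi
    if [ ! -w "$f" ]; then
        echo "not writable: $f"
        return 1
    fi
    # wc -c reads the file from stdin, so only the byte count is printed
    echo "ok: $f ($(wc -c < "$f") bytes)"
}

# Example usage (file name taken from the error message above):
#   check_output_file md_20_100_s_IP_beta.trr
#   df -h .    # free space on the file system holding the run directory
```

The option table in the log lists -[no]append, so another thing to try, after backing up the existing output files, is the same command with -noappend in place of -append. Whether that avoids the truncation step, and how the new output files are named, is not shown in the log above and is an assumption to check against the mdrun help for this version.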