[gmx-users] Re: Count mismatch for state entry SDx, code count is 754728, file count is 0

Chris Neale chris.neale at utoronto.ca
Thu May 5 18:38:38 CEST 2011


Apologies: my.tpr and my_prev.tpr should have read my.cpt and my_prev.cpt.

On 11-05-05 12:36 PM, Chris Neale wrote:
> Dear Users:
>
> Using gromacs 4.0.5, I find that there are at least some cases where 
> some type of disk error can get propagated through both my.tpr and 
> my_prev.tpr, complicating restarts. This used to be a bigger problem 
> in gromacs 3, and I don't recall ever seeing it in gromacs 4 so I 
> thought I would post a notification.
>
> I'm just going to extract some coordinates and restart, but ideally 
> this wouldn't happen. A Google search for the relevant error, "Count 
> mismatch for state entry", only turns up some online source code.
>
> I don't know whether this error occurs in 4.5.3, and since the run is 
> not binary reproducible, that would be difficult to check. Still, the 
> error checking that normally runs before the previous (intact) 
> _prev.cpt file is overwritten with a new (corrupted) one does not 
> seem to have caught this problem, at least with gromacs 4.0.5.
>
> The run that wrote out the .cpt files finished normally due to -maxh, 
> with a stderr that looked like this:
>
> ... < snip > ...
> starting mdrun 'Generated by genbox'
> 10000000 steps,  20000.0 ps (continuing from step 3769350,   7538.7 ps).
> [gpc-f138n034:06165] 15 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
> [gpc-f138n034:06165] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>
> Step 5036590: Run time exceeded 47.322 hours, will terminate the run
>
> Step 5036600: Run time exceeded 47.322 hours, will terminate the run
>
>  Average load imbalance: 0.2 %
>  Part of the total run time spent waiting due to load imbalance: 0.2 %
>  Steps where the load balancing was limited by -rdd, -rcon and/or -dds: X 0 % Z 0 %
>  Average PME mesh/force load: 0.745
>  Part of the total run time spent waiting due to PP/PME imbalance: 4.9 %
>
>
>         Parallel run - timing based on wallclock.
>
>                NODE (s)   Real (s)      (%)
>        Time: 170485.000 170485.000    100.0
>                        1d23h21:25
>                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
> Performance:    625.583     31.889      1.284     18.685
>
> gcq#165: "I'm a Jerk" (F. Black)
>
>
> gcq#165: "I'm a Jerk" (F. Black)
>
> #############################################
>
> And then when I gmxcheck both of the .cpt files I get the exact same 
> error, although the files do differ:
>
> $ diff md1.cpt md1_prev.cpt
> Binary files md1.cpt and md1_prev.cpt differ
>
>
> $ gmxcheck  -f md1.cpt
>                          :-)  G  R  O  M  A  C  S  (-:
>
>                               S  C  A  M  O  R  G
>
>                             :-)  VERSION 4.0.5  (-:
>
>
>       Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
>        Copyright (c) 1991-2000, University of Groningen, The Netherlands.
>              Copyright (c) 2001-2008, The GROMACS development team,
>             check out http://www.gromacs.org for more information.
>
>          This program is free software; you can redistribute it and/or
>           modify it under the terms of the GNU General Public License
>          as published by the Free Software Foundation; either version 2
>              of the License, or (at your option) any later version.
>
>                                :-)  gmxcheck  (-:
>
> Option     Filename  Type         Description
> ------------------------------------------------------------
>   -f        md1.cpt  Input, Opt!  Trajectory: xtc trr trj gro g96 pdb cpt
>  -f2       traj.xtc  Input, Opt.  Trajectory: xtc trr trj gro g96 pdb cpt
>  -s1       top1.tpr  Input, Opt.  Run input file: tpr tpb tpa
>  -s2       top2.tpr  Input, Opt.  Run input file: tpr tpb tpa
>   -c      topol.tpr  Input, Opt.  Structure+mass(db): tpr tpb tpa gro g96 pdb
>   -e       ener.edr  Input, Opt.  Energy file: edr ene
>  -e2      ener2.edr  Input, Opt.  Energy file: edr ene
>   -n      index.ndx  Input, Opt.  Index file
>   -m        doc.tex  Output, Opt. LaTeX file
>
> Option       Type   Value   Description
> ------------------------------------------------------
> -[no]h       bool   no      Print help info and quit
> -nice        int    0       Set the nicelevel
> -vdwfac      real   0.8     Fraction of sum of VdW radii used as warning
>                             cutoff
> -bonlo       real   0.4     Min. fract. of sum of VdW radii for bonded atoms
> -bonhi       real   0.7     Max. fract. of sum of VdW radii for bonded atoms
> -tol         real   0.001   Relative tolerance for comparing real values
>                             defined as 2*(a-b)/(|a|+|b|)
> -[no]ab      bool   no      Compare the A and B topology from one file
> -lastener    string         Last energy term to compare (if not given all
>                             are tested). It makes sense to go up until the
>                             Pressure.
>
> Checking file md1.cpt
>
> -------------------------------------------------------
> Program gmxcheck, VERSION 4.0.5
> Source code file: checkpoint.c, line: 186
>
> Fatal error:
> Count mismatch for state entry SDx, code count is 754728, file count is 0
>
> -------------------------------------------------------
>
> "Confirmed" (Star Trek)
>
> ############################ and the same thing for the _prev.cpt file:
>
> # gmxcheck  -f md1_prev.cpt
>                          :-)  G  R  O  M  A  C  S  (-:
>
>                        GRowing Old MAkes el Chrono Sweat
>
>                             :-)  VERSION 4.0.5  (-:
>
>
>       Written by David van der Spoel, Erik Lindahl, Berk Hess, and others.
>        Copyright (c) 1991-2000, University of Groningen, The Netherlands.
>              Copyright (c) 2001-2008, The GROMACS development team,
>             check out http://www.gromacs.org for more information.
>
>          This program is free software; you can redistribute it and/or
>           modify it under the terms of the GNU General Public License
>          as published by the Free Software Foundation; either version 2
>              of the License, or (at your option) any later version.
>
>                                :-)  gmxcheck  (-:
>
> Option     Filename  Type         Description
> ------------------------------------------------------------
>   -f   md1_prev.cpt  Input, Opt!  Trajectory: xtc trr trj gro g96 pdb cpt
>  -f2       traj.xtc  Input, Opt.  Trajectory: xtc trr trj gro g96 pdb cpt
>  -s1       top1.tpr  Input, Opt.  Run input file: tpr tpb tpa
>  -s2       top2.tpr  Input, Opt.  Run input file: tpr tpb tpa
>   -c      topol.tpr  Input, Opt.  Structure+mass(db): tpr tpb tpa gro g96 pdb
>   -e       ener.edr  Input, Opt.  Energy file: edr ene
>  -e2      ener2.edr  Input, Opt.  Energy file: edr ene
>   -n      index.ndx  Input, Opt.  Index file
>   -m        doc.tex  Output, Opt. LaTeX file
>
> Option       Type   Value   Description
> ------------------------------------------------------
> -[no]h       bool   no      Print help info and quit
> -nice        int    0       Set the nicelevel
> -vdwfac      real   0.8     Fraction of sum of VdW radii used as warning
>                             cutoff
> -bonlo       real   0.4     Min. fract. of sum of VdW radii for bonded atoms
> -bonhi       real   0.7     Max. fract. of sum of VdW radii for bonded atoms
> -tol         real   0.001   Relative tolerance for comparing real values
>                             defined as 2*(a-b)/(|a|+|b|)
> -[no]ab      bool   no      Compare the A and B topology from one file
> -lastener    string         Last energy term to compare (if not given all
>                             are tested). It makes sense to go up until the
>                             Pressure.
>
> Checking file md1_prev.cpt
>
> -------------------------------------------------------
> Program gmxcheck, VERSION 4.0.5
> Source code file: checkpoint.c, line: 186
>
> Fatal error:
> Count mismatch for state entry SDx, code count is 754728, file count is 0
>
> -------------------------------------------------------
>
> "I'm Only Faking When I Get It Right" (Soundgarden)
>
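
The report above describes both the current and the _prev checkpoint ending up unreadable. One possible user-side safeguard is to validate each checkpoint after a run segment and keep a separate copy only when it passes. The following is a minimal sketch of that idea, not part of GROMACS: the names `gmxcheck_ok` and `keep_known_good` and the `.good` suffix are hypothetical, and it assumes that `gmxcheck -f` exits with a non-zero status when it hits a fatal error like the one shown above.

```python
import shutil
import subprocess
from pathlib import Path


def gmxcheck_ok(cpt_path):
    """Return True if `gmxcheck -f <file>` exits with status 0.

    Assumption: gmxcheck exits non-zero after a fatal error such as
    "Count mismatch for state entry". Verify this for your version.
    """
    result = subprocess.run(
        ["gmxcheck", "-f", str(cpt_path)],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def keep_known_good(cpt_path, is_valid=gmxcheck_ok, suffix=".good"):
    """Copy cpt_path to cpt_path + suffix only if it validates.

    Returns the backup path on success. Returns None if the file is
    missing or fails validation; any earlier backup is left untouched.
    """
    cpt_path = Path(cpt_path)
    if not cpt_path.is_file() or not is_valid(cpt_path):
        return None
    backup = cpt_path.with_name(cpt_path.name + suffix)
    tmp = backup.with_name(backup.name + ".tmp")
    # Copy to a temporary name first so that an interrupted copy
    # cannot clobber the previous known-good backup.
    shutil.copy2(cpt_path, tmp)
    # Rename over the old backup (atomic on POSIX within one filesystem).
    tmp.replace(backup)
    return backup
```

Calling `keep_known_good("md1.cpt")` after each run segment would leave `md1.cpt.good` pointing at the last checkpoint that passed the check, independent of whatever mdrun does with `md1_prev.cpt`. The validator is passed in as a parameter so that a different check can be substituted if the exit-status assumption does not hold.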



