[gmx-users] Build time/Build user mismatch, fatal error truncation of file *.xtc failed

Mark Abraham mark.j.abraham at gmail.com
Thu Jun 16 09:32:49 CEST 2016


Hi,

On Thu, Jun 16, 2016 at 9:30 AM Husen R <hus3nr at gmail.com> wrote:

> Hi,
>
> Thank you for your reply!
>
> md_test.xtc exists and is writable.
>

OK, but it needs to be seen that way from the set of compute nodes you are
using, and organizing that is up to you and your job scheduler, etc.
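
As a minimal sketch of how one might check this, the helper below reports
whether a file exists and is writable from whichever host runs it. The
function name check_output_file is hypothetical, and running it once per
allocated node (for instance through srun) is an assumption about your Slurm
setup, not something mdrun does for you.

```shell
#!/bin/bash
# Hypothetical helper (not part of GROMACS or Slurm): report whether a file
# is visible and writable from the host that executes this function.
check_output_file() {
    local f="$1"
    local host
    host="$(uname -n)"              # POSIX way to get the node name
    if [ ! -e "$f" ]; then
        echo "$host: $f missing"
        return 1
    fi
    if [ ! -w "$f" ]; then
        echo "$host: $f not writable"
        return 2
    fi
    echo "$host: $f ok"
}
```

Calling check_output_file md_test.xtc on every node of the allocation should
print one "ok" line per host. A "missing" or "not writable" line identifies a
node whose view of the file system differs from the others.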


> I tried restarting from the checkpoint file while excluding a node other
> than compute-node, and it works.
>

Go do that, then :-)


> Only '--exclude=compute-node' produces this error.
>

Then there's something about that node that is special with respect to the
file system - there's nothing about any particular node that GROMACS cares
about.
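
One way to look for such a difference, sketched under the assumption that df
is available on every node, is to print which file system backs the working
directory on each host and compare the lines. The name describe_workdir_fs
is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the host name plus the file system entry that
# backs a directory (default: the current directory).
describe_workdir_fs() {
    local dir="${1:-.}"
    local entry
    # df -P prints a header plus one line per file system; keep the last line.
    entry="$(df -P "$dir" 2>/dev/null | tail -n 1)"
    if [ -z "$entry" ]; then
        echo "$(uname -n): cannot inspect $dir"
        return 1
    fi
    echo "$(uname -n): $entry"
}
```

If the line printed on one node names a different device or mount point than
the lines from the other nodes, the simulation directory is probably not the
same shared location everywhere.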

Mark


> Is this the same issue as in this thread?
> http://comments.gmane.org/gmane.science.biology.gromacs.user/40984
>
> regards,
>
> Husen
>
> On Thu, Jun 16, 2016 at 2:20 PM, Mark Abraham <mark.j.abraham at gmail.com>
> wrote:
>
> > Hi,
> >
> > The stuff about different nodes or numbers of nodes doesn't matter - it's
> > merely an advisory note from mdrun. mdrun failed when it tried to operate
> > upon md_test.xtc, so perhaps you need to consider whether the file exists,
> > is writable, etc.
> >
> > Mark
> >
> > On Thu, Jun 16, 2016 at 6:48 AM Husen R <hus3nr at gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > I got the following error message when I tried to restart a GROMACS
> > > simulation from a checkpoint file.
> > > I restarted the simulation using fewer nodes and processes, and I also
> > > excluded one node using the '--exclude=' option (in Slurm) for
> > > experimental purposes.
> > >
> > > I'm sure that using fewer nodes and processes is not the cause of this
> > > error, as I have already tested that.
> > > I have verified that the cause of this error is the use of '--exclude='.
> > > I excluded one node named 'compute-node' when restarting from the
> > > checkpoint (in the first run, I used all nodes, including 'compute-node').
> > >
> > >
> > > It seems that in the first run, the job submission script was built on
> > > compute-node. So, at restart, the build user mismatch appeared because
> > > compute-node was not found (it was excluded).
> > >
> > > Am I right? Is this behavior normal?
> > > Or is there a way to avoid this, so that I can freely restart from a
> > > checkpoint using any nodes without limitation?
> > >
> > > Thank you in advance.
> > >
> > > Regards,
> > >
> > >
> > > Husen
> > >
> > > ==========================restart script=================
> > > #!/bin/bash
> > > #SBATCH -J ayo
> > > #SBATCH -o md%j.out
> > > #SBATCH -A necis
> > > #SBATCH -N 2
> > > #SBATCH -n 16
> > > #SBATCH --exclude=compute-node
> > > #SBATCH --time=144:00:00
> > > #SBATCH --mail-user=hus3nr at gmail.com
> > > #SBATCH --mail-type=begin
> > > #SBATCH --mail-type=end
> > >
> > > mpirun gmx_mpi mdrun -cpi md_test.cpt -deffnm md_test
> > > =====================================================
> > >
> > >
> > >
> > >
> > > ==================================output error========================
> > > Reading checkpoint file md_test.cpt generated: Wed Jun 15 16:30:44 2016
> > >
> > >
> > >   Build time mismatch,
> > >     current program: Sel Apr  5 13:37:32 WIB 2016
> > >     checkpoint file: Rab Apr  6 09:44:51 WIB 2016
> > >
> > >   Build user mismatch,
> > >     current program: pro at head-node [CMAKE]
> > >     checkpoint file: pro at compute-node [CMAKE]
> > >
> > >   #ranks mismatch,
> > >     current program: 16
> > >     checkpoint file: 24
> > >
> > >   #PME-ranks mismatch,
> > >     current program: -1
> > >     checkpoint file: 6
> > >
> > > GROMACS patchlevel, binary or parallel settings differ from previous run.
> > > Continuation is exact, but not guaranteed to be binary identical.
> > >
> > >
> > > -------------------------------------------------------
> > > Program gmx mdrun, VERSION 5.1.2
> > > Source code file:
> > > /home/pro/gromacs-5.1.2/src/gromacs/gmxlib/checkpoint.cpp, line: 2216
> > >
> > > Fatal error:
> > > Truncation of file md_test.xtc failed. Cannot do appending because of this
> > > failure.
> > > For more information and tips for troubleshooting, please check the GROMACS
> > > website at http://www.gromacs.org/Documentation/Errors
> > > -------------------------------------------------------
> > > ================================================================
> > > --
> > > Gromacs Users mailing list
> > >
> > > * Please search the archive at
> > > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > > posting!
> > >
> > > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> > >
> > > * For (un)subscribe requests visit
> > > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > > send a mail to gmx-users-request at gromacs.org.
> > >

