[gmx-users] Build time/Build user mismatch, fatal error truncation of file *.xtc failed

Mark Abraham mark.j.abraham at gmail.com
Thu Jun 16 11:01:26 CEST 2016


Hi,

There's just nothing special about any node at run time.

Your script looks like it is building GROMACS fresh each time. There's no
need to do that, but the node name that shows up in the check made when the
checkpoint is read is not relevant to the problem.
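
As for md_test.xtc needing to be visible and writable from every compute
node in the job, here is a minimal sketch of one way to check that from
inside the job script. The function name check_output_file is hypothetical,
and the srun line assumes a Slurm allocation whose working directory is the
one that holds md_test.xtc. Passing this check does not rule out other
file-system problems; it only covers existence and write permission.

```shell
#!/bin/bash
# Hypothetical helper: report whether a file exists and is writable
# as seen from the host this runs on. Returns non-zero on a problem.
check_output_file() {
    local f="$1"
    local host
    host="$(uname -n)"
    if [ ! -e "$f" ]; then
        echo "$host: MISSING $f"
        return 1
    elif [ ! -w "$f" ]; then
        echo "$host: NOT WRITABLE $f"
        return 1
    fi
    echo "$host: ok $f"
}

# Inside a Slurm job, run the check once on every allocated node.
# Outside an allocation (no SLURM_JOB_ID) this step is skipped.
if [ -n "${SLURM_JOB_ID:-}" ] && command -v srun >/dev/null 2>&1; then
    srun --nodes="$SLURM_JOB_NUM_NODES" --ntasks="$SLURM_JOB_NUM_NODES" \
         --ntasks-per-node=1 \
         bash -c "$(declare -f check_output_file); check_output_file md_test.xtc"
fi
```

Placed before the mpirun line of the restart script, this prints one line
per node, so a node that sees the file differently stands out.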

Mark

On Thu, Jun 16, 2016 at 9:46 AM Husen R <hus3nr at gmail.com> wrote:

> On Thu, Jun 16, 2016 at 2:32 PM, Mark Abraham <mark.j.abraham at gmail.com>
> wrote:
>
> > Hi,
> >
> > On Thu, Jun 16, 2016 at 9:30 AM Husen R <hus3nr at gmail.com> wrote:
> >
> > > Hi,
> > >
> > > Thank you for your reply !
> > >
> > > md_test.xtc exists and is writable.
> > >
> >
> > OK, but it needs to be seen that way from the set of compute nodes you are
> > using, and organizing that is up to you and your job scheduler, etc.
> >
> >
> > > I tried restarting from the checkpoint file while excluding a node other
> > > than compute-node, and it works.
> > >
> >
> > Go do that, then :-)
> >
>
> I'm building a simple system that can respond to node failure. If a failure
> occurs on node A, then the application has to be restarted and that node
> has to be excluded.
> This should apply to all nodes, including this 'compute-node'.
>
> >
> >
> > > Only '--exclude=compute-node' produces this error.
> > >
> >
> > Then there's something about that node that is special with respect to the
> > file system - there's nothing about any particular node that GROMACS cares
> > about.
> >
>
> > Mark
> >
> >
> > > Is this the same issue as in this thread?
> > > http://comments.gmane.org/gmane.science.biology.gromacs.user/40984
> > >
> > > regards,
> > >
> > > Husen
> > >
> > > On Thu, Jun 16, 2016 at 2:20 PM, Mark Abraham <mark.j.abraham at gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > The stuff about different nodes or numbers of nodes doesn't matter - it's
> > > > merely an advisory note from mdrun. mdrun failed when it tried to operate
> > > > upon md_test.xtc, so perhaps you need to consider whether the file exists,
> > > > is writable, etc.
> > > >
> > > > Mark
> > > >
> > > > On Thu, Jun 16, 2016 at 6:48 AM Husen R <hus3nr at gmail.com> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I got the following error message when I tried to restart a GROMACS
> > > > > simulation from a checkpoint file.
> > > > > I restarted the simulation using fewer nodes and processes, and I also
> > > > > excluded one node using the '--exclude=' option (in Slurm) for
> > > > > experimental purposes.
> > > > >
> > > > > I'm sure that fewer nodes and processes are not the cause of this error,
> > > > > as I have already tested that.
> > > > > I have checked that the cause of this error is the '--exclude=' usage. I
> > > > > excluded one node named 'compute-node' when restarting from the
> > > > > checkpoint (in the first run, I used all nodes, including 'compute-node').
> > > > >
> > > > >
> > > > > It seems that in the first run, the submit job script was built on
> > > > > compute-node. So, at restart, the build user mismatch appeared because
> > > > > compute-node was not found (excluded).
> > > > >
> > > > > Am I right? Is this behavior normal?
> > > > > Or is there a way to avoid this, so that I can freely restart from a
> > > > > checkpoint using any nodes without limitation?
> > > > >
> > > > > Thank you in advance.
> > > > >
> > > > > Regards,
> > > > >
> > > > >
> > > > > Husen
> > > > >
> > > > > ==========================restart script=================
> > > > > #!/bin/bash
> > > > > #SBATCH -J ayo
> > > > > #SBATCH -o md%j.out
> > > > > #SBATCH -A necis
> > > > > #SBATCH -N 2
> > > > > #SBATCH -n 16
> > > > > #SBATCH --exclude=compute-node
> > > > > #SBATCH --time=144:00:00
> > > > > #SBATCH --mail-user=hus3nr at gmail.com
> > > > > #SBATCH --mail-type=begin
> > > > > #SBATCH --mail-type=end
> > > > >
> > > > > mpirun gmx_mpi mdrun -cpi md_test.cpt -deffnm md_test
> > > > > =====================================================
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > ==================================output error========================
> > > > > Reading checkpoint file md_test.cpt generated: Wed Jun 15 16:30:44 2016
> > > > >
> > > > >
> > > > >   Build time mismatch,
> > > > >     current program: Sel Apr  5 13:37:32 WIB 2016
> > > > >     checkpoint file: Rab Apr  6 09:44:51 WIB 2016
> > > > >
> > > > >   Build user mismatch,
> > > > >     current program: pro at head-node [CMAKE]
> > > > >     checkpoint file: pro at compute-node [CMAKE]
> > > > >
> > > > >   #ranks mismatch,
> > > > >     current program: 16
> > > > >     checkpoint file: 24
> > > > >
> > > > >   #PME-ranks mismatch,
> > > > >     current program: -1
> > > > >     checkpoint file: 6
> > > > >
> > > > > GROMACS patchlevel, binary or parallel settings differ from previous run.
> > > > > Continuation is exact, but not guaranteed to be binary identical.
> > > > >
> > > > >
> > > > > -------------------------------------------------------
> > > > > Program gmx mdrun, VERSION 5.1.2
> > > > > Source code file:
> > > > > /home/pro/gromacs-5.1.2/src/gromacs/gmxlib/checkpoint.cpp, line: 2216
> > > > >
> > > > > Fatal error:
> > > > > Truncation of file md_test.xtc failed. Cannot do appending because of this
> > > > > failure.
> > > > > For more information and tips for troubleshooting, please check the GROMACS
> > > > > website at http://www.gromacs.org/Documentation/Errors
> > > > > -------------------------------------------------------
> > > > > ================================================================
> > > > > --
> > > > > Gromacs Users mailing list
> > > > >
> > > > > * Please search the archive at
> > > > > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > > > > posting!
> > > > >
> > > > > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> > > > >
> > > > > * For (un)subscribe requests visit
> > > > > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > > > > send a mail to gmx-users-request at gromacs.org.
> > > > >

