[gmx-users] Restart simulation from checkpoint file with fewer nodes
Husen R
hus3nr at gmail.com
Mon May 16 11:20:02 CEST 2016
Hi all,
After some troubleshooting, I found that the GROMACS checkpoint/restart
feature works well.
The failure occurred because I submitted the restart job as the root user
(using the Slurm resource manager). After switching to a non-root user, the
restart runs.
I was using the root user because I run this job from a bash script that
Cron executes at a designated time.
I know this is not the right place to talk about Slurm.
Thank you for your replies!
Regards,
Husen
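
A minimal sketch of the restart discussed in this thread, written in Python.
It assumes an MPI build of GROMACS invoked as gmx_mpi through mpirun, and run
files that share the default name "md" (so the checkpoint is md.cpt). Those
names and both helper functions are hypothetical illustrations, not details
taken from the thread. Only the mdrun options -cpi and -deffnm are standard
GROMACS flags.

```python
def build_restart_command(n_ranks, deffnm="md", launcher="mpirun", mdrun="gmx_mpi"):
    """Return the argument list for continuing a run from its checkpoint.

    n_ranks may differ from the original run, e.g. 2 ranks after a run that
    started on 3; mdrun is expected to note that the continuation is not exact.
    """
    if n_ranks < 1:
        raise ValueError("n_ranks must be at least 1")
    return [
        launcher, "-np", str(n_ranks),
        mdrun, "mdrun",
        "-deffnm", deffnm,        # default name shared by the run files (assumed)
        "-cpi", f"{deffnm}.cpt",  # checkpoint file to continue from
    ]


def refuse_root(euid):
    """Raise if the given effective user id is root (0).

    Hypothetical guard: in this thread the restart only failed when the job
    was submitted as root, so a cron-driven script could bail out early.
    """
    if euid == 0:
        raise PermissionError("submit the restart job as a non-root user")
```

In a cron-driven script one could call refuse_root(os.geteuid()) first and
then pass the list from build_restart_command(2) to subprocess.run, or print
it into the batch script that is handed to the resource manager.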
On Sun, May 15, 2016 at 8:20 PM, <jkrieger at mrc-lmb.cam.ac.uk> wrote:
> ok thanks
>
> > Hi,
> >
> > Yes, that's one way to work around the problem. In some places, a module
> > subsystem can be used to take care of the selection automatically, but you
> > don't want to set one up for just you to use.
> >
> > Mark
> >
> > On Sun, May 15, 2016 at 11:48 AM <jkrieger at mrc-lmb.cam.ac.uk> wrote:
> >
> >> Thanks Mark,
> >>
> >> My sysadmins have let me install my own GROMACS versions and have not
> >> informed me of any such mechanism. Would you suggest I qrsh into a node of
> >> each type and build an mdrun-only version on each? I'd then select a
> >> particular node type for a submit script with the relevant mdrun.
> >>
> >> Many thanks
> >> James
> >>
> >> > Hi,
> >> >
> >> > On Sat, May 14, 2016 at 1:09 PM <jkrieger at mrc-lmb.cam.ac.uk> wrote:
> >> >
> >> >> In case it's relevant/interesting to anyone, here are the details on our
> >> >> cluster nodes:
> >> >>
> >> >> nodes            #    model          # cores  cpu model                RAM   node_type
> >> >> fmb01 - fmb33    33   IBM HS21XM     8        3 GHz Xeon E5450         16GB  hs21
> >> >> fmb34 - fmb42    9    IBM HS22       8        2.4 GHz Xeon E5530       16GB  hs22
> >> >> fmb43 - fmb88    45   Dell PE M610   8        2.4 GHz Xeon E5530       16GB  m610
> >> >> fmb88 - fmb90    3    Dell PE M610+  12       3.4 GHz Xeon X5690       48GB  m610+
> >> >> fmb91 - fmb202   112  Dell PE M620   24 (HT)  2.9 GHz Xeon E5-2667     64GB  m620
> >> >> fmb203 - fmb279  77   Dell PE M620   24 (HT)  3.5 GHz Xeon E5-2643 v2  64GB  m620+
> >> >> fmb280 - fmb359  80   Dell PE M630   24 (HT)  3.4 GHz Xeon E5-2643 v3  64GB  m630
> >> >>
> >> >> I could only run GROMACS 4.6.2 on the last three node types and I believe
> >> >> the same is true for 5.0.4.
> >> >>
> >> >
> >> > Sure. GROMACS is designed to target whichever hardware was selected at
> >> > configure time, which your sysadmins for such a heterogeneous cluster
> >> > should have documented somewhere. They should also be making available to
> >> > you a mechanism to target your jobs to nodes where they can run programs
> >> > that use the hardware efficiently, or providing GROMACS installations that
> >> > work regardless of which node you are actually on. You might like to
> >> > respectfully remind them of the things we say at
> >> > http://manual.gromacs.org/documentation/5.1.2/install-guide/index.html#portability-aspects
> >> > (These thoughts are common to earlier versions also.)
> >> >
> >> > Mark
> >> >
> >> >
> >> >> Best wishes
> >> >> James
> >> >>
> >> >> > I have found that only some kinds of nodes on our cluster work for GROMACS
> >> >> > 4.6 (the ones we call m620, m620+ and m630 but not others - I can check
> >> >> > the details tomorrow). I haven't tested it again now that I'm using 5.0, so
> >> >> > I don't know if that's still an issue, but if it is, it could explain why
> >> >> > your restart failed even though the initial run didn't.
> >> >> >
> >> >> >> thanks a lot for your fast response.
> >> >> >>
> >> >> >> I have tried it, and it failed. I asked in this forum just to make sure.
> >> >> >> However, there was something in my cluster that probably made it fail.
> >> >> >> I'll handle that first and then retry the restart.
> >> >> >>
> >> >> >> Regards,
> >> >> >>
> >> >> >> Husen
> >> >> >>
> >> >> >> On Sat, May 14, 2016 at 7:58 AM, Justin Lemkul <jalemkul at vt.edu> wrote:
> >> >> >>
> >> >> >>>
> >> >> >>>
> >> >> >>> On 5/13/16 8:53 PM, Husen R wrote:
> >> >> >>>
> >> >> >>>> Dear all
> >> >> >>>>
> >> >> >>>> Can a simulation be restarted from a checkpoint file with fewer nodes?
> >> >> >>>> Let's say I first run a simulation on 3 nodes. While it is running,
> >> >> >>>> one of those nodes crashes and the simulation is terminated.
> >> >> >>>>
> >> >> >>>> I want to restart that simulation immediately from the checkpoint file
> >> >> >>>> on the remaining 2 nodes. Does GROMACS support such a case?
> >> >> >>>> I need help.
> >> >> >>>>
> >> >> >>>
> >> >> >>> Have you tried it? It should work. You will probably get a note about
> >> >> >>> the continuation not being exact due to a change in the number of cores,
> >> >> >>> but the run should proceed fine.
> >> >> >>>
> >> >> >>> -Justin
> >> >> >>>
> >> >> >>> --
> >> >> >>> ==================================================
> >> >> >>>
> >> >> >>> Justin A. Lemkul, Ph.D.
> >> >> >>> Ruth L. Kirschstein NRSA Postdoctoral Fellow
> >> >> >>>
> >> >> >>> Department of Pharmaceutical Sciences
> >> >> >>> School of Pharmacy
> >> >> >>> Health Sciences Facility II, Room 629
> >> >> >>> University of Maryland, Baltimore
> >> >> >>> 20 Penn St.
> >> >> >>> Baltimore, MD 21201
> >> >> >>>
> >> >> >>> jalemkul at outerbanks.umaryland.edu | (410) 706-7441
> >> >> >>> http://mackerell.umaryland.edu/~jalemkul
> >> >> >>>
> >> >> >>> ==================================================
> >> >> >>> --
> >> >> >>> Gromacs Users mailing list
> >> >> >>>
> >> >> >>> * Please search the archive at
> >> >> >>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> >> >> >>> posting!
> >> >> >>>
> >> >> >>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >> >> >>>
> >> >> >>> * For (un)subscribe requests visit
> >> >> >>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> >> >> >>> send a mail to gmx-users-request at gromacs.org.
> >> >> >>>