[gmx-users] replica restart from checkpoints

Massimiliano Bonomi massimiliano.bonomi at gmail.com
Fri Feb 20 10:47:21 CET 2009


On Feb 20, 2009, at 10:07 AM, Berk Hess wrote:

> Hi,
>
> I guess that actually the -maxh procedure might be the problem in your
> case.
> If all replicas stop correctly after -maxh, they will all be between the
> same exchange events, so it should work.
> The only issue I can see is that one (or more) replica reaches an
> exchange attempt step early and waits for communication, while the
> others are late and get stopped by -maxh.
> Have you checked that the simulation terminated properly?

These are the last lines of output in one of the md.log files:

Step 4834163: Run time exceeded 23.760 hours, will terminate the run
            Step           Time         Lambda
         4834164     9668.32800        0.00000

No checkpoints are created after this point.
The same holds for all the other replicas.
Is this a correct stop, or should the code have written a "final"
checkpoint before stopping?

PS: simulations are in the NVT ensemble...

Massimiliano

>
>
> If this is the case, currently the only solution is not to use -maxh,
> but to make tpr files with nsteps short enough to finish in time, then
> use tpbconv to extend the tpr files (without trajectory and energy),
> and then run mdrun -cpi.
>
> Berk
>
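The workaround quoted above (picking nsteps so that every replica finishes inside the queue limit) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the safety factor, the throughput, and the exchange interval are all hypothetical, and rounding nsteps down to a multiple of the exchange interval is an assumption made here so that all replicas would stop on the same exchange step. The 0.002 ps timestep is the one implied by the md.log excerpt earlier in this message.

```python
def nsteps_for_walltime(steps_per_hour, wall_hours, exchange_interval, safety=0.9):
    """Largest multiple of exchange_interval expected to finish within
    safety * wall_hours at the given throughput (illustrative only)."""
    if steps_per_hour <= 0 or wall_hours <= 0 or exchange_interval <= 0:
        raise ValueError("all inputs must be positive")
    # Step budget for the run, with some headroom for slow nodes.
    budget = int(steps_per_hour * wall_hours * safety)
    # Round down so the run ends exactly on an exchange attempt step.
    return (budget // exchange_interval) * exchange_interval


# Timestep implied by the md.log excerpt: 9668.328 ps at step 4834164.
dt_ps = 9668.328 / 4834164

# Hypothetical inputs: 200000 steps/hour on the slowest replica,
# a 24 h queue limit, and exchange attempts every 1000 steps.
nsteps = nsteps_for_walltime(200_000, 24.0, 1000)
print(f"dt = {dt_ps:.3f} ps, nsteps = {nsteps}, run length = {nsteps * dt_ps:.1f} ps")
```

The throughput should be measured on the slowest replica, since the replicas wait for each other at every exchange attempt.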
> From: massimiliano.bonomi at gmail.com
> To: gmx-users at gromacs.org
> Subject: Re: [gmx-users] replica restart from checkpoints
> Date: Thu, 19 Feb 2009 22:47:23 +0100
>
> Thanks for your reply...
>
> > Which version are you using?
> > In 4.0.3 I made things slightly better by allowing checkpoints
> > to have different step numbers, as long as they fall between
> > the same exchange attempt steps.
>
> I'm using 4.0.3. Same problem with the earlier 4.0.x versions.
>
> > This could still cause problems when the steps in the checkpoints
> > differ very much. But if you use -maxh all simulations should finish
> > close to each other.
>
> Actually I'm using -maxh!
>
> > (you can always go back to using tpbconv)
>
> Unfortunately I have no trr files, but just xtc with only solute...
>
> > Synchronizing the checkpoint writing is a bit complicated
> > and will probably only be done in 4.1.
>
> Is it not possible to define the writing stride in terms of MD steps?
>
> Thanks again,
> Massimiliano
>
> > Berk
>
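The condition Berk describes for 4.0.3 (checkpoints may differ in step number as long as they fall between the same exchange attempt steps) can be checked by hand before a restart. Below is a minimal Python sketch of that check. It is not GROMACS code: the function name and the step numbers are hypothetical, and it assumes exchange attempts happen at multiples of the exchange interval.

```python
def same_exchange_window(checkpoint_steps, exchange_interval):
    """True if all steps lie between the same pair of exchange attempts.

    Assumes exchange attempts occur at multiples of exchange_interval,
    so the window of a step is step // exchange_interval. A step that
    sits exactly on a multiple is counted in the window it opens."""
    if exchange_interval <= 0:
        raise ValueError("exchange_interval must be positive")
    windows = {step // exchange_interval for step in checkpoint_steps}
    return len(windows) <= 1


# Hypothetical checkpoint steps for four replicas, exchange every 1000 steps.
consistent = same_exchange_window([4834164, 4834020, 4834900, 4834501], 1000)
# Here one replica is still in the previous window (step 4833990).
mismatched = same_exchange_window([4834164, 4833990, 4834900, 4834501], 1000)
print(consistent, mismatched)
```

The step numbers of the real checkpoints would have to be read from the checkpoint or log files of each replica; that part is left out of the sketch.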
> > From: massimiliano.bonomi at gmail.com
> > To: gmx-users at gromacs.org
> > Date: Thu, 19 Feb 2009 20:14:15 +0100
> > Subject: [gmx-users] replica restart from checkpoints
> >
> > Dear Gromacs Users,
> >
> > I'm experiencing some problems when restarting a replica exchange run
> > from previous checkpoint files.
> > It often happens that the number of MD steps done in the previous run
> > is not the same for all the replicas. If this is the case, the program
> > stops.
> > This may happen because checkpoints are written with a stride expressed
> > in REAL time (every 15 minutes), and replicas on different processors
> > may have run for a different number of steps in the same amount of time.
> >
> > Is it possible to specify the checkpoint writing stride in number of
> > steps instead of real time?
> >
> > Regards,
> > Massimiliano Bonomi
> > _______________________________________________
> > gmx-users mailing list gmx-users at gromacs.org
> > http://www.gromacs.org/mailman/listinfo/gmx-users
> > Please search the archive at http://www.gromacs.org/search before posting!
> > Please don't post (un)subscribe requests to the list. Use the
> > www interface or send it to gmx-users-request at gromacs.org.
> > Can't post? Read http://www.gromacs.org/mailing_lists/users.php
>
