[gmx-users] restart issues with Gromacs

Nash, Anthony a.nash at ucl.ac.uk
Sun Nov 15 23:19:37 CET 2015


Thanks Mark,

OK, so it sounds very much like it's on the cluster side. I'll fire this
across to one of the sys admins and see if they can find out what the
problem is, although I have no idea what, if anything, they will find.
From the looks of it, the checkpoint file data was read fine
(otherwise I would expect an MD5 hash error, if my memory serves me right).
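
For context, the system call Mark alludes to is presumably fsync on POSIX
systems. A minimal Python sketch of a write that requests a flush to stable
storage, followed by an MD5 check of what was written, might look like the
following. It is not GROMACS code; the names durable_write and md5_of are
hypothetical and only illustrate the idea.

```python
import hashlib
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and ask the OS to flush it to stable storage.

    Hypothetical helper for illustration; not part of GROMACS.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory so the final
    # rename stays on one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()             # push Python's buffer to the OS
            os.fsync(handle.fileno())  # ask the OS to push it to disk
        os.replace(tmp_path, path)     # atomic rename on POSIX
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def md5_of(path: str) -> str:
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Note that if the file system accepts the fsync request without honouring it,
a check like this can still pass while the data is not yet on disk, which is
exactly the situation Mark describes.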

Thanks again
Anthony

Dr Anthony Nash
Department of Chemistry
University College London





On 15/11/2015 22:12, Mark Abraham <mark.j.abraham at gmail.com> wrote:

>Hi,
>
>GROMACS assumes that the file system actually writes to disk when mdrun
>makes the system call that requests it, and works correctly if so. But if the
>file system or its configuration doesn't actually do that (for "performance" or
>erroneous reasons), then all bets are off. mdrun can't even know if it's
>being lied to, because, well, it's being lied to...
>
>Mark
>
>On Sun, 15 Nov 2015 22:30 Nash, Anthony <a.nash at ucl.ac.uk> wrote:
>
>> Hi all,
>>
>> Running GROMACS 5.0.4 with PLUMED 2.2 on a cluster; the number of ranks
>> (MPI processes) is 24. The simulation ran successfully for the maximum
>> cluster wall time (48 hours).
>>
>> I attempted to restart the simulation using the following command (with a
>> Sun Grid Engine submission script):
>>
>> gerun mdrun_mpi_d -deffnm neu_mut_meta_K -cpi neu_mut_meta_K.cpt -noappend -plumed
>>
>>
>> However, whilst the queue is telling me that the job is running, the
>> *.part0002.log file seems stuck at:
>>
>> -----------------------------------
>> When dynamic load balancing gets turned on, these settings will change to:
>> The maximum number of communication pulses is: X 1 Y 1
>> The minimum size for domain decomposition cells is 1.025 nm
>> The requested allowed shrink of DD cells (option -dds) is: 0.80
>> The allowed shrink of domain decomposition cells is: X 0.43 Y 0.56
>> The maximum allowed distance for charge groups involved in interactions is:
>>                  non-bonded interactions           1.025 nm
>>             two-body bonded interactions  (-rdd)   1.025 nm
>>           multi-body bonded interactions  (-rdd)   1.025 nm
>>   atoms separated by up to 5 constraints  (-rcon)  1.025 nm
>> ------------------------------------
>>
>>
>> The cluster error and output files (not GROMACS files) contain no
>> warnings or errors. The GROMACS log file contains no warnings or errors.
>>
>> I have seen this behaviour quite a number of times, going back to early
>> versions of GROMACS 4 around late 2010 (I think). I got into the habit of
>> a) backing up the .cpt files, and b) always using the -noappend option
>> to preserve the .trr file. Has there ever been an explanation as to why
>> this happens?
>>
>> Many thanks
>> Anthony
>>
>> --
>> Gromacs Users mailing list
>>
>> * Please search the archive at
>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
>> posting!
>>
>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>>
>> * For (un)subscribe requests visit
>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
>> send a mail to gmx-users-request at gromacs.org.
>>


