[gmx-users] Restarting crashed simulation
Mark Abraham
mark.j.abraham at gmail.com
Sat Nov 18 23:55:27 CET 2017
Hi,
It looks like you made a typo in the checkpoint file name (the default is
state.cpt, but your command passed stat.cpt), so mdrun could not find a
checkpoint and started a new run. You may also have multiple mdrun processes
running, in which case the actual output is in one of the backup files
labelled with # characters.
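
As a rough sketch (assuming the checkpoint really was written with the
default name state.cpt in the working directory; adjust the file names to
whatever your original run produced):

  # restart from the checkpoint (mdrun appends to the existing output
  # files by default)
  mpirun -np 64 mdrun_mpi -s md.tpr -cpi state.cpt

  # see whether earlier output was backed up by another mdrun process
  ls -l '#'*'#'

If the intended checkpoint really is called stat.cpt, check that it exists
in the directory where mdrun was started; the "No checkpoint file found"
warning in your log means mdrun could not open it and therefore began a
fresh run.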
Mark
On Fri, 17 Nov 2017 19:37 Ali Ahmed <aa5635737 at gmail.com> wrote:
> Hello GROMACS users
> My MD simulation crashed, so I restarted it from the last checkpoint that
> was written, using this command on 64 processors:
> mpirun -np 64 mdrun_mpi -s md.tpr -cpi stat.cpt
>
> After a few days I got no output files in the folder, such as output.gro,
> and I got the following:
> _______________________________________________
> Command line:
> mdrun_mpi -s md.tpr -cpi stat.cpt
>
> Warning: No checkpoint file found with -cpi option. Assuming this is a new
> run.
>
>
> Back Off! I just backed up md.log to ./#md.log.2#
>
> Running on 4 nodes with total 64 cores, 64 logical cores
> Cores per node: 16
> Logical cores per node: 16
> Hardware detected on host compute-2-27.local (the node of MPI rank 0):
> CPU info:
> Vendor: Intel
> Brand: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
> SIMD instructions most likely to fit this hardware: AVX_256
> SIMD instructions selected at GROMACS compile time: AVX_256
>
> Hardware topology: Basic
>
> Reading file md.tpr, VERSION 2016.3 (single precision)
> Changing nstlist from 10 to 40, rlist from 1 to 1.003
>
> Will use 48 particle-particle and 16 PME only ranks
> This is a guess, check the performance at the end of the log file
> Using 64 MPI processes
> Using 1 OpenMP thread per MPI process
>
> Non-default thread affinity set probably by the OpenMP library,
> disabling internal thread affinity
> WARNING: This run will generate roughly 50657 Mb of data
>
> starting mdrun 'Molecular Dynamics'
> 25000000 steps, 50000.0 ps.
>
> step 888000 Turning on dynamic load balancing, because the performance loss
> due to load imbalance is 8.7 %.
> step 930400 Turning off dynamic load balancing, because it is degrading
> performance.
> step 1328000 Turning on dynamic load balancing, because the performance
> loss due to load imbalance is 3.4 %.
> step 1328800 Turning off dynamic load balancing, because it is degrading
> performance.
> step 1336000 Turning on dynamic load balancing, because the performance
> loss due to load imbalance is 3.4 %.
> step 1338400 Turning off dynamic load balancing, because it is degrading
> performance.
> step 1340000 Will no longer try dynamic load balancing, as it degraded
> performance.
> Writing final coordinates.
> Average load imbalance: 13.2 %
> Part of the total run time spent waiting due to load imbalance: 7.5 %
> Average PME mesh/force load: 1.077
> Part of the total run time spent waiting due to PP/PME imbalance: 4.1 %
>
> NOTE: 7.5 % of the available CPU time was lost due to load imbalance
> in the domain decomposition.
> You might want to use dynamic load balancing (option -dlb.)
>
>
>                Core t (s)   Wall t (s)        (%)
>        Time: 26331875.601   411435.556     6400.0
>                          4d18h17:15
>                  (ns/day)    (hour/ns)
> Performance:       10.500        2.286
> _____________________________________________________________
>
> Any advice or suggestion would be helpful.
>
> Thanks in advance
>