[gmx-users] Restarting crashed simulation
Ali Ahmed
aa5635737 at gmail.com
Fri Nov 17 19:37:06 CET 2017
Hello GROMACS users,
My MD simulation crashed, so I restarted it from the last written
checkpoint on 64 processors using this command:
mpirun -np 64 mdrun_mpi -s md.tpr -cpi stat.cpt
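Before launching a restart like the one above, it can help to confirm that the file passed to -cpi actually exists, because mdrun falls back to a new run when it does not (see the warning at the top of the log). The following is a minimal sketch of such a guard, not part of GROMACS: check_checkpoint is a hypothetical helper name. mdrun writes its checkpoint to state.cpt unless -cpo says otherwise, so the helper also lists any .cpt files it finds next to the requested one. Whether stat.cpt here is a mistyped state.cpt is only a guess.

```shell
#!/bin/sh
# check_checkpoint (hypothetical helper, not a GROMACS tool):
# succeed only if the given checkpoint file exists and is non-empty.
check_checkpoint() {
    cpt="$1"
    if [ -s "$cpt" ]; then
        return 0
    fi
    echo "checkpoint '$cpt' is missing or empty; mdrun would start a new run" >&2
    # Show checkpoint files in the same directory to help spot a mistyped name.
    dir=$(dirname -- "$cpt")
    for f in "$dir"/*.cpt; do
        if [ -e "$f" ]; then
            echo "  existing checkpoint: $f" >&2
        fi
    done
    return 1
}

# Usage, left commented out so the sketch has no side effects:
# check_checkpoint stat.cpt && mpirun -np 64 mdrun_mpi -s md.tpr -cpi stat.cpt
```

With this guard in a job script, a wrong checkpoint name stops the job immediately instead of silently restarting from step 0.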
After a few days, there were no output files such as output.gro in the
folder, and I got the following:
_______________________________________________
Command line:
mdrun_mpi -s md.tpr -cpi stat.cpt
Warning: No checkpoint file found with -cpi option. Assuming this is a new
run.
Back Off! I just backed up md.log to ./#md.log.2#
Running on 4 nodes with total 64 cores, 64 logical cores
Cores per node: 16
Logical cores per node: 16
Hardware detected on host compute-2-27.local (the node of MPI rank 0):
CPU info:
Vendor: Intel
Brand: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
SIMD instructions most likely to fit this hardware: AVX_256
SIMD instructions selected at GROMACS compile time: AVX_256
Hardware topology: Basic
Reading file md.tpr, VERSION 2016.3 (single precision)
Changing nstlist from 10 to 40, rlist from 1 to 1.003
Will use 48 particle-particle and 16 PME only ranks
This is a guess, check the performance at the end of the log file
Using 64 MPI processes
Using 1 OpenMP thread per MPI process
Non-default thread affinity set probably by the OpenMP library,
disabling internal thread affinity
WARNING: This run will generate roughly 50657 Mb of data
starting mdrun 'Molecular Dynamics'
25000000 steps, 50000.0 ps.
step 888000 Turning on dynamic load balancing, because the performance loss
due to load imbalance is 8.7 %.
step 930400 Turning off dynamic load balancing, because it is degrading
performance.
step 1328000 Turning on dynamic load balancing, because the performance
loss due to load imbalance is 3.4 %.
step 1328800 Turning off dynamic load balancing, because it is degrading
performance.
step 1336000 Turning on dynamic load balancing, because the performance
loss due to load imbalance is 3.4 %.
step 1338400 Turning off dynamic load balancing, because it is degrading
performance.
step 1340000 Will no longer try dynamic load balancing, as it degraded
performance.
Writing final coordinates.
Average load imbalance: 13.2 %
Part of the total run time spent waiting due to load imbalance: 7.5 %
Average PME mesh/force load: 1.077
Part of the total run time spent waiting due to PP/PME imbalance: 4.1 %
NOTE: 7.5 % of the available CPU time was lost due to load imbalance
in the domain decomposition.
You might want to use dynamic load balancing (option -dlb.)
               Core t (s)   Wall t (s)        (%)
       Time: 26331875.601   411435.556     6400.0
                         4d18h17:15
                 (ns/day)    (hour/ns)
Performance:       10.500        2.286
_____________________________________________________________
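As a sanity check on the log above, the reported performance can be tied back to the step count and the wall time. The sketch below uses awk with hypothetical helper names (timestep_ps, ns_per_day, hours_per_ns) and values copied from the log. If the recomputed figures match the Performance line, the 4d18h of wall time covered the full 50000 ps, which is what a run started from step 0 would do.

```shell
#!/bin/sh
# Values copied from the mdrun log above.
steps=25000000       # total number of steps
sim_ps=50000.0       # total simulated time in ps
wall_s=411435.556    # wall-clock time in seconds

# Time step in ps: simulated time divided by number of steps.
timestep_ps() {
    LC_ALL=C awk -v ps="$1" -v n="$2" 'BEGIN { printf "%.3f\n", ps / n }'
}

# ns/day: simulated nanoseconds divided by wall-clock days.
ns_per_day() {
    LC_ALL=C awk -v ps="$1" -v wall="$2" \
        'BEGIN { printf "%.3f\n", (ps / 1000) / (wall / 86400) }'
}

# hour/ns: wall-clock hours divided by simulated nanoseconds.
hours_per_ns() {
    LC_ALL=C awk -v ps="$1" -v wall="$2" \
        'BEGIN { printf "%.3f\n", (wall / 3600) / (ps / 1000) }'
}

echo "time step (ps): $(timestep_ps "$sim_ps" "$steps")"
echo "ns/day:         $(ns_per_day "$sim_ps" "$wall_s")"
echo "hour/ns:        $(hours_per_ns "$sim_ps" "$wall_s")"
```

For these inputs the script prints 0.002, 10.500 and 2.286, in agreement with the Performance line in the log.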
Any advice or suggestions would be helpful.
Thanks in advance