[gmx-users] please guide me through the confusing gromacs results!!!
Justin Lemkul
jalemkul at vt.edu
Wed Feb 26 12:00:36 CET 2014
On 2/26/14, 4:31 AM, delara aghaie wrote:
> Dear Gromacs users,
> We want to simulate the HSA protein using 8 processors. Usually, on our available system, a 10 ns simulation on 8 processors takes 2-3 days.
> This time we submitted a 10 ns simulation. After almost 6 days it finished, but when we look at the md.log file, only 179411 steps have been completed and all the averages are over this time, although we submitted the run with 5000000 steps = 10 ns.
> 1) First, I want to know why the log file shows the end of the simulation although only 0.35 ns has been simulated. If we compute the RDF, the time again shows that only 0.35 ns was completed.
> You can see the final part of the log file below.
> 2) The second point is that the average load imbalance is shown as 110.3 %, while for previous runs it was 3-4 %.
> What can be the reason for this high load imbalance?
> And is this responsible for the lower simulation efficiency and longer run time?
Something is unhealthy with the nodes you ran on; either they had other jobs
running that competed with yours or something is simply wrong with one or more
of them.
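
One rough way to see where the time went is to rank the cycle accounting rows from the quoted md.log by their share of the run. The sketch below is only illustrative: the `rank_rows` helper and its regular expression are hypothetical names made up for this example, not part of GROMACS, and the rows are copied from the log quoted in this message.

```python
import re

# A few rows copied from the cycle accounting table in the quoted md.log.
LOG_ROWS = """\
Domain decomp. 8 35882 556174.638 209101.4 5.1
Comm. coord. 8 179411 407595.614 153241.1 3.7
Force 8 179411 205203.393 77149.0 1.9
Wait + Comm. F 8 179411 471040.171 177093.9 4.3
PME mesh 8 179411 8699277.459 3270611.0 79.1
Constraints 8 179411 438837.169 164986.8 4.0
Rest 8 6090.776 2289.9 0.1
"""

# Fields: name, node count, optional call count, G-cycles, seconds, percent.
ROW = re.compile(
    r"^\s*(?P<name>.+?)\s+(?P<nodes>\d+)(?:\s+(?P<calls>\d+))?"
    r"\s+(?P<gcycles>\d+\.\d+)\s+(?P<seconds>\d+\.\d+)\s+(?P<pct>\d+\.\d+)\s*$"
)


def rank_rows(text):
    """Return (name, percent) pairs sorted by percent, largest first."""
    rows = []
    for line in text.splitlines():
        # Drop any leading email quote marker before matching.
        m = ROW.match(line.lstrip("> "))
        if m:
            rows.append((m.group("name"), float(m.group("pct"))))
    return sorted(rows, key=lambda r: r[1], reverse=True)


for name, pct in rank_rows(LOG_ROWS):
    print(f"{name:<16} {pct:5.1f} %")
```

For the rows above, PME mesh comes first at 79.1 %, while Force accounts for only 1.9 %.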
> And most importantly, why is the log file written as if the simulation had completed when only 0.3 ns has passed?
Something caused the run to end, and when that happens, mdrun writes the
statistics to the .log file. Presumably you hit a wallclock limit or something
that told mdrun to stop.
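
The numbers in the log can be checked by hand to confirm that it reports what actually ran before mdrun stopped. A minimal sketch, assuming a 2 fs time step (inferred from "5000000 steps = 10 ns", since the .mdp file is not shown); the variable names are hypothetical and the values are copied from the quoted log:

```python
# Values copied from the quoted md.log and the original message.
steps_done = 179411
steps_requested = 5_000_000
wall_seconds = 517026.991

# Assumption: dt is inferred from "5000000 steps = 10 ns", i.e. 2 fs per step.
dt_ns = 10.0 / steps_requested

simulated_ns = steps_done * dt_ns          # simulated time actually covered
wall_days = wall_seconds / 86400.0         # wallclock time in days
ns_per_day = simulated_ns / wall_days      # throughput
hours_per_ns = 24.0 / ns_per_day           # inverse throughput
fraction_done = steps_done / steps_requested

print(f"simulated: {simulated_ns:.3f} ns ({fraction_done:.1%} of the request)")
print(f"rate: {ns_per_day:.3f} ns/day, {hours_per_ns:.1f} hour/ns")
```

This gives about 0.36 ns of simulated time and reproduces the 0.060 ns/day and roughly 400 hour/ns reported under Performance, so the averages in the log cover only the part of the run that finished.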
-Justin
> Thanks for your time
> **********************************************************
> D O M A I N   D E C O M P O S I T I O N   S T A T I S T I C S
>
> av. #atoms communicated per step for force: 2 x 97992.1
> av. #atoms communicated per step for LINCS: 2 x 2373.8
>
> Average load imbalance: 110.3 %
> Part of the total run time spent waiting due to load imbalance: 1.4 %
>
>
> R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>
> Computing:        Nodes     Number       G-Cycles     Seconds      %
> -----------------------------------------------------------------------
> Domain decomp.        8      35882     556174.638    209101.4    5.1
> DD comm. load         8        359        235.791        88.6    0.0
> Comm. coord.          8     179411     407595.614    153241.1    3.7
> Neighbor search       8      35883      39333.698     14788.0    0.4
> Force                 8     179411     205203.393     77149.0    1.9
> Wait + Comm. F        8     179411     471040.171    177093.9    4.3
> PME mesh              8     179411    8699277.459   3270611.0   79.1
> Write traj.           8        752      15563.463      5851.3    0.1
> Update                8     179411      12991.140      4884.2    0.1
> Constraints           8     179411     438837.169    164986.8    4.0
> Comm. energies        8      35884     149298.111     56130.6    1.4
> Rest                  8                  6090.776      2289.9    0.1
> -----------------------------------------------------------------------
> Total                 8              11001641.424   4136215.9  100.0
> -----------------------------------------------------------------------
> -----------------------------------------------------------------------
> PME redist. X/F       8     358822    1446552.180    543850.9   13.1
> PME spread/gather     8     358822     820940.059    308643.5    7.5
> PME 3D-FFT            8     358822    6426689.995   2416201.0   58.4
> PME solve             8     179411       5033.523      1892.4    0.0
> -----------------------------------------------------------------------
>
> Parallel run - timing based on wallclock.
>
>                NODE (s)   Real (s)      (%)
>        Time: 517026.991 517026.991    100.0
>                        5d23h37:06
>                (Mnbf/s)   (MFlops)   (ns/day)  (hour/ns)
> Performance:      9.634    506.361      0.060    400.250
> Finished mdrun on node 0 Tue Feb 25 13:34:48 2014
>
--
==================================================
Justin A. Lemkul, Ph.D.
Ruth L. Kirschstein NRSA Postdoctoral Fellow
Department of Pharmaceutical Sciences
School of Pharmacy
Health Sciences Facility II, Room 601
University of Maryland, Baltimore
20 Penn St.
Baltimore, MD 21201
jalemkul at outerbanks.umaryland.edu | (410) 706-7441
http://mackerell.umaryland.edu/~jalemkul
==================================================