[gmx-users] Losing part of the available CPU time
Szilárd Páll
pall.szilard at gmail.com
Mon Aug 15 14:52:30 CEST 2016
Hi,
Please post full logs; cutting parts out of the file often removes
information that is needed to diagnose your issue.
At first sight it seems that you simply have an imbalanced system. I am
not sure about the source of the imbalance, and without knowing more about
your system/setup and how it is decomposed, what I can suggest is to
try other decomposition schemes or simply less decomposition (use more
OpenMP threads per rank or fewer cores).
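For example, something along these lines might be worth trying. The rank
and thread counts, the launcher, and the -deffnm name below are purely
illustrative assumptions about your machine and files, not a recommendation
for your exact setup:

  # half as many domains, 2 OpenMP threads per rank, explicit PME rank count
  mpirun -np 64 gmx_mpi mdrun -ntomp 2 -npme 16 -dlb yes -deffnm md

Fewer, larger domains are usually easier to balance, and -dlb yes turns
dynamic load balancing on from the start instead of waiting for mdrun to
detect the imbalance.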
Additionally, you also have a pretty bad PP-PME load balance, but that is
likely to improve once your PP performance gets better.
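If the PME mesh/force load stays well below 1 even after the PP side
improves, you could also let GROMACS scan for a better PP/PME split; a
rough sketch, assuming your run input is in md.tpr (placeholder name):

  # benchmark different numbers of separate PME ranks for 128 ranks in total
  gmx tune_pme -np 128 -s md.tpr

or simply rerun with a smaller -npme (e.g. 16 instead of 32) and compare
the timings.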
Cheers,
--
Szilárd
On Sun, Aug 14, 2016 at 3:23 PM, Alexander Alexander
<alexanderwien2k at gmail.com> wrote:
> Dear gromacs user,
>
> My free energy calculation works well; however, I am losing around 56.5 %
> of the available CPU time as stated in my log file, which is really
> considerable. The problem is due to load imbalance and domain
> decomposition, but I have no idea how to improve it. Below is the very end
> of my log file, and I would appreciate any help in avoiding this.
>
>
> D O M A I N   D E C O M P O S I T I O N   S T A T I S T I C S
>
> av. #atoms communicated per step for force: 2 x 115357.4
> av. #atoms communicated per step for LINCS: 2 x 2389.1
>
> Average load imbalance: 285.9 %
> Part of the total run time spent waiting due to load imbalance: 56.5 %
> Steps where the load balancing was limited by -rdd, -rcon and/or -dds: X 2 % Y 2 % Z 2 %
> Average PME mesh/force load: 0.384
> Part of the total run time spent waiting due to PP/PME imbalance: 14.5 %
>
> NOTE: 56.5 % of the available CPU time was lost due to load imbalance
> in the domain decomposition.
>
> NOTE: 14.5 % performance was lost because the PME ranks
> had less work to do than the PP ranks.
> You might want to decrease the number of PME ranks
> or decrease the cut-off and the grid spacing.
>
>
> R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>
> On 96 MPI ranks doing PP, and
> on 32 MPI ranks doing PME
>
> Computing:          Num     Num      Call    Wall time     Giga-Cycles
>                    Ranks  Threads    Count      (s)       total sum     %
> -----------------------------------------------------------------------------
> Domain decomp.        96      1     175000      242.339     53508.472   0.5
> DD comm. load         96      1     174903        9.076      2003.907   0.0
> DD comm. bounds       96      1     174901       27.054      5973.491   0.1
> Send X to PME         96      1    7000001       44.342      9790.652   0.1
> Neighbor search       96      1     175001      251.994     55640.264   0.6
> Comm. coord.          96      1    6825000     1521.009    335838.747   3.4
> Force                 96      1    7000001     7001.990   1546039.264  15.5
> Wait + Comm. F        96      1    7000001    10761.296   2376093.759  23.8
> PME mesh *            32      1    7000001    11796.344    868210.788   8.7
> PME wait for PP *                              22135.752   1629191.096  16.3
> Wait + Recv. PME F    96      1    7000001      393.117     86800.265   0.9
> NB X/F buffer ops.    96      1   20650001      132.713     29302.991   0.3
> COM pull force        96      1    7000001      165.613     36567.368   0.4
> Write traj.           96      1       7037       55.020     12148.457   0.1
> Update                96      1   14000002      140.972     31126.607   0.3
> Constraints           96      1   14000002    12871.236   2841968.551  28.4
> Comm. energies        96      1     350001      261.976     57844.219   0.6
> Rest                                              52.349     11558.715   0.1
> -----------------------------------------------------------------------------
> Total                                          33932.096   9989607.639 100.0
> -----------------------------------------------------------------------------
> (*) Note that with separate PME ranks, the walltime column actually sums to
> twice the total reported, but the cycle count total and % are correct.
> -----------------------------------------------------------------------------
> Breakdown of PME mesh computation
> -----------------------------------------------------------------------------
> PME redist. X/F       32      1   21000003     2334.608    171827.143   1.7
> PME spread/gather     32      1   28000004     3640.870    267967.972   2.7
> PME 3D-FFT            32      1   28000004     1587.105    116810.882   1.2
> PME 3D-FFT Comm.      32      1   56000008     4066.097    299264.666   3.0
> PME solve Elec        32      1   14000002      148.284     10913.728   0.1
> -----------------------------------------------------------------------------
>
>                Core t (s)   Wall t (s)        (%)
>        Time:  4341204.790    33932.096    12793.8
>                          9h25:32
>                  (ns/day)    (hour/ns)
> Performance:       35.648        0.673
> Finished mdrun on rank 0 Sat Aug 13 23:45:45 2016
>
> Thanks,
> Regards,
> Alex