[gmx-users] Re: mdrun -nosum still complains that 15 % of the run time was spent communicating energies
Mark Abraham
Mark.Abraham at anu.edu.au
Tue Jul 21 03:44:58 CEST 2009
Chris Neale wrote:
> I have now tested with and without -nosum and it appears that the option
> is working (compare 51 vs. 501 under "Number" for Comm. energies), but the
> total time spent communicating energies barely went down. That seems
> strange to me. Does anybody know whether this is normal?
Seems strange, but perhaps a 45-second test is not long enough to
demonstrate representative scaling. There's no discussion in the 4.0.5
release notes of a change relevant to -nosum, but there has been a
change: http://oldwww.gromacs.org/content/view/181/132/.
> At the very least, I suggest adding an if statement to mdrun so that it
> doesn't output the -nosum usage note if the user did in fact use -nosum
> in that run.
>
>
> Without using -nosum:
>
> R E A L C Y C L E A N D T I M E A C C O U N T I N G
>
> Computing:         Nodes   Number     G-Cycles    Seconds       %
> -----------------------------------------------------------------------
> ...
> Write traj.          256        2      233.218       93.7      0.5
> Update               256      501      777.511      312.5      1.7
> Constraints          256     1002     1203.894      483.9      2.7
> Comm. energies       256      501     7397.995     2973.9     16.5
> Rest                 256               128.058       51.5      0.3
> -----------------------------------------------------------------------
> Total                384             44897.468    18048.0    100.0
> -----------------------------------------------------------------------
>
> NOTE: 16 % of the run time was spent communicating energies,
> you might want to use the -nosum option of mdrun
>
>
> Parallel run - timing based on wallclock.
>
>                NODE (s)   Real (s)      (%)
>        Time:     47.000     47.000    100.0
>                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
> Performance:  13485.788    712.634      1.842     13.029
> Finished mdrun on node 0 Mon Jul 20 12:53:41 2009
>
> #########
>
> And using -nosum:
>
> R E A L C Y C L E A N D T I M E A C C O U N T I N G
> Computing:         Nodes   Number     G-Cycles    Seconds       %
> -----------------------------------------------------------------------
> ...
> Write traj.          256        2      213.521       83.3      0.5
> Update               256      501      776.606      303.0      1.8
> Constraints          256     1002     1200.285      468.2      2.7
> Comm. energies       256       51     6926.667     2702.1     15.6
> Rest                 256               127.503       49.7      0.3
> -----------------------------------------------------------------------
> Total                384             44296.670    17280.0    100.0
> -----------------------------------------------------------------------
>
> NOTE: 16 % of the run time was spent communicating energies,
> you might want to use the -nosum option of mdrun
>
>
> Parallel run - timing based on wallclock.
>
>                NODE (s)   Real (s)      (%)
>        Time:     45.000     45.000    100.0
>                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
> Performance:  14084.547    744.277      1.924     12.475
>
> #########
>
> Thanks,
> Chris.
>
> Chris Neale wrote:
>> Hello,
>>
>> I have been running simulations on a larger number of processors
>> recently and am confused about the message regarding -nosum that
>> appears at the end of the .log file. In this case I included the
>> -nosum option to mdrun and I still get the warning (GROMACS 4.0.4).
>>
>> My command was:
>> mpirun -np $(wc -l $PBS_NODEFILE | gawk '{print $1}') -machinefile
>> $PBS_NODEFILE /scratch/cneale/exe/intel/gromacs-4.0.4/exec/bin/mdrun
>> -deffnm test -nosum -npme 128
Perhaps assigning this whole command to a variable and printing it
before executing it would help confirm that -nosum really was there.
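Something like this untested sketch would do it (the node-count logic
and paths are just repackaged from your command above):

  # Count the nodes; equivalent to your wc | gawk pipeline
  NPROCS=$(wc -l < $PBS_NODEFILE)
  MDRUN=/scratch/cneale/exe/intel/gromacs-4.0.4/exec/bin/mdrun
  # Build the command once, echo it so it ends up in the job's stdout,
  # then run it
  CMD="mpirun -np $NPROCS -machinefile $PBS_NODEFILE $MDRUN -deffnm test -nosum -npme 128"
  echo "Running: $CMD"
  eval "$CMD"

That way the job output records the exact command line, including
whether -nosum survived whatever your submission script does to it.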
Your mdrun output from your first email was...
>> #########
>>
>> To confirm that I am passing -nosum to mdrun, the stderr output shows:
>> ...
>> Option       Type   Value   Description
>> ------------------------------------------------------
>> -[no]h       bool   no      Print help info and quit
>> -nice        int    0       Set the nicelevel
>> -deffnm      string test    Set the default filename for all file options
>> -[no]xvgr    bool   yes     Add specific codes (legends etc.) in the output
>>                             xvg files for the xmgrace program
>> -[no]pd      bool   no      Use particle decompostion
>> -dd          vector 0 0 0   Domain decomposition grid, 0 is optimize
>> -npme        int    128     Number of separate nodes to be used for PME,
>>                             -1 is guess
>> -ddorder     enum   interleave  DD node order: interleave, pp_pme or
>>                             cartesian
>> -[no]ddcheck bool   yes     Check for all bonded interactions with DD
>> -rdd         real   0       The maximum distance for bonded interactions
>>                             with DD (nm), 0 is determine from initial
>>                             coordinates
>> -rcon        real   0       Maximum distance for P-LINCS (nm), 0 is estimate
>> -dlb         enum   auto    Dynamic load balancing (with DD): auto, no or yes
>> -dds         real   0.8     Minimum allowed dlb scaling of the DD cell size
>> -[no]sum     bool   no      Sum the energies at every step
>> -[no]v       bool   no      Be loud and noisy
>> -[no]compact bool   yes     Write a compact log file
>> -[no]seppot  bool   no      Write separate V and dVdl terms for each
>>                             interaction type and node to the log file(s)
>> -pforce      real   -1      Print all forces larger than this (kJ/mol nm)
>> -[no]reprod  bool   no      Try to avoid optimizations that affect binary
>>                             reproducibility
>> -cpt         real   15      Checkpoint interval (minutes)
>> -[no]append  bool   no      Append to previous output files when continuing
>>                             from checkpoint
>> -[no]addpart bool   yes     Add the simulation part number to all output
>>                             files when continuing from checkpoint
>> -maxh        real   -1      Terminate after 0.99 times this time (hours)
>> -multi       int    0       Do multiple simulations in parallel
>> -replex      int    0       Attempt replica exchange every # steps
>> -reseed      int    -1      Seed for replica exchange, -1 is generate a seed
>> -[no]glas    bool   no      Do glass simulation with special long range
>>                             corrections
>> -[no]ionize  bool   no      Do a simulation including the effect of an X-Ray
>>                             bombardment on your system
>> ...
>>
>> ########
... and this does not conclusively demonstrate -nosum. Either you've
mismatched the outputs, the command line has lost the -nosum somewhere,
or there's a bug. The fact that the "Number" for "Comm. energies" is
reduced suggests you have done it correctly, though. Perhaps the value
of that option is being propagated incorrectly through the code.
Mark
>> And the message at the end of the .log file is:
>> ...
>> D O M A I N D E C O M P O S I T I O N S T A T I S T I C S
>>
>> av. #atoms communicated per step for force: 2 x 3376415.3
>> av. #atoms communicated per step for LINCS: 2 x 192096.6
>>
>> Average load imbalance: 11.7 %
>> Part of the total run time spent waiting due to load imbalance: 7.9 %
>> Steps where the load balancing was limited by -rdd, -rcon and/or -dds:
>> X 0 % Y 0 % Z 0 %
>> Average PME mesh/force load: 0.620
>> Part of the total run time spent waiting due to PP/PME imbalance: 10.0 %
>>
>> NOTE: 7.9 % performance was lost due to load imbalance
>> in the domain decomposition.
>>
>> NOTE: 10.0 % performance was lost because the PME nodes
>> had less work to do than the PP nodes.
>> You might want to decrease the number of PME nodes
>> or decrease the cut-off and the grid spacing.
>>
>>
>> R E A L C Y C L E A N D T I M E A C C O U N T I N G
>>
>> Computing:         Nodes   Number     G-Cycles    Seconds       %
>> -----------------------------------------------------------------------
>> Domain decomp.       256       51      337.551      131.2      0.7
>> Send X to PME        256      501       59.454       23.1      0.1
>> Comm. coord.         256      501      289.936      112.7      0.6
>> Neighbor search      256       51     1250.088      485.9      2.8
>> Force                256      501    16105.584     6259.9     35.4
>> Wait + Comm. F       256      501     2441.390      948.9      5.4
>> PME mesh             128      501     5552.336     2158.1     12.2
>> Wait + Comm. X/F     128      501     9586.486     3726.1     21.1
>> Wait + Recv. PME F   256      501      459.752      178.7      1.0
>> Write traj.          256        2      223.993       87.1      0.5
>> Update               256      501      777.618      302.2      1.7
>> Constraints          256     1002     1223.093      475.4      2.7
>> Comm. energies       256       51     7011.309     2725.1     15.4
>> Rest                 256               127.710       49.6      0.3
>> -----------------------------------------------------------------------
>> Total                384             45446.299    17664.0    100.0
>> -----------------------------------------------------------------------
>>
>> NOTE: 15 % of the run time was spent communicating energies,
>> you might want to use the -nosum option of mdrun
>>
>>
>> Parallel run - timing based on wallclock.
>>
>>                NODE (s)   Real (s)      (%)
>>        Time:     46.000     46.000    100.0
>>                (Mnbf/s)   (GFlops)   (ns/day)  (hour/ns)
>> Performance:  13778.036    728.080      1.882     12.752
>>
>> ########
>>
>> Thanks,
>> Chris
>