[gmx-users] Re: mdrun -nosum still complains that 15 % of the runtime was spent communicating energies
chris.neale at utoronto.ca
Tue Jul 21 14:47:09 CEST 2009
Thanks Mark, I'll respond inline.
> Chris Neale wrote:
>> I have now tested with and without -nosum and it appears that the option
>> is working (see 51 vs. 501 Number of communications) but that the total
>> amount of time communicating energies didn't go down by very much. Seems
>> strange to me. Anybody have any ideas if this is normal?
>
> Seems strange, but perhaps a 45-second test is not sufficiently long to
> demonstrate suitable scaling.
Agreed, although I find it a useful jumping-off point, especially now
that we need to optimize -npme. I use quick tests to narrow down the
range of node counts and -npme values that is likely to scale best, and
then fine-tune with longer scaling tests.
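For what it's worth, the kind of quick scan I mean looks roughly like
this (just a sketch; the binary path, the -deffnm name, and the candidate
-npme values are placeholders from my own setup, not a recommendation):

  # Short benchmark runs over a few -npme values; everything here is a
  # placeholder sketch, adjust paths and values for your own system.
  MDRUN=/scratch/cneale/exe/intel/gromacs-4.0.4/exec/bin/mdrun
  NP=$(wc -l < $PBS_NODEFILE)
  for NPME in 96 112 128 144; do
    mpirun -np $NP -machinefile $PBS_NODEFILE \
      $MDRUN -deffnm test -nosum -npme $NPME -maxh 0.02
    cp test.log npme_${NPME}.log               # keep the timing breakdown
    grep -A1 "(ns/day)" test.log | tail -n 1   # quick ns/day comparison
  done

The longer scaling runs then only need to cover the one or two -npme
values that come out on top.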
> There's no discussion in the 4.0.5 release
> notes of a relevant change to -nosum, but there has been a change:
> http://oldwww.gromacs.org/content/view/181/132/.
Thanks, I did see this but don't think that it is related to this
issue, which I have now confirmed in both 4.0.4 and 4.0.5.
>
>> At the very least, I suggest adding an if statement to mdrun so that it
>> doesn't output the -nosum usage note if the user did in fact use -nosum
>> in that run.
>>
>>
>> Without using -nosum:
>>
>> R E A L C Y C L E A N D T I M E A C C O U N T I N G
>>
>> Computing: Nodes Number G-Cycles Seconds %
>> -----------------------------------------------------------------------
>> ...
>> Write traj. 256 2 233.218 93.7 0.5
>> Update 256 501 777.511 312.5 1.7
>> Constraints 256 1002 1203.894 483.9 2.7
>> Comm. energies 256 501 7397.995 2973.9 16.5
>> Rest 256 128.058 51.5 0.3
>> -----------------------------------------------------------------------
>> Total 384 44897.468 18048.0 100.0
>> -----------------------------------------------------------------------
>>
>> NOTE: 16 % of the run time was spent communicating energies,
>> you might want to use the -nosum option of mdrun
>>
>>
>> Parallel run - timing based on wallclock.
>>
>> NODE (s) Real (s) (%)
>> Time: 47.000 47.000 100.0
>> (Mnbf/s) (GFlops) (ns/day) (hour/ns)
>> Performance: 13485.788 712.634 1.842 13.029
>> Finished mdrun on node 0 Mon Jul 20 12:53:41 2009
>>
>> #########
>>
>> And using -nosum:
>>
>> R E A L C Y C L E A N D T I M E A C C O U N T I N G
>> Computing: Nodes Number G-Cycles Seconds %
>> -----------------------------------------------------------------------
>> ...
>> Write traj. 256 2 213.521 83.3 0.5
>> Update 256 501 776.606 303.0 1.8
>> Constraints 256 1002 1200.285 468.2 2.7
>> Comm. energies 256 51 6926.667 2702.1 15.6
>> Rest 256 127.503 49.7 0.3
>> -----------------------------------------------------------------------
>> Total 384 44296.670 17280.0 100.0
>> -----------------------------------------------------------------------
>>
>> NOTE: 16 % of the run time was spent communicating energies,
>> you might want to use the -nosum option of mdrun
>>
>>
>> Parallel run - timing based on wallclock.
>>
>> NODE (s) Real (s) (%)
>> Time: 45.000 45.000 100.0
>> (Mnbf/s) (GFlops) (ns/day) (hour/ns)
>> Performance: 14084.547 744.277 1.924 12.475
>>
>> #########
>>
>> Thanks,
>> Chris.
>>
>> Chris Neale wrote:
>>> Hello,
>>>
>>> I have been running simulations on a larger number of processors
>>> recently and am confused about the message regarding -nosum that
>>> occurs at the end of the .log file. In this case, I have included the
>>> -nosum option to mdrun and I still get this warning (gromacs 4.0.4).
>>>
>>> My command was:
>>> mpirun -np $(wc -l $PBS_NODEFILE | gawk '{print $1}') -machinefile
>>> $PBS_NODEFILE /scratch/cneale/exe/intel/gromacs-4.0.4/exec/bin/mdrun
>>> -deffnm test -nosum -npme 128
>
> Perhaps assigning the result of this to a variable and printing it
> before executing it would help confirm that -nosum really was there.
I am not sure what you mean... the whole command line as a variable? I'm
pretty sure that -nosum is there.
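If you mean something like the following, I can add that to the
submission script (just a sketch of how I read the suggestion; the echo
is only there to record exactly what gets executed):

  # Build the command in a variable, print it, then run it, so the job's
  # stdout shows the exact command line that was launched.
  NP=$(wc -l < $PBS_NODEFILE)
  CMD="mpirun -np $NP -machinefile $PBS_NODEFILE \
       /scratch/cneale/exe/intel/gromacs-4.0.4/exec/bin/mdrun \
       -deffnm test -nosum -npme 128"
  echo "mdrun command: $CMD"
  $CMD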
>
> Your mdrun output from your first email was...
>
>>> #########
>>>
>>> To confirm that I am asking mdrun for -nosum, to stderr I get:
>>> ...
>>> Option Type Value Description
>>> ------------------------------------------------------
>>> -[no]h bool no Print help info and quit
>>> -nice int 0 Set the nicelevel
>>> -deffnm string test Set the default filename for all file options
>>> -[no]xvgr bool yes Add specific codes (legends etc.) in the
>>> output
>>> xvg files for the xmgrace program
>>> -[no]pd bool no Use particle decomposition
>>> -dd vector 0 0 0 Domain decomposition grid, 0 is optimize
>>> -npme int 128 Number of separate nodes to be used for
>>> PME, -1
>>> is guess
>>> -ddorder enum interleave DD node order: interleave, pp_pme or
>>> cartesian
>>> -[no]ddcheck bool yes Check for all bonded interactions with DD
>>> -rdd real 0 The maximum distance for bonded
>>> interactions with
>>> DD (nm), 0 is determine from initial
>>> coordinates
>>> -rcon real 0 Maximum distance for P-LINCS (nm), 0 is
>>> estimate
>>> -dlb enum auto Dynamic load balancing (with DD): auto, no
>>> or yes
>>> -dds real 0.8 Minimum allowed dlb scaling of the DD cell
>>> size
>>> -[no]sum bool no Sum the energies at every step
>>> -[no]v bool no Be loud and noisy
>>> -[no]compact bool yes Write a compact log file
>>> -[no]seppot bool no Write separate V and dVdl terms for each
>>> interaction type and node to the log file(s)
>>> -pforce real -1 Print all forces larger than this (kJ/mol nm)
>>> -[no]reprod bool no Try to avoid optimizations that affect binary
>>> reproducibility
>>> -cpt real 15 Checkpoint interval (minutes)
>>> -[no]append bool no Append to previous output files when
>>> continuing
>>> from checkpoint
>>> -[no]addpart bool yes Add the simulation part number to all output
>>> files when continuing from checkpoint
>>> -maxh real -1 Terminate after 0.99 times this time (hours)
>>> -multi int 0 Do multiple simulations in parallel
>>> -replex int 0 Attempt replica exchange every # steps
>>> -reseed int -1 Seed for replica exchange, -1 is generate
>>> a seed
>>> -[no]glas bool no Do glass simulation with special long range
>>> corrections
>>> -[no]ionize bool no Do a simulation including the effect of an
>>> X-Ray
>>> bombardment on your system
>>> ...
>>>
>>> ########
>
> ... and this does not demonstrate -nosum. Either you've mismatched, or
> the command line has lost the -nosum, or there's a bug.
The sections are not mismatched, but thanks for looking in such
detail. Perhaps I misunderstand the -[no]sum line, which I have copied
here from the output above:
-[no]sum bool no Sum the energies at every step
> The fact that
> the number for "Comm. energies" decreases suggests you have done it
> correctly, though. Perhaps the contents of this variable are being
> incorrectly propagated through the code.
This is what I think is most likely the case. I'll take a look, but my
GROMACS coding endeavours have not been highly successful in the past.
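My plan is just to grep for the relevant strings and follow the flag
from there (the directory layout is from the 4.0.x tarball as I remember
it, so treat the paths as guesses):

  # Locate where the -[no]sum option is defined and where the NOTE is
  # printed, then trace the flag between the two; paths are guesses.
  cd gromacs-4.0.5/src
  grep -rn "Sum the energies at every step" .   # the mdrun option definition
  grep -rn "spent communicating energies" .     # where the NOTE is generated
  grep -rln "Comm. energies" .                  # the cycle/time accounting code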
Thanks again,
Chris.
>
> Mark
>
>>> And the message at the end of the .log file is:
>>> ...
>>> D O M A I N D E C O M P O S I T I O N S T A T I S T I C S
>>>
>>> av. #atoms communicated per step for force: 2 x 3376415.3
>>> av. #atoms communicated per step for LINCS: 2 x 192096.6
>>>
>>> Average load imbalance: 11.7 %
>>> Part of the total run time spent waiting due to load imbalance: 7.9 %
>>> Steps where the load balancing was limited by -rdd, -rcon and/or -dds:
>>> X 0 % Y 0 % Z 0 %
>>> Average PME mesh/force load: 0.620
>>> Part of the total run time spent waiting due to PP/PME imbalance: 10.0 %
>>>
>>> NOTE: 7.9 % performance was lost due to load imbalance
>>> in the domain decomposition.
>>>
>>> NOTE: 10.0 % performance was lost because the PME nodes
>>> had less work to do than the PP nodes.
>>> You might want to decrease the number of PME nodes
>>> or decrease the cut-off and the grid spacing.
>>>
>>>
>>> R E A L C Y C L E A N D T I M E A C C O U N T I N G
>>>
>>> Computing: Nodes Number G-Cycles Seconds %
>>> -----------------------------------------------------------------------
>>> Domain decomp. 256 51 337.551 131.2 0.7
>>> Send X to PME 256 501 59.454 23.1 0.1
>>> Comm. coord. 256 501 289.936 112.7 0.6
>>> Neighbor search 256 51 1250.088 485.9 2.8
>>> Force 256 501 16105.584 6259.9 35.4
>>> Wait + Comm. F 256 501 2441.390 948.9 5.4
>>> PME mesh 128 501 5552.336 2158.1 12.2
>>> Wait + Comm. X/F 128 501 9586.486 3726.1 21.1
>>> Wait + Recv. PME F 256 501 459.752 178.7 1.0
>>> Write traj. 256 2 223.993 87.1 0.5
>>> Update 256 501 777.618 302.2 1.7
>>> Constraints 256 1002 1223.093 475.4 2.7
>>> Comm. energies 256 51 7011.309 2725.1 15.4
>>> Rest 256 127.710 49.6 0.3
>>> -----------------------------------------------------------------------
>>> Total 384 45446.299 17664.0 100.0
>>> -----------------------------------------------------------------------
>>>
>>> NOTE: 15 % of the run time was spent communicating energies,
>>> you might want to use the -nosum option of mdrun
>>>
>>>
>>> Parallel run - timing based on wallclock.
>>>
>>> NODE (s) Real (s) (%)
>>> Time: 46.000 46.000 100.0
>>> (Mnbf/s) (GFlops) (ns/day) (hour/ns)
>>> Performance: 13778.036 728.080 1.882 12.752
>>>
>>> ########
>>>
>>> Thanks,
>>> Chris
>>
>
>
> ------------------------------
>
> Date: Tue, 21 Jul 2009 09:12:31 +0200
> From: patrick fuchs <patrick.fuchs at univ-paris-diderot.fr>
> Subject: Re: [gmx-users] problem with 53a6 simulating a coiled-coil
>          fiber like protein
>
>>> Well, not coiled-coil, but I have observed serious distortions in an
>>> all-helix protein with G53A6 and reaction field. Using PME solved the
>>> problem.
>>
>> Indeed. rlist = 0.8 and any kind of electrostatic cut-off was acceptable
>> in about the 1980s :-)
> Well, this is how the force field was parameterized with the force-field
> correction in 2004... But I do agree that G53a6 with RF is 'helix
> unfriendly'; I have never tried it with PME, though.
>
> Patrick
>
>>
>> Mark
>>
>>> Marcos
>>>
>>> On Mon, 2009-07-20 at 13:50 +0200, Lory Montout wrote:
>>>> Dear all
>>>>
>>>> I recently performed MD simulations using the 53A6 force field with
>>>> Gromacs 4.0. The system includes a protein, water, and ions for
>>>> neutralization. The protocol is quite classical: NPT ensemble, 300 K,
>>>> reaction field for electrostatics, a 2 fs integration time step, and
>>>> constrained bond lengths.
>>>> The protein is a coiled-coil, fiber-like protein built from different
>>>> repeat units. At the starting point, the protein roughly adopts a
>>>> cylindrical shape. After a few ns (less than 5), some helices break or
>>>> even unfold. Finally, the protein is kinked, with a kink angle of ~90°.
>>>> I tested different constructions but observed similar results.
>>>> The same system was simulated with NAMD and the CHARMM force field; the
>>>> structure remains stable throughout the simulation (10 ns so far).
>>>> Has anyone obtained similar results for a coiled-coil system with the
>>>> 53A6 force field?
>>>>
>>>> here is my .mdp file :
>>>>
>>>> nstvout = 10000
>>>> nstfout = 0
>>>> nstxtcout = 2500
>>>> xtc_precision = 1000
>>>> nstlog = 500
>>>> nstenergy = 500
>>>> nstlist = 5
>>>> rlist = 0.8
>>>> coulombtype = generalized-reaction-field
>>>> rcoulomb = 1.4
>>>> rvdw = 1.4
>>>> epsilon_rf = 62.0
>>>> ; Temperature coupling is on in two groups
>>>> Tcoupl = Berendsen
>>>> tc-grps = Protein Non-Protein
>>>> tau_t = 0.1 0.1
>>>> ref_t = 300 300
>>>> ; Energy monitoring
>>>> energygrps = Protein SOL NA+
>>>> ; Pressure coupling is on
>>>> Pcoupl = Berendsen
>>>> tau_p = 1.0
>>>> compressibility = 4.5e-5
>>>> ref_p = 1.0
>>>>
>>>> Thanks a lot for your answers.
>>>
>
> --
> _______________________________________________________________________
> !!!! new E-mail address: patrick.fuchs at univ-paris-diderot.fr !!!!
> Patrick FUCHS
> Dynamique des Structures et Interactions des Macromolécules Biologiques
> INTS, INSERM UMR-S665, Université Paris Diderot,
> 6 rue Alexandre Cabanel, 75015 Paris
> Tel : +33 (0)1-44-49-30-57 - Fax : +33 (0)1-47-34-74-31
> Web Site: http://www.dsimb.inserm.fr/~fuchs
>