[gmx-users] 2018: large performance variations

Szilárd Páll pall.szilard at gmail.com
Fri Mar 2 19:29:10 CET 2018


BTW, we have considered adding a warm-up delay to the tuner; would you be
willing to help test it (or even contribute such a feature)?

--
Szilárd

On Fri, Mar 2, 2018 at 7:28 PM, Szilárd Páll <pall.szilard at gmail.com> wrote:

> Hi Michael,
>
> Can you post full logs, please? This is likely related to a known issue
> where CPU cores (and in some cases GPUs too) take longer to clock up and
> reach stable performance than the auto-tuner takes to do a few cycles of
> measurements.
>
> Unfortunately we do not have a good solution for this, but what you can do
> to make runs more consistent is:
> - try "warming up" the CPU/GPU before production runs, e.g. with stress -c
> or just a dummy 30-second mdrun (see the sketch below)
> - repeat the benchmark a few times, see which cutoff / grid setting is
> best, set that in the mdp options and run with -notunepme
>
> Of course the latter may be too tedious if you have a variety of
> systems/inputs to run.
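>
> A minimal shell sketch of that workflow (untested; the step counts, the
> -deffnm prefixes and the mcz1_fixed.tpr name are placeholders to adapt):
>
>   # warm the CPU/GPU up so the clocks are stable before the tuner starts
>   gmx mdrun -s mcz1.tpr -nsteps 20000 -deffnm warmup
>
>   # repeat the benchmark a few times and compare the tuner's choices
>   for i in 1 2 3; do
>       gmx mdrun -s mcz1.tpr -nsteps 50000 -deffnm bench$i
>       grep "optimal pme grid" bench$i.log
>   done
>
>   # put the best grid/cutoff into the mdp (fourier-nx/ny/nz, rcoulomb),
>   # regenerate the tpr, then run production without tuning:
>   gmx mdrun -s mcz1_fixed.tpr -notunepme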
>
> Regarding tune_pme: that error comes from resetting the timing counters
> while PME tuning is still active (for -resetstep see gmx mdrun -h -hidden);
> I am not sure we have a fix, but either way tune_pme is better suited for
> tuning the number of separate PME ranks in parallel runs.
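>
> For completeness, what the error message suggests would look roughly like
> this (the reset step is a guess; it just needs to fall after the tuner
> has finished):
>
>   gmx mdrun -s mcz1.tpr -resetstep 10000 -nsteps 20000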
>
> Cheers,
>
> --
> Szilárd
>
> On Thu, Mar 1, 2018 at 7:11 PM, Michael Brunsteiner <mbx0009 at yahoo.com>
> wrote:
>
>> Hi, I ran a few MD runs with identical input files (the SAME tpr file;
>> mdp included below) on the same computer with gmx 2018 and observed
>> rather large performance variations (~50%), as in:
>>
>> grep Performance */mcz1.log
>> 7/mcz1.log:Performance:        98.510        0.244
>> 7d/mcz1.log:Performance:      140.733        0.171
>> 7e/mcz1.log:Performance:      115.586        0.208
>> 7f/mcz1.log:Performance:      139.197        0.172
>>
>> Turns out the load balancing done at the beginning gives quite
>> different results:
>> grep "optimal pme grid" */mcz1.log
>> 7/mcz1.log:              optimal pme grid 32 32 28, coulomb cutoff 1.394
>> 7d/mcz1.log:              optimal pme grid 36 36 32, coulomb cutoff 1.239
>> 7e/mcz1.log:              optimal pme grid 25 24 24, coulomb cutoff 1.784
>> 7f/mcz1.log:              optimal pme grid 40 36 32, coulomb cutoff 1.200
>>
>> Next I tried tune_pme, as in:
>>
>> gmx tune_pme -mdrun 'gmx mdrun' -nt 6 -ntmpi 1 -ntomp 6 -pin on
>> -pinoffset 0 -s mcz1.tpr -pmefft cpu -pinstride 1 -r 10
>>
>> which didn't work ... in one of the log files it says:
>>
>> Fatal error:
>> PME tuning was still active when attempting to reset mdrun counters at
>> step 1500. Try resetting counters later in the run, e.g. with gmx mdrun
>> -resetstep.
>>
>> I found no documentation regarding "-resetstep" ...
>>
>> I could of course optimize the PME grid manually (see the snippet below),
>> but since I plan to run a large number of jobs with different systems and
>> sizes this would be a lot of work, and if possible I'd like to avoid that.
>> Is there any way to ask gmx to perform more tests at the beginning of the
>> run when optimizing the PME grid? Or is using "-notunepme -dlb yes" an
>> option, and does the latter require a concurrent optimization of the
>> domain decomposition? If so, how is this done?
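>>
>> (By "manually" I mean pinning one of the grids the tuner found in the
>> mdp, e.g. with the values copied from the 7d run above, untested:
>>
>>   fourier-nx = 36
>>   fourier-ny = 36
>>   fourier-nz = 32
>>   rcoulomb   = 1.239
>>
>> and then running with -notunepme.)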
>> Thanks for any help!
>> Michael
>>
>>
>> mdp:
>> integrator        = md
>> dt                = 0.001
>> nsteps            = 500000
>> comm-grps         = System
>> ;
>> nstxout           = 0
>> nstvout           = 0
>> nstfout           = 0
>> nstlog            = 1000
>> nstenergy         = 1000
>> ;
>> nstlist                  = 40
>> ns_type                  = grid
>> pbc                      = xyz
>> rlist                    = 1.2
>> cutoff-scheme            = Verlet
>> ;
>> coulombtype              = PME
>> rcoulomb                 = 1.2
>> vdw_type                 = cut-off
>> rvdw                     = 1.2
>> ;
>> constraints              = none
>> ;
>> tcoupl             = v-rescale
>> tau-t              = 0.1
>> ref-t              = 300
>> tc-grps            = System
>> ;
>> pcoupl             = berendsen
>> pcoupltype         = anisotropic
>> tau-p              = 2.0
>> compressibility    = 4.5e-5 4.5e-5 4.5e-5 0 0 0
>> ref-p              = 1 1 1 0 0 0
>> ;
>> annealing          = single
>> annealing-npoints  = 2
>> annealing-time     = 0 500
>> annealing-temp     = 500 480
>>
>>