[gmx-users] High load imbalance: 31.8%
Nash, Anthony
a.nash at ucl.ac.uk
Thu Aug 20 18:12:59 CEST 2015
Hi Szilárd
Thanks for all of that advice. I'll have to take a lot of this up with the
cluster service staff. This is a new cluster I won a grant to use, so it
is not my usual platform, which typically yields an imbalance of somewhere
around 0.8% to 2%.
Thanks again
Anthony
On 20/08/2015 16:52, "Szilárd Páll" <pall.szilard at gmail.com> wrote:
>Hi,
>
>You're not pinning threads, and it seems that you're running on a large
>SMP machine! Assuming the 512 threads reported (line 91) is correct,
>that's a 32-socket SMP machine, perhaps an SGI UV? In any case, Xeon
>E5-4xxx is typically deployed in 4-8 socket installations, so your 8
>threads will be floating around on a number of CPUs, which ruins your
>performance - and likely contributes to the varying and large load
>imbalance.
>
>My advice:
>- don't ignore notes/warnings issued by mdrun (line 366, should be on the
>standard output too); we put quite some thought into spamming users only
>when relevant :)
>- pin mdrun and/or its threads either with "-pin on" (and -pinoffset if
>needed) or with whatever tools your admins provide/recommend
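>For example, a minimal sketch (the binary names, thread counts and the
>-deffnm value below are illustrative assumptions, not taken from your job
>script):
>
>  # thread-MPI build: 8 ranks, 1 OpenMP thread each, pinned to cores
>  gmx mdrun -ntmpi 8 -ntomp 1 -pin on -pinoffset 0 -deffnm umb_3_umb
>
>  # MPI build: launch the 8 ranks externally and let mdrun pin them
>  mpirun -np 8 gmx_mpi mdrun -ntomp 1 -pin on -deffnm umb_3_umb
>
>If the batch system or MPI launcher on that machine already binds
>processes to cores (e.g. via numactl), use that instead of -pin on so the
>two mechanisms don't fight each other.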
>
>[Extras: consider using FFTW even with the Intel compilers; it's often
>faster for our small FFTs than MKL. GCC instead of the Intel compiler is
>often faster too.]
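>A hedged configure sketch for that (the build directory and compiler
>names below are assumptions, not your actual setup):
>
>  # configure with GCC and let GROMACS download and build its own FFTW
>  CC=gcc CXX=g++ cmake .. -DGMX_FFT_LIBRARY=fftw3 -DGMX_BUILD_OWN_FFTW=ON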
>
>Fixing the above issues should not only reduce the imbalance but most
>likely also gain you quite a bit of simulation performance! Let us know if
>it worked.
>
>Cheers,
>
>--
>Szilárd
>
>On Thu, Aug 20, 2015 at 5:08 PM, Nash, Anthony <a.nash at ucl.ac.uk> wrote:
>
>> Hi Mark,
>>
>> Many thanks for looking into this.
>>
>> One of the log files (the job hasn’t finished running) is here:
>> https://www.dropbox.com/s/zwrro54yni2uxtn/umb_3_umb.log?dl=0
>>
>> The system is a soluble collagenase in water with a collagen substrate
>> and two zinc co-factors. There are 287562 atoms in the system.
>>
>> Please let me know if you need to know anything else. Thanks!
>>
>> Anthony
>>
>>
>>
>>
>>
>> On 20/08/2015 11:39, "Mark Abraham" <mark.j.abraham at gmail.com> wrote:
>>
>> >Hi,
>> >
>> >In cases like this, it's good to describe what's in your simulation,
>> >and share the full .log file on a file-sharing service, so we can see
>> >both the things mdrun reports early and late.
>> >
>> >Mark
>> >
>> >On Thu, Aug 20, 2015 at 8:22 AM Nash, Anthony <a.nash at ucl.ac.uk> wrote:
>> >
>> >> Hi all,
>> >>
>> >> I appear to have a very high load imbalance on some of my runs:
>> >> values ranging from approx. 7% up to 31.8%, with a reported vol
>> >> min/aver of around 0.6 (I haven't found one under half yet).
>> >>
>> >> When I look through the .log file at the start of the run I see:
>> >>
>> >> Initializing Domain Decomposition on 8 ranks
>> >> Dynamic load balancing: auto
>> >> Will sort the charge groups at every domain (re)decomposition
>> >> Initial maximum inter charge-group distances:
>> >> two-body bonded interactions: 0.514 nm, LJ-14, atoms 3116 3123
>> >> multi-body bonded interactions: 0.429 nm, Proper Dih., atoms 3116 3123
>> >> Minimum cell size due to bonded interactions: 0.472 nm
>> >> Maximum distance for 5 constraints, at 120 deg. angles, all-trans: 0.862 nm
>> >> Estimated maximum distance required for P-LINCS: 0.862 nm
>> >> This distance will limit the DD cell size, you can override this with -rcon
>> >> Using 0 separate PME ranks, as there are too few total ranks for efficient splitting
>> >> Scaling the initial minimum size with 1/0.8 (option -dds) = 1.25
>> >> Optimizing the DD grid for 8 cells with a minimum initial size of 1.077 nm
>> >> The maximum allowed number of cells is: X 12 Y 12 Z 12
>> >> Domain decomposition grid 4 x 2 x 1, separate PME ranks 0
>> >> PME domain decomposition: 4 x 2 x 1
>> >> Domain decomposition rank 0, coordinates 0 0 0
>> >> Using 8 MPI processes
>> >> Using 1 OpenMP thread per MPI process
>> >>
>> >>
>> >>
>> >>
>> >> Having a quick look through the documentation, I see that I should
>> >> consider using the Verlet cut-off scheme (which I am) and adjusting
>> >> the number of PME nodes, the cut-off, and the PME grid spacing. Would
>> >> this simply be a case of throwing more cores at the simulation, or
>> >> must I play around with the P-LINCS parameters?
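>> >> (One hedged option for exploring the separate-PME-rank / cut-off /
>> >> grid trade-off is gmx tune_pme; the .tpr name below is a placeholder
>> >> and it assumes an MPI-enabled mdrun:
>> >>
>> >>   gmx tune_pme -np 8 -s umb_3_umb.tpr -mdrun 'gmx_mpi mdrun'
>> >>
>> >> It benchmarks different numbers of separate PME ranks, and optionally
>> >> rescaled cut-off/grid settings, and reports the fastest combination.)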
>> >>
>> >> Thanks
>> >> Anthony
>> >>