[gmx-developers] PME tuning-related hang [was Re: Gromacs 2016.3 (and earlier) freezing up.]

Szilárd Páll pall.szilard at gmail.com
Tue Sep 19 22:39:59 CEST 2017


That sounds extreme -- although not unfamiliar (the same issue comes up
in GPU-accelerated runs).

What cut-off and grid spacing? And what node count: is this at low,
medium, or high parallelization?
--
Szilárd


On Tue, Sep 19, 2017 at 6:08 PM, John Eblen <jeblen at acm.org> wrote:
> Hi
>
> On KNL, I achieve the best performance by using the maximum number of PME
> nodes allowed, so I figured removing the upper bound might allow for even
> better performance.
>
>
> John
>
> On Tue, Sep 19, 2017 at 5:50 AM, Szilárd Páll <pall.szilard at gmail.com>
> wrote:
>>
>> Hi,
>>
>> Why would you want to increase the MPI rank count in PME? Is it to
>> compensate for the thread scaling being worse than in the PP ranks?
>>
>> It might be more worthwhile to improve PME multi-threading rather than
>> allowing a higher rank count.
>>
>> --
>> Szilárd
>>
>>
>> On Tue, Sep 19, 2017 at 10:01 AM, Berk Hess <hess at kth.se> wrote:
>> > On 2017-09-18 18:34, John Eblen wrote:
>> >
>> > Hi Szilárd
>> >
>> > These runs used 2M huge pages. I will file a redmine shortly.
>> >
>> > On a related topic, how difficult would it be to modify GROMACS to
>> > support > 50% PME nodes?
>> >
>> > That's not so hard, but I see little benefit, since then the MPI
>> > communication is not reduced much compared to all ranks doing PME.
>> >
>> > Berk
>> >
>> >
>> >
>> > John
>> >
>> > On Fri, Sep 15, 2017 at 6:37 PM, Szilárd Páll <pall.szilard at gmail.com>
>> > wrote:
>> >>
>> >> Hi John,
>> >>
>> >> Thanks for diagnosing the issue!
>> >>
>> >> We have been aware of this behavior; it has been partly intentional (as
>> >> we re-scan grids after the first pass at least once more), and it has
>> >> also simply been considered "not too big of a deal" given that in
>> >> general mdrun has a very low memory footprint. However, it seems that,
>> >> at least on this particular machine, our assumption was wrong. What is
>> >> the page size on Cori KNL?
>> >>
>> >> Can you please file a redmine with your observations?
>> >>
>> >> Thanks,
>> >> --
>> >> Szilárd
>> >>
>> >>
>> >> On Fri, Sep 15, 2017 at 8:25 PM, John Eblen <jeblen at acm.org> wrote:
>> >> > This issue appears not to be a GROMACS problem so much as a problem
>> >> > with "huge pages" that is triggered by PME tuning. PME tuning creates
>> >> > a large data structure for every cutoff that it tries, which is
>> >> > replicated on each PME node. These data structures are not freed
>> >> > during tuning, so memory usage expands. Normally it is still too small
>> >> > to cause problems. With huge pages, however, I get errors from
>> >> > "libhugetlbfs" and very slow runs if more than about five cutoffs are
>> >> > attempted.
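>> >> >
>> >> > To make the pattern concrete, here is a minimal standalone sketch (my
>> >> > own illustration, not the actual GROMACS tuning code; all names are
>> >> > invented) of what effectively happens when one grid set per tried
>> >> > cutoff is kept alive on every PME rank:
>> >> >
>> >> >   // Illustration only: one grid allocation per tried cutoff, none freed
>> >> >   // until tuning finishes, so per-rank memory grows with every attempt.
>> >> >   #include <cstddef>
>> >> >   #include <cstdio>
>> >> >   #include <vector>
>> >> >
>> >> >   int main()
>> >> >   {
>> >> >       const int grids[] = { 128, 112, 100, 84, 96, 100 };  // sizes from the log below
>> >> >       std::vector<std::vector<float>> keptSetups;  // stays alive across attempts
>> >> >       std::size_t totalBytes = 0;
>> >> >       for (int n : grids)
>> >> >       {
>> >> >           keptSetups.emplace_back(static_cast<std::size_t>(n) * n * n);  // new grid for this cutoff
>> >> >           totalBytes += keptSetups.back().size() * sizeof(float);
>> >> >           std::printf("grid %3d^3 -> cumulative %.1f MiB per rank\n",
>> >> >                       n, totalBytes / (1024.0 * 1024.0));
>> >> >       }
>> >> >   }
>> >> >
>> >> > A single 128^3 grid of 4-byte reals is already ~8 MiB; with several
>> >> > such setups kept per rank and many ranks per KNL node, this growth is
>> >> > plausibly what exhausts the 2M huge-page pool and triggers the
>> >> > libhugetlbfs warnings below.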
>> >> >
>> >> > Sample output on NERSC Cori KNL with 32 nodes. Input system size is
>> >> > 248,101 atoms.
>> >> >
>> >> > step 0
>> >> > step 100, remaining wall clock time:    24 s
>> >> > step  140: timed with pme grid 128 128 128, coulomb cutoff 1.200: 66.2 M-cycles
>> >> > step  210: timed with pme grid 112 112 112, coulomb cutoff 1.336: 69.6 M-cycles
>> >> > step  280: timed with pme grid 100 100 100, coulomb cutoff 1.496: 63.6 M-cycles
>> >> > step  350: timed with pme grid 84 84 84, coulomb cutoff 1.781: 85.9 M-cycles
>> >> > step  420: timed with pme grid 96 96 96, coulomb cutoff 1.559: 68.8 M-cycles
>> >> > step  490: timed with pme grid 100 100 100, coulomb cutoff 1.496: 68.3 M-cycles
>> >> > libhugetlbfs [nid08887:140420]: WARNING: New heap segment map at 0x10001200000 failed: Cannot allocate memory
>> >> > libhugetlbfs [nid08881:97968]: WARNING: New heap segment map at 0x10001200000 failed: Cannot allocate memory
>> >> > libhugetlbfs [nid08881:97978]: WARNING: New heap segment map at 0x10001200000 failed: Cannot allocate memory
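>> >> >
>> >> > (A side note on those numbers, just as a back-of-envelope observation:
>> >> > the product of the Coulomb cutoff and the grid dimension stays roughly
>> >> > constant across the attempts, 1.200 x 128 = 153.6, 1.336 x 112 = 149.6,
>> >> > 1.496 x 100 = 149.6, 1.781 x 84 = 149.6, 1.559 x 96 = 149.7, i.e. the
>> >> > tuner scales the grid spacing in proportion to the cutoff so that the
>> >> > Ewald accuracy stays the same while work is shifted from the PME mesh
>> >> > to the short-range pair part.)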
>> >> >
>> >> > Szilárd, to answer your questions: this is the Verlet scheme. The
>> >> > problem happens during tuning, and no problems occur if -notunepme is
>> >> > used. In fact, the best performance thus far has been with 50% PME
>> >> > nodes, using huge pages, and '-notunepme'.
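>> >> >
>> >> > (For concreteness, an illustrative invocation of that setup, with the
>> >> > rank counts made up and the Cori launcher details glossed over, would
>> >> > be something like: srun -n 128 gmx_mpi mdrun -npme 64 -notunepme, i.e.
>> >> > half of the MPI ranks dedicated to PME and the grid/cutoff tuning
>> >> > switched off.)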
>> >> >
>> >> >
>> >> > John
>> >> >
>> >> > On Wed, Sep 13, 2017 at 6:20 AM, Szilárd Páll
>> >> > <pall.szilard at gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> Forking the discussion, as we've now learned more about the issue Åke
>> >> >> is reporting and it is rather dissimilar.
>> >> >>
>> >> >> On Mon, Sep 11, 2017 at 8:09 PM, John Eblen <jeblen at acm.org> wrote:
>> >> >> > Hi Szilárd
>> >> >> >
>> >> >> > No, I'm not using the group scheme.
>> >> >>
>> >> >>  $ grep -i 'cutoff-scheme' md.log
>> >> >>    cutoff-scheme                  = Verlet
>> >> >>
>> >> >> > The problem seems similar because:
>> >> >> >
>> >> >> > 1) Deadlocks and very slow runs can be hard to distinguish.
>> >> >> > 2) Since Mark mentioned it, I assume he believes PME tuning is a
>> >> >> >    possible cause, which is also the cause in my situation.
>> >> >>
>> >> >> Does that mean you tested with "-notunepme" and the excessive memory
>> >> >> usage could not be reproduced? Did the memory usage increase only
>> >> >> during the tuning or did it keep increasing after the tuning
>> >> >> completed?
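>> >> >>
>> >> >> (A crude way to distinguish the two, assuming shell access to a
>> >> >> compute node, is to sample VmRSS from /proc/<pid>/status of one mdrun
>> >> >> rank every few seconds and see whether it plateaus once the tuning
>> >> >> reports its final grid in the log or keeps climbing afterwards.)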
>> >> >>
>> >> >> > 3) Åke may be experiencing higher-than-normal memory usage as far
>> >> >> >    as I know. Not sure how you know otherwise.
>> >> >> > 4) By "successful," I assume you mean the tuning had completed. That
>> >> >> >    doesn't mean, though, that the tuning could not be creating
>> >> >> >    conditions that cause the problem, like an excessively high
>> >> >> >    cutoff.
>> >> >>
>> >> >> Sure. However, it's unlikely that the tuning creates conditions under
>> >> >> which the run proceeds after the initial tuning phase and keeps
>> >> >> allocating memory (which is more likely to be the source of issues).
>> >> >>
>> >> >> I suggest first ruling out the bug I linked, and if that's not the
>> >> >> culprit, we can have a closer look.
>> >> >>
>> >> >> Cheers,
>> >> >> --
>> >> >> Szilárd
>> >> >>
>> >> >> >
>> >> >> >
>> >> >> > John
>> >> >> >
>> >> >> > On Mon, Sep 11, 2017 at 1:09 PM, Szilárd Páll
>> >> >> > <pall.szilard at gmail.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> John,
>> >> >> >>
>> >> >> >> In what way do you think your problem is similar? Åke seems to be
>> >> >> >> experiencing a deadlock after successful PME tuning, much later
>> >> >> >> during
>> >> >> >> the run, but no excessive memory usage.
>> >> >> >>
>> >> >> >> Do you happen to be using the group scheme with 2016.x (release
>> >> >> >> code)?
>> >> >> >>
>> >> >> >> Your issue sounds more like it could be related to the excessive
>> >> >> >> tuning bug with the group scheme fixed quite a few months ago, but
>> >> >> >> the fix is yet to be released (https://redmine.gromacs.org/issues/2200).
>> >> >> >>
>> >> >> >> Cheers,
>> >> >> >> --
>> >> >> >> Szilárd
>> >> >> >>
>> >> >> >>
>> >> >> >> On Mon, Sep 11, 2017 at 6:50 PM, John Eblen <jeblen at acm.org>
>> >> >> >> wrote:
>> >> >> >> > Hi
>> >> >> >> >
>> >> >> >> > I'm having a similar problem that is related to PME tuning. When
>> >> >> >> > it is enabled, GROMACS often, but not always, slows to a crawl
>> >> >> >> > and uses excessive amounts of memory. Using "huge pages" and
>> >> >> >> > setting a high number of PME processes seems to exacerbate the
>> >> >> >> > problem.
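>> >> >> >> >
>> >> >> >> > (For reference, and as an assumption on my part about the exact
>> >> >> >> > setup: huge pages on a Cray are typically enabled by loading a
>> >> >> >> > craype-hugepages2M module at link time, or generically at run
>> >> >> >> > time via LD_PRELOAD=libhugetlbfs.so with HUGETLB_MORECORE=yes.)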
>> >> >> >> >
>> >> >> >> > Also, occurrences of this problem seem to correlate with how high
>> >> >> >> > the tuning raises the cutoff value.
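>> >> >> >> >
>> >> >> >> > (That correlation is not surprising on the PP side: raising the
>> >> >> >> > cutoff from, say, 1.2 to 1.8 nm grows the pair-interaction volume
>> >> >> >> > by (1.8/1.2)^3, roughly 3.4x, so the pair lists and the
>> >> >> >> > short-range work per rank grow accordingly.)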
>> >> >> >> >
>> >> >> >> > Mark, can you give us more information on the problems with PME
>> >> >> >> > tuning? Is there a redmine?
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > Thanks
>> >> >> >> > John
>> >> >> >> >
>> >> >> >> > On Mon, Sep 11, 2017 at 10:53 AM, Mark Abraham
>> >> >> >> > <mark.j.abraham at gmail.com>
>> >> >> >> > wrote:
>> >> >> >> >>
>> >> >> >> >> Hi,
>> >> >> >> >>
>> >> >> >> >> Thanks. Was PME tuning active? Does it reproduce if that is
>> >> >> >> >> disabled? Is the PME tuning still active? How many steps have
>> >> >> >> >> taken place (at least as reported in the log file but ideally
>> >> >> >> >> from processes)?
>> >> >> >> >>
>> >> >> >> >> Mark
>> >> >> >> >>
>> >> >> >> >> On Mon, Sep 11, 2017 at 4:42 PM Åke Sandgren
>> >> >> >> >> <ake.sandgren at hpc2n.umu.se>
>> >> >> >> >> wrote:
>> >> >> >> >>>
>> >> >> >> >>> My debugger run finally got to the lockup.
>> >> >> >> >>>
>> >> >> >> >>> All processes are waiting on various MPI operations.
>> >> >> >> >>>
>> >> >> >> >>> Attached a stack dump of all 56 tasks.
>> >> >> >> >>>
>> >> >> >> >>> I'll keep the debug session running for a while in case
>> >> >> >> >>> anyone
>> >> >> >> >>> wants
>> >> >> >> >>> some more detailed data.
>> >> >> >> >>> This is a RelWithDebInfo build, though, so not everything is
>> >> >> >> >>> available.
>> >> >> >> >>>
>> >> >> >> >>> On 09/08/2017 11:28 AM, Berk Hess wrote:
>> >> >> >> >>> > But you should be able to get some (limited) information by
>> >> >> >> >>> > attaching a debugger to an already running process with a
>> >> >> >> >>> > release build.
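>> >> >> >> >>> >
>> >> >> >> >>> > (Something along these lines, assuming gdb is available on
>> >> >> >> >>> > the compute nodes: gdb -p <mdrun pid> -batch -ex 'thread
>> >> >> >> >>> > apply all bt', run once per rank, gives backtraces even from
>> >> >> >> >>> > a release binary, although many frames will be optimized
>> >> >> >> >>> > away or inlined.)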
>> >> >> >> >>> >
>> >> >> >> >>> > If you plan on compiling and running a new case, use a
>> >> >> >> >>> > release + debug symbols build. That should run as fast as a
>> >> >> >> >>> > release build.
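>> >> >> >> >>> >
>> >> >> >> >>> > (Concretely, that is a build configured with
>> >> >> >> >>> > cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo, which keeps compiler
>> >> >> >> >>> > optimization on and adds debug symbols.)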
>> >> >> >> >>> >
>> >> >> >> >>> > Cheers,
>> >> >> >> >>> >
>> >> >> >> >>> > Berk
>> >> >> >> >>> >
>> >> >> >> >>> > On 2017-09-08 11:23, Åke Sandgren wrote:
>> >> >> >> >>> >> We have at least one case that, when run over 2 or more
>> >> >> >> >>> >> nodes, quite often (always) hangs, i.e. there is no more
>> >> >> >> >>> >> output in md.log or otherwise while mdrun still consumes CPU
>> >> >> >> >>> >> time. It takes a random amount of time before it happens,
>> >> >> >> >>> >> like 1-3 days.
>> >> >> >> >>> >>
>> >> >> >> >>> >> The case can be shared if someone else wants to investigate.
>> >> >> >> >>> >> I'm planning to run it in the debugger to be able to break
>> >> >> >> >>> >> and look at states when it happens, but since it takes so
>> >> >> >> >>> >> long with the production build it is not something I'm
>> >> >> >> >>> >> looking forward to.
>> >> >> >> >>> >>
>> >> >> >> >>> >> On 09/08/2017 11:13 AM, Berk Hess wrote:
>> >> >> >> >>> >>> Hi,
>> >> >> >> >>> >>>
>> >> >> >> >>> >>> We are far behind schedule for the 2017 release. We are
>> >> >> >> >>> >>> working hard on it, but I don't think we can promise a date
>> >> >> >> >>> >>> yet.
>> >> >> >> >>> >>>
>> >> >> >> >>> >>> We have a 2016.4 release planned for this week (might slip
>> >> >> >> >>> >>> to next week). But if you can give us enough details to
>> >> >> >> >>> >>> track down your hanging issue, we might be able to fix it
>> >> >> >> >>> >>> in 2016.4.
>> >> >> >> >>> >
>> >> >> >> >>>
>> >> >> >> >>> --
>> >> >> >> >>> Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
>> >> >> >> >>> Internet: ake at hpc2n.umu.se   Phone: +46 90 7866134 Fax: +46
>> >> >> >> >>> 90-580
>> >> >> >> >>> 14
>> >> >> >> >>> Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se

