[gmx-developers] parallel make problems

Mark Abraham mark.j.abraham at gmail.com
Tue Jun 18 19:15:01 CEST 2013


On Tue, Jun 18, 2013 at 6:47 PM, Roland Schulz <roland at utk.edu> wrote:

> On Tue, Jun 18, 2013 at 11:05 AM, Mark Abraham <mark.j.abraham at gmail.com>
> wrote:
> >
> >
> >
> > On Mon, Jun 17, 2013 at 7:59 PM, Roland Schulz <roland at utk.edu> wrote:
> >>
> >> On Mon, Jun 17, 2013 at 1:10 PM, Mark Abraham <mark.j.abraham at gmail.com>
> >> wrote:
> >> >
> >> >
> >> >
> >> > On Mon, Jun 17, 2013 at 6:16 PM, Manuel Nuno Melo <m.n.melo at rug.nl>
> >> > wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> I have also had linking problems when making in parallel. In my case
> >> >> they could be traced back to the option to let GMX download/build its
> >> >> own fftw (-DGMX_BUILD_OWN_FFTW=ON).
> >> >>
> >> >> It seems that only one of make's threads starts building fftw, while
> >> >> the others go ahead building/linking GMX. Since fftw compilation is
> >> >> not ready by the time it is needed, GMX linking is botched.
> >> >
> >> >
> >> > Yes, Rossen first showed this to me. I don't know if the underlying
> >> > issue is that the dependency cannot be described properly, or that
> >> > we're not doing it properly. If it's a problem, people are welcome to
> >> > contribute a fix! :-)
> >>
> >> It was working in https://gerrit.gromacs.org/#/c/1675/12. You then
> >> changed how the dependency works in patch set 13. You never replied to
> >> Christoph's comment asking why this was changed (at least I can't find
> >> a reply). Do you remember?
> >
> >
> > I couldn't remember, but gerrit can - I never published a series of
> > responses I made back then, sorry. Now published at
> > https://gerrit.gromacs.org/1675
> >
> >> Otherwise I can change it back to what patch set 12 did, and it
> >> should work again.
> >
> >
> > It might do, but as I said in those secret drafts, the form of patch 12
> > doesn't work on cmake 2.8.7 because of a bug there in
> > add_library(...GLOBAL) (and I suspect it is probably too global anyway,
> > but this probably does no harm?).
> >
> > So I'm still not sure there's a convenient solution that works in all
> > cases. Compromising the smooth running of a parallel make for someone
> > downloading FFTW seems like the lowest-impact problem of the set we
> > could choose to have.
>
> Probably true. It just doesn't give new users a good first impression of
> us.
> I think we should also consider for the future whether we really want
> to support ~11 unmaintained versions of cmake (including for all our
> optional features). Downloading cmake is no big deal; they provide
> binaries. And the cmake developers don't release fixes for any version
> but the most recent one. So it seems odd that we try to maintain
> workarounds for the last ~11 versions, all of which are unmaintained by
> the cmake developers. That looks set to remain a really annoying
> maintenance task.
>

True. Now that we've shown it is a PITA for the developers to work around a
handful of known issues with various 2.8.x point releases of CMake, it
sounds reasonable to me that we pick a late-model CMake 2.8.x as the
requirement for GROMACS 5. That could open the door to an alternative
implementation for self-built FFTW.
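
For illustration only, a minimal sketch of how such an implementation might
order the FFTW build ahead of the link step, assuming a late 2.8.x CMake and
the Makefile generator. The target names (fftwBuild, sketch_fftw,
sketch_consumer), the FFTW URL and version, and the configure flags are
hypothetical placeholders, not what the GROMACS build system actually does:

```cmake
cmake_minimum_required(VERSION 2.8.8)
project(fftw_order_sketch C)

include(ExternalProject)

# Hypothetical external build of FFTW; URL, version and flags are assumptions.
ExternalProject_Add(fftwBuild
    URL http://www.fftw.org/fftw-3.3.3.tar.gz
    CONFIGURE_COMMAND <SOURCE_DIR>/configure --prefix=<INSTALL_DIR> --enable-float
    )
ExternalProject_Get_Property(fftwBuild INSTALL_DIR)

# Describe the library that the external build will produce.
add_library(sketch_fftw STATIC IMPORTED)
set_target_properties(sketch_fftw PROPERTIES
    IMPORTED_LOCATION ${INSTALL_DIR}/lib/libfftw3f.a)

# Stand-in for a GROMACS target that links against FFTW.
file(WRITE ${CMAKE_CURRENT_BINARY_DIR}/main.c "int main(void) { return 0; }\n")
add_executable(sketch_consumer ${CMAKE_CURRENT_BINARY_DIR}/main.c)
target_link_libraries(sketch_consumer sketch_fftw)

# Explicit target-level ordering: nothing in sketch_consumer is built
# until the fftwBuild target has completed, even under make -j.
add_dependencies(sketch_consumer fftwBuild)
```

The explicit add_dependencies call is the point of the sketch: it states the
ordering on the consuming target itself, so a parallel make should not reach
the link step before the FFTW library exists.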

Mark


> Roland
>
>
> >
> > Mark
> >
> >>
> >> Roland
> >>
> >> >
> >> > Mark
> >> >
> >> >>
> >> >>
> >> >> Cheers,
> >> >> Manel
> >> >>
> >> >> > Hi,
> >> >> >
> >> >> > I too suspect filesystem issues or clock skews. I think I tested
> >> >> > make -j and make -j 12. The cluster is currently down for
> >> >> > maintenance, so I can't inspect the details at the moment.
> >> >> >
> >> >> > On 5 Apr 2013, at 13:14, Alexey Shvetsov <alexxy at omrb.pnpi.spb.ru>
> >> >> > wrote:
> >> >> >
> >> >> > > Hi Erik
> >> >> > >
> >> >> > > What is the underlying filesystem on this cluster? If it is slow or
> >> >> > > overloaded, that may lead to parallel make issues. It may also be
> >> >> > > related to the make version (some old versions may exhibit such
> >> >> > > behavior). How many make threads did you use? I tried make -j64 and
> >> >> > > it builds fine with recent cmake (2.8.10) and make (3.82).
> >> >> > >
> >> >> > >
> >> >> > > On 5 April 2013 at 11:55:27, Erik Marklund wrote:
> >> >> > >> Hi,
> >> >> > >>
> >> >> > >> Building gromacs 4.6.1 failed whenever I issued parallel make,
> >> >> > >> i.e. make -j. I reported this to the cluster admins since I had
> >> >> > >> never seen such behaviour before from gromacs' side, and here's
> >> >> > >> their reply. I can't tell whether gromacs is at fault or the
> >> >> > >> cluster.
> >> >> > >>
> >> >> > >> Erik
> >> >> > >>
> >> >> > >> Begin forwarded message:
> >> >> > >>> Hi,
> >> >> > >>>
> >> >> > >>>> I was compiling gromacs on tintin's login node the other day
> >> >> > >>>> and it seems that parallel make, i.e. make -j, doesn't work on
> >> >> > >>>> tintin. I got linker errors that never showed up when make was
> >> >> > >>>> run serially. I've never encountered such behaviour before.
> >> >> > >>>
> >> >> > >>> Without any more information (or being able to look for actual
> >> >> > >>> files right now), I'd guess this is a problem with the makefiles
> >> >> > >>> rather than the actual make. It seems somewhat unexpected that
> >> >> > >>> CMake would create makefiles that aren't safe for parallel
> >> >> > >>> building, but it does seem the most likely culprit (assuming it
> >> >> > >>> doesn't let developers add rules directly to the makefile to
> >> >> > >>> work around problems, I don't remember if that's possible).
> >> >> > >>>
> >> >> > >>> That you only see the problem on tintin can likely be explained
> >> >> > >>> by timing or other non-deterministic factors.
> >> >> > > --
> >> >> > > Best Regards,
> >> >> > > Alexey 'Alexxy' Shvetsov
> >> >>
> >> >> --
> >> >> gmx-developers mailing list
> >> >> gmx-developers at gromacs.org
> >> >> http://lists.gromacs.org/mailman/listinfo/gmx-developers
> >> >> Please don't post (un)subscribe requests to the list. Use the
> >> >> www interface or send it to gmx-developers-request at gromacs.org.
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
> >> 865-241-1537, ORNL PO BOX 2008 MS6309
> >
> >
>
>
>
>