[gmx-developers] parallel make problems

Szilárd Páll szilard.pall at cbr.su.se
Wed Jun 19 00:20:57 CEST 2013


On Tue, Jun 18, 2013 at 7:15 PM, Mark Abraham <mark.j.abraham at gmail.com> wrote:
>
>
>
> On Tue, Jun 18, 2013 at 6:47 PM, Roland Schulz <roland at utk.edu> wrote:
>>
>> On Tue, Jun 18, 2013 at 11:05 AM, Mark Abraham <mark.j.abraham at gmail.com>
>> wrote:
>> >
>> >
>> >
>> > On Mon, Jun 17, 2013 at 7:59 PM, Roland Schulz <roland at utk.edu> wrote:
>> >>
>> >> On Mon, Jun 17, 2013 at 1:10 PM, Mark Abraham
>> >> <mark.j.abraham at gmail.com>
>> >> wrote:
>> >> >
>> >> >
>> >> >
>> >> > On Mon, Jun 17, 2013 at 6:16 PM, Manuel Nuno Melo <m.n.melo at rug.nl>
>> >> > wrote:
>> >> >>
>> >> >> Hi,
>> >> >>
>> >> >> I have also had linking problems when making in parallel. In my
>> >> >> case they could be traced back to the option to let GMX
>> >> >> download/build its own fftw (-DGMX_BUILD_OWN_FFTW=ON).
>> >> >>
>> >> >> It seems that only one of make's threads starts building fftw,
>> >> >> while the others go ahead building/linking GMX. Since fftw
>> >> >> compilation is not ready by the time it is needed, GMX linking is
>> >> >> botched.
>> >> >
>> >> >
>> >> > Yes, Rossen first showed this to me. I don't know if the
>> >> > underlying issue is that the dependency cannot be described
>> >> > properly, or that we're not doing it properly. If it's a problem,
>> >> > people are welcome to contribute a fix! :-)
>> >> It was working in https://gerrit.gromacs.org/#/c/1675/12. You then
>> >> changed how the dependency works in patch set 13. You never replied to
>> >> Christoph's comment asking why this was changed (at least I can't find
>> >> a reply). Do you remember?
>> >
>> >
>> > I couldn't remember, but gerrit can - I never published a series of
>> > responses I made back then, sorry. Now published at
>> > https://gerrit.gromacs.org/1675
>> >
>> >> Otherwise I can change it back as 12 did it
>> >> and it should work again.
>> >
>> >
>> > It might do, but as I said in those secret drafts, the form of patch 12
>> > doesn't work on cmake 2.8.7 because of a bug there in
>> > add_library(...GLOBAL) (and I suspect it is probably too global
>> > anyway, but this probably does no harm?).
>> >
>> > So I'm still not sure there's a convenient solution that works in all
>> > cases. Compromising the smooth running of a parallel make for someone
>> > downloading FFTW seems like the lowest-impact problem of the set we
>> > could choose to have.
>>
>> Probably true. It just doesn't give new users a good first impression
>> of us.
>> I think we should also consider for the future whether we really want
>> to support ~11 unmaintained versions of cmake (including for all our
>> optional features). Downloading cmake is no big deal; they provide
>> binaries. And the cmake developers fix bugs only in the most recent
>> version. So it seems odd that we try to maintain workarounds for the
>> last ~11 versions, all of which are unmaintained by the cmake
>> developers. That seems like it is going to stay a really annoying
>> maintenance task.
>
>
> True. Now that we've shown it is a PITA for the developers to work around a
> handful of known issues with various 2.8.x point releases of CMake, it
> sounds reasonable to me that we pick a late-model CMake 2.8.x as the
> requirement for GROMACS 5. That could open the door to an alternative
> implementation for self-built FFTW.

I agree that it is annoying having to work around CMake issues. At the
same time, I think it would be a rather "user-unfriendly" move to
require a very late version of CMake. As a user, it is fair to expect
that building GROMACS is as hassle-free as possible. Hence, having to
download CMake as a first step of the installation process will
probably lead to many users not updating (early) to newer GROMACS
versions.

In general, the issue is that CMake development introduces
behaviour-affecting changes in minor versions. This can easily break
fragile code in the build system. I don't have a good suggestion to
overcome such problems, but I think that the choice of required
minimum CMake version should depend on which versions the major Linux
distributions provide.
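
As an illustration of the ordering problem described in the quoted
report below (GMX being linked before the self-built FFTW is ready),
here is a minimal CMake sketch. It is not the actual GROMACS build
code, and not necessarily what either patch set did. The target names
fftw_external and demo, the source file main.c and the
FFTW_TARBALL_URL variable are all hypothetical. The only point is that
linking against a file path gives make no ordering information, while
an explicit add_dependencies() does.

```cmake
# Sketch only: all names below are hypothetical, not taken from GROMACS.
cmake_minimum_required(VERSION 2.8.8)
project(fftw_ordering_sketch C)

include(ExternalProject)

# Build FFTW as a separate step. FFTW_TARBALL_URL is assumed to be set
# by the user, e.g. with -DFFTW_TARBALL_URL=... on the cmake command line.
ExternalProject_Add(fftw_external
    URL "${FFTW_TARBALL_URL}"
    CONFIGURE_COMMAND <SOURCE_DIR>/configure --prefix=<INSTALL_DIR>
    BUILD_COMMAND $(MAKE)
    INSTALL_COMMAND $(MAKE) install)

# Look up where the external project installs its files.
ExternalProject_Get_Property(fftw_external INSTALL_DIR)

add_executable(demo main.c)

# A plain library path tells the linker what to use, but it does not
# tell make that the file has to be built first.
target_link_libraries(demo "${INSTALL_DIR}/lib/libfftw3.a" m)

# Explicit target-level ordering: fftw_external must finish before any
# part of demo is built, also under make -j.
add_dependencies(demo fftw_external)
```

Without the last line, a parallel make is free to start linking demo
while the FFTW build is still running, which matches the symptom
reported in the thread.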

--
Szilárd

> Mark
>
>>
>> Roland
>>
>>
>> >
>> > Mark
>> >
>> >>
>> >> Roland
>> >>
>> >> >
>> >> > Mark
>> >> >
>> >> >>
>> >> >>
>> >> >> Cheers,
>> >> >> Manel
>> >> >>
>> >> >> > Hi,
>> >> >> >
>> >> >> > I too suspect filesystem issues or clock skews. I think I tested
>> >> >> > make -j and make -j 12. The cluster is currently down for
>> >> >> > maintenance, so I can't inspect the details at the moment.
>> >> >> >
>> >> >> > On 5 Apr 2013, at 13:14, Alexey Shvetsov
>> >> >> > <alexxy at omrb.pnpi.spb.ru> wrote:
>> >> >> >
>> >> >> > > Hi Erik
>> >> >> > >
>> >> >> > > What is the underlying filesystem on this cluster? If it is slow
>> >> >> > > or overloaded somehow, it may lead to parallel make issues. It
>> >> >> > > may also be related to the make version (some old versions may
>> >> >> > > exhibit such behavior). How many make threads did you use? I
>> >> >> > > tried with make -j64 and it builds fine with recent cmake
>> >> >> > > (2.8.10) and make (3.82).
>> >> >> > >
>> >> >> > >
>> >> >> > > On 5 April 2013 at 11:55:27, Erik Marklund wrote:
>> >> >> > >> Hi,
>> >> >> > >>
>> >> >> > >> Building gromacs 4.6.1 failed whenever I issued parallel make,
>> >> >> > >> i.e. make -j. I reported this to the cluster admins since I
>> >> >> > >> had never seen such behaviour before from gromacs' side, and
>> >> >> > >> here's their reply. I can't tell whether gromacs is at fault
>> >> >> > >> or the cluster.
>> >> >> > >>
>> >> >> > >> Erik
>> >> >> > >>
>> >> >> > >> Begin forwarded message:
>> >> >> > >>> Hi,
>> >> >> > >>>
>> >> >> > >>>> I was compiling gromacs on tintin's login node the other day
>> >> >> > >>>> and it seems that parallel make, i.e. make -j, doesn't work
>> >> >> > >>>> on tintin. I got linker errors that never showed up when make
>> >> >> > >>>> was run serially. I've never encountered such behaviour
>> >> >> > >>>> before.
>> >> >> > >>>
>> >> >> > >>> Without any more information (or being able to look for actual
>> >> >> > >>> files right now), I'd guess this is a problem with the
>> >> >> > >>> makefiles rather than the actual make. It seems somewhat
>> >> >> > >>> unexpected that CMake would create makefiles that aren't safe
>> >> >> > >>> for parallel building, but it does seem the most likely
>> >> >> > >>> culprit (assuming it doesn't let developers add rules directly
>> >> >> > >>> to the makefile to work around problems, I don't remember if
>> >> >> > >>> that's possible).
>> >> >> > >>>
>> >> >> > >>> That you only see the problem on tintin can likely be
>> >> >> > >>> explained by timing or other non-deterministic factors.
>> >> >> > > --
>> >> >> > > Best Regards,
>> >> >> > > Alexey 'Alexxy' Shvetsov
>> >> >>
>> >> >> --
>> >> >> gmx-developers mailing list
>> >> >> gmx-developers at gromacs.org
>> >> >> http://lists.gromacs.org/mailman/listinfo/gmx-developers
>> >> >> Please don't post (un)subscribe requests to the list. Use the
>> >> >> www interface or send it to gmx-developers-request at gromacs.org.
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
>> >> 865-241-1537, ORNL PO BOX 2008 MS6309
>> >
>> >
>>
>>
>>
>
>
>


