[gmx-developers] parallel make problems

Mark Abraham mark.j.abraham at gmail.com
Tue Jun 18 17:05:08 CEST 2013


On Mon, Jun 17, 2013 at 7:59 PM, Roland Schulz <roland at utk.edu> wrote:

> On Mon, Jun 17, 2013 at 1:10 PM, Mark Abraham <mark.j.abraham at gmail.com>
> wrote:
> >
> >
> >
> > On Mon, Jun 17, 2013 at 6:16 PM, Manuel Nuno Melo <m.n.melo at rug.nl>
> wrote:
> >>
> >> Hi,
> >>
> >> I have also had linking problems when making in parallel. In my case
> >> they could be traced back to the option to let GMX download/build its
> >> own fftw (-DGMX_BUILD_OWN_FFTW=ON).
> >>
> >> It seems that only one of make's threads starts building fftw, while the
> >> others go ahead building/linking GMX. Since fftw compilation is not
> >> ready by the time it is needed, GMX linking is botched.
> >
> >
> > Yes, Rossen first showed this to me. I don't know if the underlying
> > issue is that the dependency cannot be described properly, or that
> > we're not doing it properly. If it's a problem, people are welcome to
> > contribute a fix! :-)
>
> It was working in https://gerrit.gromacs.org/#/c/1675/12. You then
> changed how the dependency works in patch set 13. You never replied to
> Christoph's comment asking why this was changed (at least I can't find a
> reply). Do you remember?


I couldn't remember, but gerrit can - I never published a series of
responses I made back then, sorry. Now published at
https://gerrit.gromacs.org/1675

> Otherwise I can change it back as 12 did it
> and it should work again.
>

It might, but as I said in those unpublished drafts, the form used in patch
set 12 doesn't work on cmake 2.8.7 because of a bug there in
add_library(...GLOBAL) (I also suspect it is too global anyway, though that
probably does no harm).

So I'm still not sure there's a convenient solution that works in all
cases. Of the problems we could choose to have, compromising the smooth
running of a parallel make for someone downloading FFTW seems like the
lowest-impact one.
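
For readers following along, here is a minimal sketch of the kind of
build-order dependency being discussed. It is an illustration only, not the
code from patch set 12 or 13. The target names (fftw_external, gmx_sketch),
the FFTW_SOURCE_URL variable, the sketch.c source file and the configure
options are all assumptions made for the example. Whether this form is
sufficient for the real GROMACS build, or works on every supported cmake
version, is not established in this thread.

```cmake
cmake_minimum_required(VERSION 2.8)
project(fftw_dependency_sketch C)

include(ExternalProject)

# Hypothetical placeholder: point this at an FFTW source tarball.
set(FFTW_SOURCE_URL "" CACHE STRING "URL or path of an FFTW source tarball")

# Download, configure, build and install FFTW as a step of this build.
# The build and install commands are left at their defaults.
ExternalProject_Add(fftw_external
    URL "${FFTW_SOURCE_URL}"
    CONFIGURE_COMMAND <SOURCE_DIR>/configure --prefix=<INSTALL_DIR> --enable-float)

# Look up where the external project installs its headers and libraries.
ExternalProject_Get_Property(fftw_external INSTALL_DIR)

# Hypothetical consumer standing in for code that links against FFTW.
# The library path assumes a "lib" directory; some systems use "lib64".
include_directories(${INSTALL_DIR}/include)
add_executable(gmx_sketch sketch.c)
target_link_libraries(gmx_sketch ${INSTALL_DIR}/lib/libfftw3f.a)

# The explicit ordering edge. Without it, make -j is free to compile and
# link gmx_sketch before fftw_external has finished installing.
add_dependencies(gmx_sketch fftw_external)
```

The point of the sketch is the last line: the generated makefiles only
serialize the two targets if the dependency is stated at the target level.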

Mark


> Roland
>
> >
> > Mark
> >
> >>
> >>
> >> Cheers,
> >> Manel
> >>
> >> > Hi,
> >> >
> >> > I too suspect filesystem issues or clock skews. I think I tested
> >> > make -j and make -j 12. The cluster is currently down for
> >> > maintenance, so I can't inspect the details at the moment.
> >> >
> >> > On 5 Apr 2013, at 13:14, Alexey Shvetsov <alexxy at omrb.pnpi.spb.ru>
> >> > wrote:
> >> >
> >> > > Hi Erik
> >> > >
> >> > > What is the underlying filesystem on this cluster? If it is slow or
> >> > > overloaded somehow, it may lead to parallel make issues. It may also
> >> > > be related to the make version (some old versions may exhibit such
> >> > > behavior). How many make threads did you use? I tried make -j64 and
> >> > > it builds fine with recent cmake (2.8.10) and make (3.82).
> >> > >
> >> > >
> >> > > On 5 April 2013 at 11:55:27, Erik Marklund wrote:
> >> > >> Hi,
> >> > >>
> >> > >> Building gromacs 4.6.1 failed whenever I issued parallel make, i.e.
> >> > >> make -j. I reported this to the cluster admins since I had never
> >> > >> seen such behaviour before from gromacs' side, and here's their
> >> > >> reply. I can't tell whether gromacs is at fault or the cluster.
> >> > >>
> >> > >> Erik
> >> > >>
> >> > >> Begin forwarded message:
> >> > >>> Hi,
> >> > >>>
> >> > >>>> I was compiling gromacs on tintin's login node the other day and
> >> > >>>> it seems that parallel make, i.e. make -j, doesn't work on tintin.
> >> > >>>> I got linker errors that never showed up when make was run
> >> > >>>> serially. I've never encountered such behaviour before.
> >> > >>>
> >> > >>> Without any more information (or being able to look for actual
> >> > >>> files right now), I'd guess this is a problem with the makefiles
> >> > >>> rather than the actual make. It seems somewhat unexpected that
> >> > >>> CMake would create makefiles that aren't safe for parallel
> >> > >>> building, but it does seem the most likely culprit (assuming it
> >> > >>> doesn't let developers add rules directly to the makefile to work
> >> > >>> around problems, I don't remember if that's possible).
> >> > >>>
> >> > >>> That you only see the problem on tintin can likely be explained by
> >> > >>> timing or other non-deterministic factors.
> >> > > --
> >> > > Best Regards,
> >> > > Alexey 'Alexxy' Shvetsov
> >>
> >> --
> >> gmx-developers mailing list
> >> gmx-developers at gromacs.org
> >> http://lists.gromacs.org/mailman/listinfo/gmx-developers
> >> Please don't post (un)subscribe requests to the list. Use the
> >> www interface or send it to gmx-developers-request at gromacs.org.
> >
> >
>
>
>
> --
> ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
> 865-241-1537, ORNL PO BOX 2008 MS6309