[gmx-developers] parallel make problems

Mark Abraham mark.j.abraham at gmail.com
Wed Jun 19 22:14:47 CEST 2013


On Wed, Jun 19, 2013 at 8:58 PM, Szilárd Páll <szilard.pall at cbr.su.se> wrote:
> On Wed, Jun 19, 2013 at 5:16 PM, Mark Abraham <mark.j.abraham at gmail.com> wrote:
>>
>>
>>
>> On Wed, Jun 19, 2013 at 3:48 PM, Szilárd Páll <szilard.pall at cbr.su.se>
>> wrote:
>>>
>>> On Wed, Jun 19, 2013 at 2:19 PM, Mark Abraham <mark.j.abraham at gmail.com>
>>> wrote:
>>> >
>>> >
>>> >
>>> > On Wed, Jun 19, 2013 at 12:20 AM, Szilárd Páll <szilard.pall at cbr.su.se>
>>> > wrote:
>>> >>
>>> >> On Tue, Jun 18, 2013 at 7:15 PM, Mark Abraham
>>> >> <mark.j.abraham at gmail.com>
>>> >> wrote:
>>> >> >
>>> >> >
>>> >> >
>>> >> > On Tue, Jun 18, 2013 at 6:47 PM, Roland Schulz <roland at utk.edu>
>>> >> > wrote:
>>> >> >>
>>> >> >> On Tue, Jun 18, 2013 at 11:05 AM, Mark Abraham
>>> >> >> <mark.j.abraham at gmail.com>
>>> >> >> wrote:
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> > On Mon, Jun 17, 2013 at 7:59 PM, Roland Schulz <roland at utk.edu>
>>> >> >> > wrote:
>>> >> >> >>
>>> >> >> >> On Mon, Jun 17, 2013 at 1:10 PM, Mark Abraham
>>> >> >> >> <mark.j.abraham at gmail.com>
>>> >> >> >> wrote:
>>> >> >> >> >
>>> >> >> >> >
>>> >> >> >> >
>>> >> >> >> > On Mon, Jun 17, 2013 at 6:16 PM, Manuel Nuno Melo
>>> >> >> >> > <m.n.melo at rug.nl>
>>> >> >> >> > wrote:
>>> >> >> >> >>
>>> >> >> >> >> Hi,
>>> >> >> >> >>
>>> >> >> >> >> I have also had linking problems when making in parallel.
>>> >> >> >> >> In my case they could be traced back to the option to let
>>> >> >> >> >> GMX download/build its own fftw (-DGMX_BUILD_OWN_FFTW=ON).
>>> >> >> >> >>
>>> >> >> >> >> It seems that only one of make's threads starts building
>>> >> >> >> >> fftw, while the others go ahead building/linking GMX. Since
>>> >> >> >> >> fftw compilation is not ready by the time it is needed, GMX
>>> >> >> >> >> linking is botched.
>>> >> >> >> >
>>> >> >> >> >
>>> >> >> >> > Yes, Rossen first showed this to me. I don't know if the
>>> >> >> >> > underlying issue is that the dependency cannot be described
>>> >> >> >> > properly, or that we're not doing it properly. If it's a
>>> >> >> >> > problem, people are welcome to contribute a fix! :-)
>>> >> >> >> It was working in https://gerrit.gromacs.org/#/c/1675/12. You
>>> >> >> >> then changed how the dependency works in patch set 13. You
>>> >> >> >> never replied to Christoph's comment about why this was changed
>>> >> >> >> (at least I can't find a reply). Do you remember?
>>> >> >> >
>>> >> >> >
>>> >> >> > I couldn't remember, but gerrit can - I never published a series
>>> >> >> > of responses I made back then, sorry. Now published at
>>> >> >> > https://gerrit.gromacs.org/1675
>>> >> >> >
>>> >> >> >> Otherwise I can change it back as 12 did it
>>> >> >> >> and it should work again.
>>> >> >> >
>>> >> >> >
>>> >> >> > It might do, but as I said in those secret drafts the form of
>>> >> >> > patch 12 doesn't work on cmake 2.8.7 because of a bug there in
>>> >> >> > add_library(...GLOBAL) (and I suspect is probably too global,
>>> >> >> > anyway, but this probably does no harm?).
>>> >> >> >
>>> >> >> > So I'm still not sure there's a convenient solution that works
>>> >> >> > in all cases. Compromising the smooth running of a parallel make
>>> >> >> > for someone downloading FFTW seems like the most low-impact
>>> >> >> > problem of the set we could choose to have.
>>> >> >>
>>> >> >> Probably true. Just doesn't give a good first impression of us to
>>> >> >> new users.
>>> >> >> I think we should also consider for the future whether we really
>>> >> >> want
>>> >> >> to support ~11 unmaintained versions of cmake (including for all our
>>> >> >> optional features). Downloading cmake is no big deal. They have
>>> >> >> binaries to download. And cmake doesn't fix any version but for the
>>> >> >> most recent version. So it seems odd that we try to maintain
>>> >> >> workarounds for the last ~11 versions which are all unmaintained by
>>> >> >> the cmake developers. That seems like it is going to stay a really
>>> >> >> annoying maintenance task.
>>> >> >
>>> >> >
>>> >> > True. Now that we've shown it is a PITA for the developers to work
>>> >> > around a handful of known issues with various 2.8.x point releases
>>> >> > of CMake, it sounds reasonable to me that we pick a late-model CMake
>>> >> > 2.8.x as the requirement for GROMACS 5. That could open the door to
>>> >> > an alternative implementation for self-built FFTW.
>>> >>
>>> >> I agree that it is annoying having to work around CMake issues. At the
>>> >> same time, I think it would be a rather "user-unfriendly" move to
>>> >> require a very late version of CMake. As a user, it is fair to expect
>>> >> that building GROMACS is as hassle-free as possible.
>>> >
>>> >
>>> > Right. But I think all of the CMake issues have arisen while trying to
>>> > make building GROMACS as hassle-free as possible. Here are just some of
>>> > the known issues:
>>> > 1) we need something in 2.8.2 in order to download the regression tests
>>> > via CMake
>>> > 2) 2.8.10 updated FindCUDA and changed the behaviour with respect to
>>> > setting the host compiler for nvcc
>>> > 3) 2.8.8 provides CMAKE_<LANG>_COMPILER_VERSION, for which we currently
>>> > provide duplicate functionality in at least two places; this is mostly
>>> > used for supporting tests for known versions of compilers that have
>>> > missing or broken functionality
>>> > 4) we sometimes fall back on finding MPI, which didn't work well before
>>> > 2.8.5 (and find_file() has a minor bug that was fixed in 2.8.10)
>>> > 5) can't check an MD5 sum of the downloaded FFTW library before 2.8.3
>>> > 6) one way to link the downloaded FFTW library doesn't work on 2.8.7
>>> > Those are all things that we are trying to do to make a hassle-free
>>> > GROMACS build experience. In some of the above cases we issue a fatal
>>> > error and suggest a CMake upgrade anyway.
>>> >
>>> > The principle has led the GROMACS devs to do a whole pile of "extra"
>>> > work implementing those checks, and future work reading and maintaining
>>> > that code. This is not sustainable. For GROMACS 4.6, we compromised on
>>> > CMake 2.8 between developer convenience and the possibility users would
>>> > not need to install cmake, which we did before realising we'd be
>>> > encouraging about half the users to update to a real compiler. Getting
>>> > cmake at the same time is not a big deal. Requiring 2.8.10 for GROMACS 5
>>> > lets us get rid of a lot of nonsense and go back to writing MD code.
>>>
>>> I do agree with you to a fairly large extent. However, most of the
>>> above workarounds are meant to support non-essential features, so
>>> either a warning (e.g. "skipping consistency check because CMake
>>> version <2.8.X") or in other cases a fatal error (e.g. "Insufficient
>>> CMake version for feature X") would be just fine instead of a
>>> *hardcoded*, very recent required version which would prevent users from
>>> building even if they don't use the feature which requires a late
>>> CMake version. Regarding points 2 and 3, those are good examples of
>>> minor feature additions which we could have chosen to ignore and stick
>>> to the legacy behavior, but we did not.
>>
>>
>> We currently offer the user the choice of "update CMake and have all the
>> convenient features if you want them" or "don't update CMake 2.8.x and maybe
>> have to make decisions later." Different people will have different
>> preferences there! :-)
>>
>>>
>>> Additionally, having seen the development style of CMake, coming
>>> across issues which need workarounds on our side seems unavoidable. As
>>> it happened with 4.6, issues will come up during development or
>>> (beta/RC) testing of a release. With the above reasoning, we'd have to
>>> bump the required version quite a few times just to avoid
>>> complications. Instead, I prefer to fix a moderately
>>> conservative required version during the development phase of a
>>> certain release and stick to that.
>>
>>
>> Nobody is suggesting having a rolling requirement. I think we're talking
>> about what choice to make for our next major release. So far the leading
>> suggestions on functionality seem to be 2.8.8 or 2.8.10.
>>
>>> >
>>> >> Hence, having to
>>> >> download CMake as a first step of the installation process will
>>> >> probably lead to many users not updating (early) to newer GROMACS
>>> >> versions.
>>> >
>>> >
>>> > I'd much rather spend devs' time writing documentation of their awesome
>>> > new features, so that users know they can do cool science with the new
>>> > version if they get a new CMake version (which they might re-use
>>> > anyway). The needs of the few and the needs of the many are being
>>> > traded off here. It's not like we're charging money selling
>>> > shrink-wrapped software. :-)
>>> >
>>> >> In general, the issue is the way CMake development introduces changes
>>> >> in minor versions which affect behaviour. This can easily break
>>> >> fragile code in the build system. I don't have a good suggestion to
>>> >> overcome such problems, but I think that the choice of required
>>> >> minimum CMake version should depend on which versions the major
>>> >> Linux distributions provide.
>>> >
>>> >
>>> > RHEL ships with a 2.6 (which drives CentOS and Scientific Linux). EPEL
>>> > provides a 2.8.9, though.
>>> > Current Ubuntu LTS (precise) has 2.8.7, but the next looks like it would
>>> > have at least 2.8.11. Current stable has 2.8.10.
>>> > Fedora 19 has 2.8.10.
>>> > macports has 2.8.10
>>>
>>> 12.04 LTS will be around at least until 2015 and many users will stick
>>> to it (like we do on all our compute and development machines).
>>>
>>> Don't get me wrong, I'm not advocating sticking to the currently
>>> required v2.8.0, but I'd prefer to be more conservative and require a
>>> version at least 1.5-2 years old, e.g. 2.8.8 or 2.8.7 (to cater for
>>> 12.04 LTS).
>>
>>
>> 2.8.10 was out in October 2012. By the time a hypothetical 2.8.10
>> requirement hits users (February 2014) it will be 15 months old. That seems
>> acceptable to me, but so far there's only a minor relevant difference known
>> (FindCUDA update, which we could back-port if we wanted). Having the
>
> The FindCUDA change adds a variable for the host compiler, which is more
> of a regression than an overlapping feature for GROMACS, as it
> provides functionality we implemented ourselves, but in an
> inferior manner (it sets the host compiler without doing at least a
> compatibility check).
>
>> reliable compiler version-checking functionality in 2.8.8 seems to me much
>> more important than supporting a particular distro's LTS with 2.8.7.
>
> Reliable is IMHO an overstatement. Seeing how buggy many CMake
> implementations are (e.g. detection of compiler warnings/errors in
> try_compile() just to name one really poorly implemented

Neither can we take the time to do all this ourselves. For example,
our get_compiler_version() and get_compiler_info() only work on a
compiler that supports -dumpversion, so our CMake usage is broken in
various ways with different versions of xlc. Wanting functionality
from CMake 2.8.8 is hardly "brand new". ;-) The compiler-version
support from CMake 2.8.8 doesn't have to be perfect, but since our
current code is versioned accordingly, we already think it is
useful...
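
A minimal sketch of what relying on the 2.8.8 variables could look like.
The project name, the GNU 4.4 threshold and the message texts are made up
for illustration and are not taken from the GROMACS build system:

```cmake
cmake_minimum_required(VERSION 2.8.8)
project(compiler_version_sketch C)

# CMAKE_C_COMPILER_ID and CMAKE_C_COMPILER_VERSION are filled in by CMake
# itself, so nothing here depends on the compiler accepting -dumpversion.
# The version variable may be empty if CMake could not detect it, so guard.
if(CMAKE_C_COMPILER_VERSION)
    message(STATUS "C compiler: ${CMAKE_C_COMPILER_ID} ${CMAKE_C_COMPILER_VERSION}")
    # Hypothetical threshold, only to show a version-based check.
    if(CMAKE_C_COMPILER_ID STREQUAL "GNU" AND
       CMAKE_C_COMPILER_VERSION VERSION_LESS "4.4")
        message(WARNING "GNU compiler older than the (made-up) 4.4 threshold")
    endif()
else()
    message(STATUS "CMake could not determine the C compiler version")
endif()
```

Any compiler-specific special cases would still sit on top of a check like
this; the sketch only covers the generic version comparison.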

> compiler-related feature), I would not jump to the conclusion that
> it's best to fully rely on brand new CMake features.

You're jumping to conclusions - bumping the required CMake version to
2.8.8 does not mean we are going to "fully rely" on whatever
compiler version detection is in 2.8.8.

> Additionally, I'm not sure which two places duplicate this functionality, but:
> - when it comes to compiler version checks, we'll still need the
> special Clang checks as the clang version number itself does not mean
> much;
> - the code that queries compiler information for the -version header
> needs the *version string*, which is more than the version number that
> CMake provides (i.e. "Ubuntu/Linaro 4.7.2-11lucid3" != "4.7.2")
> - the current "legacy" version detection code does *not* replicate
> much functionality; it only provides ~20 lines of simple, "best
> effort" code which has not changed since I wrote it in early 2012,

Great. If we bump the CMake requirement to >=2.8.8 and still need to
post-process what we get from CMake 2.8.8, that's still a win over
writing, documenting and maintaining code that can also generate
something else (that sometimes works on some platforms) to go into the
post-processing, just to support distros/users who haven't gotten
around to updating CMake lately.
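
As a rough illustration of that kind of post-processing, the sketch below
pulls a numeric version out of a longer version string with
string(REGEX MATCH). The variable names are hypothetical, and the input is
just the example string quoted above rather than real compiler output:

```cmake
# Hypothetical input: the example version string from the quoted message.
set(FULL_VERSION_STRING "Ubuntu/Linaro 4.7.2-11lucid3")

# Keep the first dotted run of digits (major.minor, optionally .patch).
string(REGEX MATCH "[0-9]+\\.[0-9]+(\\.[0-9]+)?" NUMERIC_VERSION
       "${FULL_VERSION_STRING}")

message(STATUS "full string:     ${FULL_VERSION_STRING}")
message(STATUS "numeric version: ${NUMERIC_VERSION}")
```

Saved to a file, this can be run as a script with cmake -P, without
configuring a project.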

>
> To conclude, I think such decisions should be thought through
> carefully and discussed in a developer meeting. With the above

Not unless you organize it. Our last sets of meetings of people in
Stockholm simply added a pile of things to my already too-long to-do
list. If exporting the workload of updating CMake to users helps us
keep our user-convenience CMake features, while doing core developer
activities better, then I think that's a no-brainer.

> reasoning, we could end up requiring not only a recent CMake version,
> but recent compilers (gcc 4.8 will be 1 year old by 5.0 ;), libraries,
> etc. With the increasing number of dependencies and requirements
> becoming more strict, we run the risk of not only annoying the hell
> out of users, but also potentially excluding moderately old and exotic
> platforms which often do not have/support the latest and greatest of
> everything.

If installing a new version of CMake is enough to annoy the hell out
of users, then they can have their money back ;-) Installing CMake from
source is a 10 minute job for someone who's done it before. Installing
from a distro package is even easier.

> I personally do not agree with such a direction and this also
> conflicts with the principles that have been guiding decisions the
> last few years (and as far as I understand quite far back in the
> history of GROMACS).

When we used assembly kernels and only cared about x86 and a system
that supported autotools, we could claim we supported any C compiler.
Those days are gone. Principles that went with the design decisions
that led to those kernels need to be reconsidered. There is no value
in us tweaking the hell out of code, and letting someone compile with
gcc 4.3 out of sheer laziness.

> Of course, a decision could be made that instead
> of doing our best to make future GROMACS versions work with all but
> ancient *essential* software dependencies (right now that is build
> system and compilers), users will have to use an older GROMACS version.

We already require up-to-date compilers for maximum performance, so I
think that decision is already made. In an ideal world we can offer
infinite backward-compatible support, but we are not close to that
world, and nobody is paying us money to stay close to it. Anybody
buying a new compute cluster, putting a RHEL6-based distro on it and
expecting modern software to Just Work Best on it is living in fantasy
land. For example, if SSE2 support caused us any pain, I'd be ripping
it out tomorrow. Desktop distros all have a usefully up-to-date CMake.
Users who have old hardware do not have to use comparably old
software. Where's the problem?

Mark

>>
>> Mark
>>
>>> Cheers,
>>> --
>>> Szilard
>>>
>>> > So it seems to me that requiring CMake 2.8.10 for a Feb 2014 release is
>>> > quite reasonable.
>>> >
>>> > Mark
>>> >
>>> > --
>>> > gmx-developers mailing list
>>> > gmx-developers at gromacs.org
>>> > http://lists.gromacs.org/mailman/listinfo/gmx-developers
>>> > Please don't post (un)subscribe requests to the list. Use the
>>> > www interface or send it to gmx-developers-request at gromacs.org.


