[gmx-developers] libgromacs vs libgromacs_core?

Sat Sep 28 21:23:19 CEST 2013

On Sat, Sep 28, 2013 at 8:23 PM, Szilárd Páll <szilard.pall at cbr.su.se> wrote:
> On Sat, Sep 28, 2013 at 8:00 PM, Mark Abraham <mark.j.abraham at gmail.com> wrote:
>> On Sat, Sep 28, 2013 at 7:17 PM, Szilárd Páll <szilard.pall at cbr.su.se> wrote:
>>> On Sat, Sep 28, 2013 at 6:55 PM, Mark Abraham <mark.j.abraham at gmail.com> wrote:
>>>> On Sat, Sep 28, 2013 at 5:56 PM, Szilárd Páll <szilard.pall at cbr.su.se> wrote:
>>>>> Hi,
>>>>>
>>>>> I would like to bring up the topic of keeping mdrun lean and as
>>>>> dependency-free as possible (for HPC/exotic platforms).
>>>>>
>>>>> This has been discussed earlier, but I'm not sure what the decision
>>>>> was (AFAIK there wasn't any) and what the current direction is. The
>>>>> two possibilities discussed:
>>>>> - splitting libgromacs into libgromacs and libgromacs_core (where
>>>>> mdrun depends only on the latter which is as portable and lightweight
>>>>> as possible);
>>>>> - more of a workaround solution: allowing a stripped-down version of
>>>>> libgromacs to be built that mdrun links against; this libgromacs may
>>>>> still be better called libgromacs_core to allow conflict-free
>>>>> installation of e.g. full non-MPI build + reduced MPI-enabled mdrun
>>>>> build in the same location.
>>>>
>>>> What external dependencies do we have that are optional for mdrun and
>>>> cannot be turned off at CMake level?
>>>
>>> I'm not familiar in great detail with the current dependencies of the
>>> 5.0 code, but actually I have mis-phrased the problem statement.
>>
>> This has been discussed at great length over years on this list and on
>> Redmine. The summary material on the web is a pretty accurate state of
>> affairs: we require C++98; stuff in POSIX is probably fine, but be
>> aware there are non-POSIX platforms. For example, we want smarter
>> pointers than C++98, and so we bundle a few pieces of Boost that make
>> that possible, unless the compiler in use offers C++11 support.
>>
>>> It is not just a matter of what _libraries_ mdrun depends on _now_,
>>> but in general, what will go into libgromacs that could potentially
>>> set back porting mdrun to a new platform in (almost) as little time as
>>> in the days of the C99 (pre-v5.0) code. This includes external library
>>> dependencies as well as language features and constructs which may
>>> prevent porting or efficient use on some hardware.
>>
>> This, too, has been discussed a lot. There's a list of allowed C++
>> constructs on the website.
>
> Sure, I know. However, AFAIK, this list has even grown throughout the
> years,

So take a look at the history of that web page ;-)

> and I don't know whether anyone has actually checked whether
> all accepted features are implemented

Easy. If they don't support a 15-year-old standard, we're not yet interested.

> and work well even in
> non-mainstream compilers.

Harder - but I don't see articles titled "C++98 is unusable for HPC".
Until we write some code, we can't even know. But we are months away
from being able to write any C++ code for mdrun, so how about we do
some cleaning up and then see how it goes? ;-)

>>> Additionally, one can also look at the issue the other way around:
>>> while maintaining the portability of mdrun should be a major goal,
>>> having to keep this restriction in mind when writing tools code is
>>> both a burden as well as a risk - it takes a C++/HPC expert to know
>>> what language features of say C++11 are "too much" for non-mainstream
>>> compilers (and we have a trend of squeezing in more and more allowed
>>> stuff). Hence, IMO it would be advantageous to not have to scrutinize
>>> every bit of code for a high level of portability that goes into
>>> libgromacs because, I think, that's the sure way to make the situation
>>> disadvantageous for both mdrun and tools.
>>
>> Right, so we are being very conservative. If we are ever able to
>> separate modules such that we can relax that requirement (e.g. for
>
> Why the "if"?

Because there's months of work before it's even worth attempting.

> The least one could do is to allow building a light
> libgromacs that does not act as a kitchen sink into which tons of code
> gets compiled in just because tool X needs it.

Look at what's in typedefs.h. There's a reason Teemu re-iterated
earlier this year that one of the first orders of business is breaking
apart those dependency nests. It pulls in 30 header files. It is
explicitly included by 100 header files, and 300 code files. Until
that clique is smaller, I expect there is no relevant subset, much
less that there is one that can implement the bulk of either mdrun or
the tools library.

If the objective is being able to build some kind of mdrun-lite, I
don't think I'm interested. Look at the fragmentation of features in
sander and pmemd. Or PD and DD. Or GPU and non-GPU. A clean MD loop
with access to the full feature set is much more useful for everyone
than an MD loop that only does plain vanilla things, because everybody
will want some special thing for the simple version, and the union of
those sets will be most of what mdrun does. There are already MD codes
that are good for being an algorithm testbed.

> Correct me if I'm wrong, but to me it seems that this is more of a
> matter of prioritizing such a "feature" (=portability). Of course, it

When we can identify a non-portable non-optional aspect, we can talk
further about it :-)

> may take a bit of effort to design all the C++ stuff such that core
> mdrun features are easily separable, but it's better to do it now than
> when the lack of such design considerations may start to hurt.

... which is why I'm starting by breaking apart data structures to
split dependencies. There are plenty of good books on this topic on my
bookshelf ;-). Only when you have independent atoms can you put them
together so that they can be parts of different molecules.
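The idea of splitting a monolithic data structure into independent pieces can be sketched as follows. This is only an illustration: the names Box, Positions, SimulationState, boxVolume, atomCount and numberDensity are hypothetical and are not taken from the GROMACS source. Each low-level function depends only on the small type it needs, and the composite type is assembled from those pieces afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical small, independent data structures ("atoms").
struct Box
{
    double lengths[3]; // edge lengths of a rectangular box
};

struct Positions
{
    std::vector<double> x; // flat array: x0, y0, z0, x1, y1, z1, ...
};

// Depends only on Box, so code using it need not know about Positions.
double boxVolume(const Box& box)
{
    return box.lengths[0] * box.lengths[1] * box.lengths[2];
}

// Depends only on Positions, so code using it need not know about Box.
std::size_t atomCount(const Positions& positions)
{
    assert(positions.x.size() % 3 == 0);
    return positions.x.size() / 3;
}

// A composite ("molecule") built from the independent pieces.
struct SimulationState
{
    Box       box;
    Positions positions;
};

// Only the code that genuinely needs both pieces depends on the composite.
double numberDensity(const SimulationState& state)
{
    return static_cast<double>(atomCount(state.positions)) / boxVolume(state.box);
}
```

In a real code base each struct would live in its own header, so that a file needing only Box would not pull in everything else.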

>> ease of writing tools code) then we can consider doing that. But for a
>> non-trivial period of time, the question of whether a compiler can do
>> which parts of C++11 efficiently or portably is moot. The first
>> fully-compliant C++11 compilers are fresh on the market, and GROMACS
>> is pretty unmodular.
>
> C++11 was just an example. AFAIR even including boost code was out of
> the question when the C++ discussion first started, but it still got
> embraced.

And assuming that subset of Boost is suitably licensed, bundled,
requires only C++98, and not template metaprogramming, what is the
problem with that?

>>> While it may seem that I'm putting too much emphasis on this, I know
>>> concrete examples of projects which took a disproportionate amount of
>>> time to get running on some platform simply because of dependencies or
>>> the use of C++ and its features.
>>
>> Sure. We are trying to learn from those lessons. What do you think we
>> need to do more?
>
> Unfortunately, I don't have strong enough experience with C++ in HPC
> and probably many of us don't. Hence, I can't write up a list of 10
> commandments to follow - especially since there is no such list, at

Right. So we adopt a conservative policy and see how things go.
Talking about hypothetical problems we might have achieves less than
nothing, though.

> least not a universal one. Just as an example, an acquaintance of mine
> who has worked in HPC with both C and C++ for a decade or so said that
> he'd never use most C++ features in performance-oriented code like
> virtual functions, most of what templates offer, exceptions, just to
> name a few I can remember.

By that argument, we should go and write machine code. People are
great at saying things like "gee those virtual functions are slow"
when their real problem is that they were using RTTI in their inner
loop, or something. Back in the day, people hated not using global
variables. People hated not using goto. People hated using
encapsulated objects in C. But then they learned that this made them
more productive, even if their code happened to be slightly less
efficient. And people's time is much more important.

Certainly doing something like using a templated container to replace
arrays of rvec for possible use in kernels will require extensive run-
and compile-time performance testing. So it is no longer on the table
for 5.0.

We're already using function pointers to call our kernels. I expect
that judicious use of virtual functions will give us much better
readability, reusability, maintainability, etc. If that means people
can deploy state of the art algorithms quicker than 5 years after they
are published, that is a lot more important than a few percent of
runtime performance. More important still is that these questions are
unanswerable until there is a lot more plain-C cleanup completed.
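As a rough illustration of the comparison above, here is a minimal C++98-style sketch of the same dispatch done through a function pointer and through a virtual function. All names (KernelFn, sumKernel, sumOfSquaresKernel, IKernel, runViaPointer, runViaInterface) are hypothetical and the "kernels" are toy reductions, not actual GROMACS kernels. Both forms cost one indirect call per dispatch, made once outside the inner loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel signature for function-pointer dispatch.
typedef double (*KernelFn)(const std::vector<double>& values);

double sumKernel(const std::vector<double>& values)
{
    double total = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i)
    {
        total += values[i];
    }
    return total;
}

double sumOfSquaresKernel(const std::vector<double>& values)
{
    double total = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i)
    {
        total += values[i] * values[i];
    }
    return total;
}

// Dispatch through a function pointer: one indirect call per kernel launch.
double runViaPointer(KernelFn kernel, const std::vector<double>& values)
{
    assert(kernel != NULL);
    return kernel(values);
}

// The same choice expressed as an abstract interface.
class IKernel
{
public:
    virtual ~IKernel() {}
    virtual double run(const std::vector<double>& values) const = 0;
};

class SumKernel : public IKernel
{
public:
    virtual double run(const std::vector<double>& values) const
    {
        return sumKernel(values);
    }
};

class SumOfSquaresKernel : public IKernel
{
public:
    virtual double run(const std::vector<double>& values) const
    {
        return sumOfSquaresKernel(values);
    }
};

// Dispatch through a virtual function: also one indirect call per launch.
double runViaInterface(const IKernel& kernel, const std::vector<double>& values)
{
    return kernel.run(values);
}
```

Whether the virtual form costs anything measurable in a real MD loop can only be settled by benchmarking; this sketch makes no performance claim.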

> Of course, there is room to analyse every feature on a case-by-case
> and per-project basis, but I personally have a hard time providing
> much concrete and useful input due to my lack of extensive experience.

Join the club, except I've been reading and planning for months. I
still have much to learn by doing, however! :-)

Mark