[gmx-developers] plans for mdrun features to deprecate in GROMACS 5.0

Szilárd Páll szilard.pall at cbr.su.se
Tue Sep 17 18:48:41 CEST 2013


On Tue, Sep 17, 2013 at 3:49 PM, Erik Lindahl
<erik.lindahl at scilifelab.se> wrote:
> Hi,
>
> I actually don't think it is overly difficult to port anything to domain decomposition, in particular not if performance isn't critical (which it isn't if particle decomposition is an alternative :-).
>
> The problem is that it is slightly easier initially to just support PD, and that has meant even some new code only supported PD (guilty as charged - our first version of generalized born was PD only). This in turn augments the problem, since the existence of PD means developers might add more code that only works with PD, and suddenly we have quite a few parts that don't work properly with our main parallelization algorithm.
>
> I'm a bit afraid that a "relaxed DD" would have exactly the same effect. Thus, I would rather advocate some extra, semi-stupid communication routines that might kill your performance compared to vanilla runs, but that _will_ work with normal DD.

That depends on how we interpret "relaxed". What I meant was not an
entirely different algorithm as an easy way out, but the same
code-path with some restrictions applied. This means that the same or
similar unit tests and the same regression tests will apply, hence
the code-path will be easier to test, support, and maintain. In
practice I think this is equivalent to imposing an extra
communication step. Extra communication may well be easier to
implement, but IMO in both cases there should be a clear
specification of the minimum requirements for a feature to work
across nodes, and most if not all mdrun features should comply.
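
To make that concrete, such a fallback could look roughly like the
sketch below. This is not actual GROMACS code; names like
collect_global_x() and x_local are made up for illustration. The idea
is simply to replicate all home-atom coordinates on every rank so
that a feature written against a global coordinate array keeps
working under plain DD, at an obvious O(N) communication cost per
step:

/* Hypothetical sketch, not GROMACS API: replicate all home-atom
 * coordinates on every rank so a feature that assumes global access
 * can run under DD. */
#include <stdlib.h>
#include <mpi.h>

typedef float rvec[3];

static void collect_global_x(MPI_Comm comm, rvec *x_local,
                             int nlocal, rvec *x_global)
{
    int  nranks, r;
    int *counts, *displs;

    MPI_Comm_size(comm, &nranks);
    counts = malloc(nranks*sizeof(*counts));
    displs = malloc(nranks*sizeof(*displs));

    /* How many home atoms does each domain have this step? */
    MPI_Allgather(&nlocal, 1, MPI_INT, counts, 1, MPI_INT, comm);

    displs[0] = 0;
    for (r = 1; r < nranks; r++)
    {
        displs[r] = displs[r-1] + counts[r-1];
    }
    /* Counts/offsets in floats: three per atom. */
    for (r = 0; r < nranks; r++)
    {
        counts[r] *= 3;
        displs[r] *= 3;
    }

    /* Every rank ends up with all coordinates, ordered by home rank. */
    MPI_Allgatherv(x_local, 3*nlocal, MPI_FLOAT,
                   x_global, counts, displs, MPI_FLOAT, comm);

    free(counts);
    free(displs);
}

A real version would of course also have to gather the local-to-global
index maps so the coordinates can be put back into global atom order,
but the point is that the feature itself would not need to know
anything about domains.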

Additionally, I find it reasonable for PD users to expect that
something like this will be implemented, ideally before the plug is
pulled on PD. Of course, it is not unreasonable to expect a combined
effort from the ("guilty") developers/maintainers of PD-only features
and the users interested in them.

Cheers,
--
Szilárd

PS: Don't get me wrong, I'm not trying to tell others what to focus
on, but I happen to be user-centric in the sense that IMHO deciding to
drop a feature (mostly for developer convenience reasons) just because
there are no masses protesting is rather unreasonable.


> Cheers,
>
> Erik
>
>
> On Sep 17, 2013, at 3:30 PM, Szilárd Páll <szilard.pall at cbr.su.se> wrote:
>
>> One limitation of leaving OpenMP as the only option for PD runs is
>> that OpenMP scaling is far from stellar when running across multiple
>> NUMA domains, most notably (but not only) on AMD. While on a
>> dual-socket 8-core Sandy Bridge with 1000s of atoms/core you typically
>> get 60-85% scaling across two sockets, on a dual 16-core/8-module AMD
>> Piledriver it's more like 20-60%, and on previous-generation AMD CPUs
>> it's even worse (not to mention quad-socket machines).
>>
>> While some improvements to DD + multi-threading may be needed to
>> improve scaling at high thread counts per rank, this is quite feasible
>> even with MPI+OpenMP, whereas pushing OpenMP across NUMA regions will
>> hardly work.
>>
>> I'm wondering, would it be feasible to provide a "relaxed" DD
>> code-path in combination with strongly limiting how small the domain
>> size can be (or is this similar to what Carsten suggests)?
>>
>> --
>> Szilárd
>>
>>
>> On Mon, Sep 16, 2013 at 5:15 PM, XAvier Periole <x.periole at rug.nl> wrote:
>>>
>>> I'll have to look at what we did and get back to you guys ...
>>>
>>> I think I got stuck on running on one node with PD and got distracted by something else ... I'll need to get back to this in more detail.
>>>
>>> XAvier.
>>>
>>> On Sep 16, 2013, at 17:04, "Shirts, Michael (mrs5pt)" <mrs5pt at eservices.virginia.edu> wrote:
>>>
>>>>
>>>>> Berk, the use of OpenMP on a single node ... should work indeed. We tried this
>>>>> for REMD using one node per replica, each having exotic bonded terms, but we
>>>>> failed for a reason I have forgotten.
>>>>
>>>> Was this Hamiltonian replica exchange in 4.6?  If so, let me know about any
>>>> failures to see if it's a problem with insufficient documentation or an
>>>> underlying bug with the REMD/exotic bonded interaction that needs to be
>>>> fixed.
>>>>
>>>> Best,
>>>> ~~~~~~~~~~~~
>>>> Michael Shirts
>>>> Assistant Professor
>>>> Department of Chemical Engineering
>>>> University of Virginia
>>>> michael.shirts at virginia.edu
>>>> (434)-243-1821
>>>>
>>>>
>>>>


