[gmx-developers] [RFC] thread affinity in mdrun
Alexey Shvetsov
alexxy at omrb.pnpi.spb.ru
Sun Sep 22 20:29:32 CEST 2013
Hi!
Szilárd Páll wrote on 22-09-2013 19:49:
> Hi,
>
> On Fri, Sep 20, 2013 at 7:06 AM, Alexey Shvetsov
> <alexxy at omrb.pnpi.spb.ru> wrote:
>> Hi!
>>
>> I saw issues with a demo Numascale system [1] and the default
>> (without external MPI) mdrun behavior -- it pins all 128 threads to
>> the first ~20 cores. The version with external MPI (Numascale
>> provides an OpenMPI offload module) works fine.
>
> That's possible. I don't know of any testing done on Numascale systems
> - at least not for 4.6. Feel free to file a bug report! However,
> somebody with access to the machine would need to contribute a patch
> or at least help figure out what does not work correctly in the
> current hardware detection code.
Currently I don't have access to NumaScale hardware, but we plan to get
a small system (2x2x2 3D torus) by the end of this year, so it will be
possible to check what's going wrong. As far as I know, even the hwloc
code currently doesn't work well on NumaScale.
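
Once we have the hardware, something like the quick check below should
make the problem visible. This is just a rough sketch (assuming
Linux/glibc and OpenMP, nothing GROMACS-specific) that has each OpenMP
thread report which CPU it runs on and how many CPUs its inherited
affinity mask actually allows:

    #define _GNU_SOURCE
    #include <omp.h>
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
    #pragma omp parallel
        {
            cpu_set_t mask;

            CPU_ZERO(&mask);
            sched_getaffinity(0, sizeof(mask), &mask);
            /* Which CPU is this thread on right now, and how many CPUs
             * does its inherited affinity mask allow? */
            printf("thread %2d: on CPU %3d, %d CPUs allowed\n",
                   omp_get_thread_num(), sched_getcpu(), CPU_COUNT(&mask));
        }
        return 0;
    }

Compiled with e.g. gcc -fopenmp and run with the same thread count as
mdrun, this should show directly whether everything really lands on the
first ~20 cores.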
> Cheers,
> --
> Szilárd
>
>>
>> [1] http://numascale.com/numa_access.php
>>
>> Szilárd Páll wrote on 19-09-2013 21:53:
>>
>>> Hi,
>>>
>>> I would like to get feedback on an issue (or more precisely a set of
>>> issues) related to thread/process affinities and
>>> i) the way we should (or should not) tweak the current behavior and
>>> ii) the way we should proceed in the future.
>>>
>>>
>>> Brief introduction, skip this if you are familiar with the
>>> implementation details:
>>> Currently, mdrun always sets per-thread affinity if the number of
>>> threads equals the number of "CPUs" detected (as reported by the OS,
>>> roughly the number of hardware threads supported). However, if this
>>> is not the case, e.g. when one wants to leave some cores empty (to
>>> run multiple simulations per node) or to avoid using HT, thread
>>> pinning will not be done. This can have quite harsh consequences for
>>> performance - especially when OpenMP parallelization is used (most
>>> notably with GPUs).
>>> Additionally, we try hard not to override externally set affinities,
>>> which means that if mdrun detects a non-default affinity, it will not
>>> pin threads (not even if -pin on is used). This happens if the job
>>> scheduler sets the affinity, or if the user sets it e.g. with
>>> KMP_AFFINITY/GOMP_CPU_AFFINITY, taskset, etc., but also if the MPI
>>> implementation sets only its own thread's affinity.
>>>
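
Just to make the above concrete for the discussion: a minimal sketch of
that kind of logic -- guess whether something else already narrowed the
affinity mask, and otherwise pin a thread -- assuming Linux/glibc and
pthreads. This is not the actual mdrun code, only an illustration of the
heuristic described above.

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Guess whether an external affinity is already in place by comparing
     * the inherited mask with the number of online CPUs. If the mask is
     * narrower, a job scheduler, taskset, or the MPI launcher probably
     * set it. */
    static int external_affinity_detected(void)
    {
        cpu_set_t mask;
        long      ncpus = sysconf(_SC_NPROCESSORS_ONLN);

        CPU_ZERO(&mask);
        if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        {
            return 0; /* can't tell, assume default */
        }
        return CPU_COUNT(&mask) < ncpus;
    }

    /* Pin the calling thread to a single logical CPU. */
    static void pin_this_thread_to(int cpu)
    {
        cpu_set_t mask;

        CPU_ZERO(&mask);
        CPU_SET(cpu, &mask);
        pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);
    }

    int main(void)
    {
        if (external_affinity_detected())
        {
            fprintf(stderr, "non-default affinity detected, not pinning\n");
            return 0;
        }
        /* In mdrun each thread-MPI/OpenMP worker would pin itself to its
         * own CPU; here only the main thread is pinned, as an example. */
        pin_this_thread_to(0);
        return 0;
    }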
>>>
>>> On the one hand, there was a request (see
>>> http://redmine.gromacs.org/issues/1122) that we should allow forcing
>>> the affinity setting by mdrun, either by making "-pin on" more
>>> aggressive or by adding a "-pin force" option. Please check out the
>>> discussion on the issue page and express your opinion on whether you
>>> agree and which behavior you support.
>>>
>>>
>>> On the other hand, more generally, I would like to get feedback on
>>> people's experience with affinity setting. I'll just list a few
>>> aspects of this issue that should be considered, but feel free to
>>> raise other issues:
>>> - per-process vs per-thread affinity;
>>> - affinity set by, or required for optimal performance of, the
>>>   MPI/communication software stack;
>>> - GPU/accelerator NUMA aspects;
>>> - hwloc;
>>> - leaving a core empty for interrupt handling (AMD/Cray?) or for an
>>>   MPI, NIC, or GPU driver thread.
>>>
>>> Note that this part of the discussion is aimed more at the behavior
>>> of mdrun in the future. This is especially relevant as the next
>>> major (?) version is being planned/developed and new
>>> tasking/parallelization design options are being explored.
>>>
>>> Cheers,
>>> --
>>> Szilárd
--
Best Regards,
Alexey 'Alexxy' Shvetsov
Petersburg Nuclear Physics Institute, NRC Kurchatov Institute, Gatchina,
Russia
Department of Molecular and Radiation Biophysics
Gentoo Team Ru
Gentoo Linux Dev
mailto:alexxyum at gmail.com
mailto:alexxy at gentoo.org
mailto:alexxy at omrb.pnpi.spb.ru