[gmx-developers] Disabling Pthreads
Gilles Gouaillardet
gilles at rist.or.jp
Thu Nov 12 03:43:09 CET 2020
Erik,
That's right, it happens only in the hardware detection phase.
GROMACS is compiled with *both* MPI and OpenMP support
Here are the relevant bits of lscpu:
$ lscpu
Architecture: aarch64
Byte Order: Little Endian
CPU(s): 60
On-line CPU(s) list: 0,1,12-59
Off-line CPU(s) list: 2-11
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
fphp asimdhp cpuid asimdrdm fcma dcpop sve
So there are indeed 60 CPUs reported (I previously stated 50 or 52)
- assistant cores are 0,1 (and 2,3 on some models)
- cores 4-11 are always offline (and also 2,3 on this model)
- cores used by HPC jobs are 12-59
When a job is started, it is put in a cgroup whose cpuset is 12-59.
Each MPI task is then spawned in the same cgroup *but* with a more
restrictive cpuset. For example, with 4 MPI tasks per node:
# the cpuset is "inherited" from the cgroup
$ grep Cpus_allowed_list /proc/self/status
Cpus_allowed_list: 12-59
# the per MPI task cpuset is more restrictive
[r00018 at g25-7109c ~]$ mpiexec grep Cpus_allowed_list /proc/self/status
Cpus_allowed_list: 12-23
Cpus_allowed_list: 24-35
Cpus_allowed_list: 36-47
Cpus_allowed_list: 48-59
So what matters here is not the cgroup strictly speaking, but the set of
available CPUs a given process can run on.
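For reference, here is a minimal, Linux-specific sketch (not GROMACS code) of
how a process can query the CPUs it is actually allowed to run on, as opposed
to the configured CPU count from sysconf(_SC_NPROCESSORS_CONF):

#include <sched.h>   // cpu_set_t, CPU_ZERO, CPU_COUNT, sched_getaffinity (glibc)
#include <unistd.h>  // sysconf
#include <cstdio>

int main()
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) == 0)
    {
        // On the node above: CPU_COUNT sees 12 CPUs inside one MPI task's cpuset,
        // while _SC_NPROCESSORS_CONF still reports all 60 configured CPUs.
        std::printf("allowed CPUs: %d, configured CPUs: %ld\n",
                    CPU_COUNT(&mask), sysconf(_SC_NPROCESSORS_CONF));
    }
    return 0;
}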
In hardwareTopologyPrepareDetection(), each task will indeed spawn 60
/* = sysconf(_SC_NPROCESSORS_CONF) */ threads, so the worst case scenario is
48 MPI tasks * 60 threads per MPI task.
Note also that in this configuration, because each MPI task is confined to a
restrictive cpuset, most of these 60 threads will end up time sharing.
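To make the pattern concrete, here is a simplified sketch of the spin-up being
described (the helper name is made up; this is not the actual GROMACS
implementation):

#include <thread>
#include <vector>
#include <unistd.h>

// Hypothetical helper illustrating the spin-up pattern: one short-lived
// std::thread per *configured* processor, regardless of the cpuset.
void spinUpAllConfiguredCores()
{
    const long nConfigured = sysconf(_SC_NPROCESSORS_CONF); // 60 on this node
    std::vector<std::thread> workers;
    workers.reserve(static_cast<size_t>(nConfigured));
    for (long i = 0; i < nConfigured; ++i)
    {
        // Each thread does a little busy work so power-managed cores come online.
        workers.emplace_back([] {
            volatile int x = 0;
            for (int j = 0; j < 100000; ++j) { x += j; }
        });
    }
    for (auto& t : workers) { t.join(); }
    // With 48 MPI ranks per node this means 48 * 60 threads, and each rank's
    // 60 threads are confined by its cpuset to 12 CPUs, hence the time sharing.
}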
I am not sure how best to handle this:
hardwareTopologyPrepareDetection() is currently a no-op on x86 and PowerPC.
It is needed on *some* ARM processors but should be avoided on A64fx;
that's why I suggested a CMake option is a good fit here: let the users decide.
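As a rough sketch of what I have in mind (the macro name GMX_DISABLE_CPU_SPINUP
is hypothetical and would be defined by the new CMake option; the actual name
and wiring are of course up for discussion, and this is not the real function
body):

void spinUpAllConfiguredCores(); // the spin-up helper sketched above

void hardwareTopologyPrepareDetection()
{
#if defined(GMX_DISABLE_CPU_SPINUP)
    // On A64fx all cores are already up and running, so skip the spin-up.
    return;
#else
    // Existing behavior: spin up all configured cores before detection.
    spinUpAllConfiguredCores();
#endif
}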
Cheers,
Gilles
On 11/11/2020 5:34 PM, Erik Lindahl wrote:
> Hi,
>
> Gilles: If this happens, I *suspect* this might only be in the
> hardware detection phase, right?
>
> We might indeed have overlooked that, since ARM was originally just a
> low-end platform (think 32-bit Tegra...) where we never even thought
> of running anything multi-node.
>
> We've already thought of adding cgroups awareness (for docker), so
> could you possibly assist me by showing a concrete example of this on
> A64fx:
>
> 1) lscpu
>
> 2) cgroups information, ideally both for the node and specific MPI
> processes
>
> ... since otherwise we're programming a bit blind :-)
>
> Cheers,
>
> Erik
>
>
>
>
>
>
>
>
> On Wed, Nov 11, 2020 at 8:47 AM Berk Hess <hess at kth.se> wrote:
>
> Hi,
>
> Then there is a strange bug somewhere.
>
> Are you using real MPI or thread-MPI? Do you know where this happens?
> Are you running with OpenMP support?
>
> Cheers,
>
> Berk
>
> On 2020-11-11 08:26, Gilles Gouaillardet wrote:
> > Berk,
> >
> >
> > There is a total of 52 cores, and my observation is that each MPI task
> > does spawn 52 threads.
> >
> > So the worst case scenario is 48 MPI tasks each spawning 52 threads,
> > i.e. a total of 48 * 52 threads on a single node.
> >
> >
> > Cheers,
> >
> >
> > Gilles
> >
> > On 11/11/2020 4:23 PM, Berk Hess wrote:
> >> On 2020-11-11 03:37, Gilles Gouaillardet wrote:
> >>> Erik and all,
> >>>
> >>>
> >>> I am kind of facing the exact opposite issue on another ARM processor:
> >>>
> >>> High-end A64fx (Fugaku/FX1000) has 48 cores plus 2 or 4 assistant
> >>> cores.
> >>>
> >>> A job is put in a cgroup of 48 cores (i.e. no assistant cores).
> >>>
> >>> Worst case scenario, a flat mpi run (48 tasks) will spawn 48 * 52
> >>> cores to spin up all the cores.
> >> What do you mean with 48 * 52 cores? There are only 52 cores. Gromacs
> >> will by default not spawn more threads in total than there are cores.
> >> If you ask for 48 MPI ranks with a real MPI library Gromacs will not
> >> spawn any additional threads. With thread-mpi it will spawn 48 or 52
> >> total.
> >>
> >> Cheers,
> >>
> >> Berk
> >>>
> >>> 1) GROMACS is not cgroup aware and hence considers there are 52 (or
> >>> 50) cores per node (this is a very minor issue)
> >>>
> >>> 2) spawning such a high number of threads caused some crashes (weird
> >>> stack traces, I did not spend much time investigating)
> >>>
> >>> 3) in the case of A64fx, all cores are up and running, ready to
> >>> crunch, and do not require any special tricks.
> >>>
> >>>
> >>> At this stage, I think the easiest path to address this on A64fx is
> >>> to add yet another CMake option to
> >>>
> >>> unconditionally skip the spin-up phase.
> >>>
> >>> This could be improved by adding a command line option, or an
> >>> environment variable, to change the default behavior
> >>>
> >>> (default behavior should be a cmake option imho)
> >>>
> >>>
> >>> Any thoughts on how to best move forward?
> >>>
> >>>
> >>> Cheers,
> >>>
> >>>
> >>> Gilles
> >>>
> >>> On 11/11/2020 6:08 AM, Erik Lindahl wrote:
> >>>> We might be able to work around the last aspect, but it will likely
> >>>> take a couple of weeks until I can lay hands on a new ARM-based
> >>>> Macbook.
> >>>>
> >>>> Long story made short: The advanced power-saving features on ARM
> >>>> mean some cores are not visible until they are used, so we created
> >>>> a small hack where we "spin up" the CPU by exercising all cores.
> >>>>
> >>>> We might anyway need to do something different with the new type of
> >>>> big.LITTLE cores where we have 4+4 or 8+4 cores, but I can't even
> >>>> start to work on that until I have suitable hardware. The good news
> >>>> is that such hardware was announced a couple of hours ago, with
> >>>> availability next week ;-)
> >>>>
> >>>> Cheers,
> >>>>
> >>>> Erik
> >>>>
> >>>> On Tue, Nov 10, 2020 at 9:45 PM Mark Abraham
> >>>> <mark.j.abraham at gmail.com> wrote:
> >>>>
> >>>> No, the use of std::thread in e.g. hardware detection also requires
> >>>> a lower-level threading implementation.
> >>>>
> >>>> Mark
> >>>>
> >>>> On Tue, Nov 10, 2020, 20:41 Berk Hess <hess at kth.se> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> Turning off GMX_THREAD_MPI in cmake should remove the
> >>>> dependency on pthreads.
> >>>>
> >>>> Cheers,
> >>>>
> >>>> Berk
> >>>>
> >>>> On 2020-11-10 18:06, Guido Giuntoli wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> Is there any way to disable the “Pthreads” dependency during
> >>>>> the configuration/compilation of GROMACS?
> >>>>>
> >>>>> *Best regards | Mit freundlichen Grüßen*
> >>>>>
> >>>>> *Guido Giuntoli*
> >>>>>
> >>>>> HUAWEI TECHNOLOGIES Duesseldorf GmbH
> >>>>> Hansaallee 205, 40549 Dusseldorf, Germany, www.huawei.com
> >>>>
> >>>> --
> >>>> Erik Lindahl <erik.lindahl at dbb.su.se>
> >>>> Professor of Biophysics, Dept. Biochemistry & Biophysics, Stockholm
> >>>> University
> >>>> Science for Life Laboratory, Box 1031, 17121 Solna, Sweden
> >>>>
> >>
>
>
>
>
> --
> Erik Lindahl <erik.lindahl at dbb.su.se>
> Professor of Biophysics, Dept. Biochemistry & Biophysics, Stockholm
> University
> Science for Life Laboratory, Box 1031, 17121 Solna, Sweden
>
More information about the gromacs.org_gmx-developers
mailing list