[gmx-developers] Disabling Pthreads

Gilles Gouaillardet gilles at rist.or.jp
Thu Nov 12 03:43:09 CET 2020


Erik,


That's right, it happens only in the hardware detection phase.

GROMACS is compiled with *both* MPI and OpenMP support.


Here are the relevant bits of lscpu

$ lscpu
Architecture:         aarch64
Byte Order:           Little Endian
CPU(s):               60
On-line CPU(s) list:  0,1,12-59
Off-line CPU(s) list: 2-11
Flags:                fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics 
fphp asimdhp cpuid asimdrdm fcma dcpop sve


So there are indeed 60 CPUs reported (I previously stated 50 or 52).

  - assistant cores are 0,1 (and 2,3 on some models)

  - cores 4-11 are always offline (and also 2,3 on this model)

  - cores used by HPC jobs are 12-59


When a job is started, it is put in a cgroup whose cpuset is 12-59.

Then each MPI task is spawned in the same cgroup *but* with a more 
restrictive cpuset.

For example, with 4 MPI tasks per node:


# the cpuset is "inherited" from the cgroup

$ grep Cpus_allowed_list /proc/self/status
Cpus_allowed_list: 12-59

# the per MPI task cpuset is more restrictive

[r00018 at g25-7109c ~]$ mpiexec grep Cpus_allowed_list /proc/self/status
Cpus_allowed_list:      12-23
Cpus_allowed_list:      24-35
Cpus_allowed_list:      36-47
Cpus_allowed_list:      48-59


So what matters here is not the cgroup strictly speaking, but the set of 
available CPUs a given process can run on.


In hardwareTopologyPrepareDetection(), each task will indeed spawn 60 
(i.e. sysconf(_SC_NPROCESSORS_CONF)) threads, so the worst-case scenario 
is 48 MPI tasks * 60 threads per MPI task.

We can also note that in this configuration, because each MPI task is 
put in a restrictive cpuset, most of these 60 threads will end up 
time sharing.


I am not sure how best to handle this:

hardwareTopologyPrepareDetection() is currently a no-op on x86 and PowerPC.

It is needed on *some* ARM processors but should be avoided on A64fx;

that is why I suggested a cmake option is a good fit here: let the users decide.


Cheers,


Gilles


On 11/11/2020 5:34 PM, Erik Lindahl wrote:
> Hi,
>
> Gilles: If this happens, I *suspect* this might only be in the 
> hardware detection phase, right?
>
> We might indeed have overlooked that, since ARM was originally just a 
> low-end platform (think 32-bit Tegra...) where we never even thought 
> of running anything multi-node.
>
> We've already thought of adding cgroups awareness (for docker), so 
> could you possibly assist me by showing a concrete example of this on 
> A64fx:
>
> 1) lscpu
>
> 2) cgroups information, ideally both for the node and specific MPI 
> processes
>
> ... since otherwise we're programming a bit blind :-)
>
> Cheers,
>
> Erik
>
>
>
>
>
>
>
>
> On Wed, Nov 11, 2020 at 8:47 AM Berk Hess <hess at kth.se> wrote:
>
>     Hi,
>
>     Then there is a strange bug somewhere.
>
>     Are you using real MPI or thread-MPI? Do you know where this happens?
>     Are you running with OpenMP support?
>
>     Cheers,
>
>     Berk
>
>     On 2020-11-11 08:26, Gilles Gouaillardet wrote:
>     > Berk,
>     >
>     >
>     > There is a total of 52 cores, and my observation is that each
>     MPI task
>     > does spawn 52 threads.
>     >
>     > So the worst case scenario is 48 MPI tasks each spawning 52
>     threads,
>     > so a total of 48 * 52 threads on a single node
>     >
>     >
>     > Cheers,
>     >
>     >
>     > Gilles
>     >
>     > On 11/11/2020 4:23 PM, Berk Hess wrote:
>     >> On 2020-11-11 03:37, Gilles Gouaillardet wrote:
>     >>> Erik and all,
>     >>>
>     >>>
>     >>> I am kind of facing the exact opposite issue on another ARM
>     processor:
>     >>>
>     >>> High end A64fx (Fugaku/FX1000) have 48 cores plus 2 or 4
>     assistant
>     >>> cores.
>     >>>
>     >>> A job is put in a cgroup of 48 cores (e.g. no assistant cores)
>     >>>
>     >>> Worst case scenario, a flat mpi run (48 tasks) will spawn 48 * 52
>     >>> cores to spin up all the cores.
>     >> What do you mean with 48 * 52 cores? There are only 52 cores.
>     Gromacs
>     >> will by default not spawn more threads in total than there are
>     cores.
>     >> If you ask for 48 MPI ranks with a real MPI library Gromacs
>     will not
>     >> spawn any additional threads. With thread-mpi it will spawn 48
>     or 52
>     >> total.
>     >>
>     >> Cheers,
>     >>
>     >> Berk
>     >>>
>     >>> 1) GROMACS is not cgroup-aware and hence considers there are 52
>     (or
>     >>> 50) cores per node (this is a very minor issue)
>     >>>
>     >>> 2) spawning such a high number of threads caused some crashes
>     (weird
>     >>> stack traces, I did not spend much time investigating)
>     >>>
>     >>> 3) in the case of A64fx, all cores are up and running, ready to
>     >>> crunch, and do not require any special tricks.
>     >>>
>     >>>
>     >>> At this stage, I think the easiest path to address this on
>     A64fx is
>     >>> to add yet another cmake option to
>     >>>
>     >>> unconditionally skip the spin up phase.
>     >>>
>     >>> This could be improved by adding a command line option, or an
>     >>> environment variable, to change the default behavior
>     >>>
>     >>> (default behavior should be a cmake option imho)
>     >>>
>     >>>
>     >>> Any thoughts on how to best move forward?
>     >>>
>     >>>
>     >>> Cheers,
>     >>>
>     >>>
>     >>> Gilles
>     >>>
>     >>> On 11/11/2020 6:08 AM, Erik Lindahl wrote:
>     >>>> We might be able to work around the last aspect, but it will
>     likely
>     >>>> take a couple of weeks until I can lay hands on a new ARM-based
>     >>>> Macbook.
>     >>>>
>     >>>> Long story made short: The advanced power-saving features on ARM
>     >>>> mean some cores are not visible until they are used, so we
>     created
>     >>>> a small hack where we "spin up" the CPU by exercising all cores.
>     >>>>
>     >>>> We might anyway need to do something different with the new
>     type of
>     >>>> big.LITTLE cores where we have 4+4 or 8+4 cores, but I can't
>     even
>     >>>> start to work on that until I have suitable hardware. The
>     good news
>     >>>> is that such hardware was announced a couple of hours ago, with
>     >>>> availability next week ;-)
>     >>>>
>     >>>> Cheers,
>     >>>>
>     >>>> Erik
>     >>>>
>     >>>> On Tue, Nov 10, 2020 at 9:45 PM Mark Abraham
>     >>>> <mark.j.abraham at gmail.com> wrote:
>     >>>>
>     >>>>     No, the use of std::thread in e.g. hardware detection also
>     requires
>     >>>>     a lower level threading implementation.
>     >>>>
>     >>>>     Mark
>     >>>>
>     >>>>     On Tue, Nov 10, 2020, 20:41 Berk Hess <hess at kth.se> wrote:
>     >>>>
>     >>>>         Hi,
>     >>>>
>     >>>>         Turning off GMX_THREAD_MPI in cmake should remove the
>     >>>>         dependency on pthreads.
>     >>>>
>     >>>>         Cheers,
>     >>>>
>     >>>>         Berk
>     >>>>
>     >>>>         On 2020-11-10 18:06, Guido Giuntoli wrote:
>     >>>>>
>     >>>>>         Hi,
>     >>>>>
>     >>>>>         Is there any way to disable the “Pthreads”
>     dependency during
>     >>>>>         the configuration/compilation of GROMACS?
>     >>>>>
>     >>>>>         *Best regards | Mit freundlichen Grüßen*
>     >>>>>
>     >>>>>         *Guido Giuntoli*
>     >>>>>
>     >>>>>         HUAWEI TECHNOLOGIES Duesseldorf GmbH
>     >>>>>         Hansaallee 205, 40549 Dusseldorf, Germany, www.huawei.com
>     >>>>>         Registered Office: Düsseldorf, Register Court
>     Düsseldorf, HRB
>     >>>>>         56063,
>     >>>>>         Managing Director: Li Peng, Li Jian, Shi Yanli
>     >>>>>
>     >>>>>         Sitz der Gesellschaft: Düsseldorf, Amtsgericht
>     Düsseldorf,
>     >>>>>         HRB 56063,
>     >>>>>         Geschäftsführer: Li Peng, Li Jian, Shi Yanli
>     >>>>>
>     >>>>>
>     *-----------------------------------------------------------------------------------------------*
>
>     >>>>>
>     >>>>>
>     >>>>>         *This e-mail and its attachments contain confidential
>     >>>>>         information from HUAWEI, which is intended only for the
>     >>>>>         person or entity whose address is listed above. Any
>     use of
>     >>>>>         the information contained herein in any way
>     (including, but
>     >>>>>         not limited to, total or partial disclosure,
>     reproduction, or
>     >>>>>         dissemination) by persons other than the intended
>     >>>>>         recipient(s) is prohibited. If you receive this
>     e-mail in
>     >>>>>         error, please notify the sender by phone or email
>     immediately
>     >>>>>         and delete it!*
>     >>>>>
>     >>>>>
>     >>>>
>     >>>>         --
>     >>>>         Gromacs Developers mailing list
>     >>>>
>     >>>>         * Please search the archive at
>     >>>>         http://www.gromacs.org/Support/Mailing_Lists/GMX-developers_List
>     >>>>         before posting!
>     >>>>
>     >>>>         * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>     >>>>
>     >>>>         * For (un)subscribe requests visit
>     >>>>         https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-developers
>     >>>>         or send a mail to gmx-developers-request at gromacs.org.
>     >>>>
>     >>>>
>     >>>>
>     >>>> --
>     >>>> Erik Lindahl <erik.lindahl at dbb.su.se>
>     >>>> Professor of Biophysics, Dept. Biochemistry & Biophysics,
>     Stockholm
>     >>>> University
>     >>>> Science for Life Laboratory, Box 1031, 17121 Solna, Sweden
>     >>>>
>     >>
>
>
>
>
> -- 
> Erik Lindahl <erik.lindahl at dbb.su.se>
> Professor of Biophysics, Dept. Biochemistry & Biophysics, Stockholm 
> University
> Science for Life Laboratory, Box 1031, 17121 Solna, Sweden
>


More information about the gromacs.org_gmx-developers mailing list