[gmx-developers] Please cancel outdated CI pipelines
erik.lindahl at gmail.com
Tue Sep 29 11:29:54 CEST 2020
This could indeed be important for jobs in general if they auto-detect the
The way containerization works, you will always see that there are e.g. 96
hardware threads on the server, and it is quite possible to start 96
threads (or more) - but the cgroups mechanism in the Linux kernel will
limit the actual CPU usage to the limits set (e.g. 4, with our max allowed
value being 8 or 16, I think). That could then lead to a job trying to run
96 threads, but only having resources equivalent to two physical cores (4
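
A rough sketch of how a job could check for this mismatch itself, rather than trusting the detected thread count. This is not what our CI scripts currently do. It assumes the standard cgroup v1/v2 file locations are visible inside the container, and all function names are made up for illustration:

```python
import os
from pathlib import Path


def parse_cpu_max(text):
    """Parse cgroup v2 cpu.max content: '<quota> <period>' or 'max <period>'."""
    quota, period = text.split()
    if quota == "max":
        return None  # no CPU limit set
    return int(quota) / int(period)


def parse_cfs(quota_text, period_text):
    """Parse cgroup v1 cpu.cfs_quota_us and cpu.cfs_period_us contents."""
    quota = int(quota_text)
    if quota <= 0:  # -1 means no CPU limit set
        return None
    return quota / int(period_text)


def cgroup_cpu_limit(root="/sys/fs/cgroup"):
    """Return the CPU limit in units of CPUs, or None if none is found.

    Assumption: the cgroup filesystem is mounted at `root`, as is usual
    inside Docker/Kubernetes containers.
    """
    root = Path(root)
    v2 = root / "cpu.max"
    if v2.is_file():
        return parse_cpu_max(v2.read_text())
    quota = root / "cpu" / "cpu.cfs_quota_us"
    period = root / "cpu" / "cpu.cfs_period_us"
    if quota.is_file() and period.is_file():
        return parse_cfs(quota.read_text(), period.read_text())
    return None


def usable_threads(detected=None, limit=None):
    """Cap the detected hardware thread count by the cgroup CPU limit."""
    detected = detected or os.cpu_count() or 1
    if limit is None:
        return detected
    # Round fractional limits down, but always allow at least one thread.
    return max(1, min(detected, int(limit)))


if __name__ == "__main__":
    limit = cgroup_cpu_limit()
    print("detected hardware threads:", os.cpu_count())
    print("cgroup CPU limit:", limit)
    print("threads worth starting:", usable_threads(limit=limit))
```

With the numbers above, usable_threads(96, 4.0) gives 4, which is the thread count a job should actually start instead of 96.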
On Tue, Sep 29, 2020 at 11:09 AM Paul Bauer <paul.bauer.q at gmail.com> wrote:
> Eric, one question, how do you set the total number of ranks for your jobs
> in the python script?
> I think the issue might be that you are oversubscribing because you are
> trying to use the whole kubernetes cluster, slowing everything down.
> P.S.: https://gitlab.com/gromacs/gromacs/-/merge_requests/594 is ready
> now and works as intended.
> On 28/09/2020 21:15, Erik Lindahl wrote:
> Note that there are two settings: The *request* you set for CPU and memory
> is guaranteed when you execute (although we count each hardware thread as a
> CPU), while the limit is something you ask for but isn't guaranteed.
> On Mon, Sep 28, 2020 at 9:14 PM Erik Lindahl <erik.lindahl at gmail.com>
>> Given that several other test containers finish in ~2 minutes, I think
>> it's relatively unlikely that random fluctuating resource contention would
>> systematically always affect the gmxapi test every time it is run, but no
>> other jobs :-)
>> The whole point of k8s/docker is that the runtime environment is
>> standardized; what performance do you see if you run the same container
>> e.g. on your laptop, desktop, or any other cloud resource when assigning
>> two hardware threads to it?
>> On Mon, Sep 28, 2020 at 8:38 PM Eric Irrgang <ericirrgang at gmail.com>
>>> > On Sep 28, 2020, at 6:00 PM, Erik Lindahl <erik.lindahl at gmail.com>
>>> > - gmx-api. They both take 12-15 minutes on two cores, and there are four of them.
>>> I think there is something wrong with the way resources are detected in
>>> the CI Kubernetes environment that is causing oversubscription. I think at
>>> least 90% of that time is due to resource contention. The jobs only take a
>>> few seconds when run locally. I've mentioned this a couple of times to Paul
>>> and Mark but we haven't been able to prioritize it. It sounds like Mark and
>>> I may take a closer look in October, but I can't troubleshoot effectively
>>> because the run-time environment is opaque to me.
>>> Otherwise, I agree.
>>> Gromacs Developers mailing list
>>> * Please search the archive at
>>> http://www.gromacs.org/Support/Mailing_Lists/GMX-developers_List before
>>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>>> * For (un)subscribe requests visit
>>> or send a mail to gmx-developers-request at gromacs.org.
>> Erik Lindahl <erik.lindahl at dbb.su.se>
>> Professor of Biophysics, Dept. Biochemistry & Biophysics, Stockholm
>> Science for Life Laboratory, Box 1031, 17121 Solna, Sweden
> Paul Bauer, PhD
> GROMACS Development Manager
> KTH Stockholm, SciLifeLab