[gmx-developers] Please cancel outdated CI pipelines

Eric Irrgang ericirrgang at gmail.com
Tue Sep 29 16:17:09 CEST 2020


Following up at https://gitlab.com/gromacs/gromacs/-/issues/3704

> On Sep 29, 2020, at 4:36 PM, Paul Bauer <paul.bauer.q at gmail.com> wrote:
> 
> @Eric, I can confirm that your scripts will try to use all available cores for their jobs, because I don't see a restriction on the number of OMP threads anywhere in the tests with mdrun.
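
A minimal sketch of one way such a restriction could look, assuming the tests launch mdrun as a subprocess. The helper name capped_mdrun_command and the file name topol.tpr are hypothetical and not taken from the gmxapi test scripts; only OMP_NUM_THREADS and the mdrun options -ntmpi and -ntomp are standard.

```python
import os


def capped_mdrun_command(tpr_path, max_threads, environ=None):
    """Return (argv, env) for a thread-limited `gmx mdrun` invocation.

    Hypothetical helper: it only builds the command line and environment,
    so the caller decides how to launch the process.
    """
    if max_threads < 1:
        raise ValueError("max_threads must be at least 1")
    # Copy the environment so the caller's mapping is not modified.
    env = dict(os.environ if environ is None else environ)
    # Keep OMP_NUM_THREADS and -ntomp identical so the two settings agree.
    env["OMP_NUM_THREADS"] = str(max_threads)
    argv = [
        "gmx", "mdrun",
        "-s", tpr_path,
        "-ntmpi", "1",                # one thread-MPI rank
        "-ntomp", str(max_threads),   # OpenMP threads per rank
    ]
    return argv, env
```

The result can be passed to subprocess.run(argv, env=env, check=True), with max_threads set to whatever the CI job actually requests (two, in the jobs discussed here).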
> 
> I will disable the gmxapi testing if this is not fixed, because it is choking the rest of the infrastructure.
> 
> /Paul
> 
> On 29/09/2020 11:29, Erik Lindahl wrote:
>> This could indeed be important for jobs in general if they auto-detect the hardware.
>> 
>> The way containerization works, you will always see that there are e.g. 96 hardware threads on the server, and it is quite possible to start 96 threads (or more) - but the cgroups mechanism in the Linux kernel will limit the actual CPU usage to the limits set (e.g. 4, with our max allowed value being 8 or 16, I think). That could then lead to a job trying to run 96 threads, but only having resources equivalent to two physical cores (4 threads).
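
The mismatch described above can be checked from inside the container. This is a rough sketch, assuming the usual cgroup mount point /sys/fs/cgroup and the standard kernel file names (cpu.max for cgroup v2, cpu/cpu.cfs_quota_us and cpu/cpu.cfs_period_us for v1). The function names are hypothetical, and I have not verified how the CI runners mount cgroups.

```python
import math
from pathlib import Path


def parse_cpu_max(text):
    """Parse cgroup v2 `cpu.max` ("<quota> <period>" or "max <period>")."""
    quota, period = text.split()
    if quota == "max":
        return None  # no CPU limit set
    return int(quota) / int(period)


def parse_cfs(quota_text, period_text):
    """Parse cgroup v1 CFS quota and period; a quota of -1 means unlimited."""
    quota, period = int(quota_text), int(period_text)
    if quota <= 0 or period <= 0:
        return None
    return quota / period


def cgroup_cpu_limit(root="/sys/fs/cgroup"):
    """Return the CPU limit in units of CPUs, or None if none is found."""
    root = Path(root)
    try:
        v2 = root / "cpu.max"
        if v2.is_file():
            return parse_cpu_max(v2.read_text())
        quota = root / "cpu" / "cpu.cfs_quota_us"
        period = root / "cpu" / "cpu.cfs_period_us"
        if quota.is_file() and period.is_file():
            return parse_cfs(quota.read_text(), period.read_text())
    except (OSError, ValueError):
        pass  # unreadable or unexpected content: treat as "no known limit"
    return None


def usable_threads(detected, limit):
    """Cap the detected hardware thread count by the cgroup limit."""
    if limit is None:
        return detected
    return max(1, min(detected, math.floor(limit)))
```

With the numbers from the example above, usable_threads(96, 4.0) gives 4, which is the thread count a job should start instead of the 96 it detects.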
>> 
>> Cheers,
>> 
>> Erik
>> 
>> On Tue, Sep 29, 2020 at 11:09 AM Paul Bauer <paul.bauer.q at gmail.com> wrote:
>> Eric, one question: how do you set the total number of ranks for your jobs in the Python script?
>> 
>> I think the issue might be that you are oversubscribing because you are trying to use the whole Kubernetes cluster, slowing everything down.
>> 
>> Cheers
>> 
>> Paul
>> 
>> P.S.: https://gitlab.com/gromacs/gromacs/-/merge_requests/594 is ready now and works as intended.
>> 
>> On 28/09/2020 21:15, Erik Lindahl wrote:
>>> PS:
>>> 
>>> Note that there are two settings: The *request* you set for CPU and memory is guaranteed when you execute (although we count each hardware thread as a CPU), while the *limit* is something you ask for but is not guaranteed.
>>> 
>>> Cheers,
>>> 
>>> Erik
>>> 
>>> On Mon, Sep 28, 2020 at 9:14 PM Erik Lindahl <erik.lindahl at gmail.com> wrote:
>>> Hi,
>>> 
>>> Given that several other test containers finish in ~2 minutes, I think it's relatively unlikely that randomly fluctuating resource contention would affect the gmxapi test every time it is run, but no other jobs :-)
>>> 
>>> The whole point of k8s/docker is that the runtime environment is standardized; what performance do you see if you run the same container e.g. on your laptop, desktop, or any other cloud resource when assigning two hardware threads to it?
>>> 
>>> Cheers,
>>> 
>>> Erik
>>> 
>>> On Mon, Sep 28, 2020 at 8:38 PM Eric Irrgang <ericirrgang at gmail.com> wrote:
>>> 
>>> > On Sep 28, 2020, at 6:00 PM, Erik Lindahl <erik.lindahl at gmail.com> wrote:
>>> > 
>>> > - gmx-api. They both take 12-15 minutes on two cores, and there are four of them. 
>>> 
>>> I think there is something wrong with the way resources are detected in the CI Kubernetes environment that is causing oversubscription. I think at least 90% of that time is due to resource contention. The jobs only take a few seconds when run locally. I've mentioned this a couple of times to Paul and Mark but we haven't been able to prioritize it. It sounds like Mark and I may take a closer look in October, but I can't troubleshoot effectively because the run-time environment is opaque to me.
>>> 
>>> Otherwise, I agree.
>>> 
>>> -- 
>>> Gromacs Developers mailing list
>>> 
>>> * Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/GMX-developers_List before posting!
>>> 
>>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>>> 
>>> * For (un)subscribe requests visit
>>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-developers or send a mail to gmx-developers-request at gromacs.org.
>>> 
>>> 
>>> -- 
>>> Erik Lindahl <erik.lindahl at dbb.su.se>
>>> Professor of Biophysics, Dept. Biochemistry & Biophysics, Stockholm University
>>> Science for Life Laboratory, Box 1031, 17121 Solna, Sweden
>>> 
>>> 
>> 
>> 
>> 
>> -- 
>> Paul Bauer, PhD
>> GROMACS Development Manager
>> KTH Stockholm, SciLifeLab
>> 0046737308594
>> 
>> 
> 
> 
> 


