[gmx-developers] jenkins killed by core files

Schulz, Roland roland.schulz at intel.com
Thu Nov 9 18:32:42 CET 2017


Hi,

This was added when we had intermittent crashes in Jenkins. Without the core files we couldn't find the cause, because we couldn't reproduce the crashes.
If we have this problem again in the future, we could archive core files only when there is exactly one. If there is more than one, then it shouldn't be a hard-to-reproduce crash.
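
A minimal sketch of that rule, assuming it runs as a Python post-build step. Nothing here is part of the existing Jenkins setup: the function names are hypothetical, and it assumes core dumps are named "core" or "core.<pid>", which depends on how the build slaves are configured.

```python
from pathlib import Path


def is_core_file(path):
    # Assumption: dumps are named "core" or "core.<pid>"; adjust to the
    # naming scheme actually used on the build slaves.
    name = path.name
    if not path.is_file():
        return False
    return name == "core" or (name.startswith("core.") and name[5:].isdigit())


def cores_to_archive(workspace):
    """Return the lone core file under workspace, else an empty list.

    Several core files suggest a crash that is easy to reproduce,
    so in that case nothing is archived.
    """
    cores = sorted(p for p in Path(workspace).rglob("*") if is_core_file(p))
    return cores if len(cores) == 1 else []
```

The returned list could then be handed to whatever does the archiving, so a job with ~1200 crashes would archive nothing.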

Roland

-----Original Message-----
From: gromacs.org_gmx-developers-bounces at maillist.sys.kth.se [mailto:gromacs.org_gmx-developers-bounces at maillist.sys.kth.se] On Behalf Of Szilárd Páll
Sent: Thursday, November 9, 2017 9:20 AM
To: Discussion list for GROMACS development <gmx-developers at gromacs.org>
Subject: Re: [gmx-developers] jenkins killed by core files

On Thu, Nov 9, 2017 at 5:59 PM, Szilárd Páll <pall.szilard at gmail.com> wrote:
> Hi,
>
> Earlier today (around 15:00 CET) a change resulted in ~1200 failed 
> tests and generated a volume of core files that swamped the jenkins 
> server. We spent far too much time tracking down the issue and 
> recovering from it and identified some critical issues in the setup 
> that we believe require changes.
>
> Given that:
> - we all are increasingly drafting and developing in gerrit, not even 
> testing locally,
> - we don't have enough space for even the ~1200 core files that a 
> single failed job could generate (let alone multiple iterations of a 
> buggy change),
> - we're storing uncompressed core files that we rarely if ever look 
> at,
> we need to take action and prevent such time-consuming failures.
>
> There are two options I see:
> - I'll disable archiving core files (right away so the aforementioned 
> change won't bomb jenkins again ;)) -- after all, the dev time saved 
> by having jenkins compile for them can now be spent on occasionally 
> testing a bit more locally (or on the build slaves if necessary) when 
> hard-to-track-down bugs cause crashes;
>
> - We have found a jenkins plugin that compresses artifacts; if this 
> reduces the size of archived data enough, we could try to deploy it 
> and re-enable core file archival. I have the suspicion that it won't 
> work due to a bug that prevented us from using it to begin with, but 
> I'll have to check.

Indeed, it is not compatible with the artifact copy plugin:
https://issues.jenkins-ci.org/browse/JENKINS-22637

Core file archival is now disabled for the master matrix; I will do the same for 2016.

>
> Cheers,
> --
> Szilárd