[gmx-developers] jenkins killed by core files
Szilárd Páll
pall.szilard at gmail.com
Thu Nov 9 18:20:32 CET 2017
On Thu, Nov 9, 2017 at 5:59 PM, Szilárd Páll <pall.szilard at gmail.com> wrote:
> Hi,
>
> Earlier today (around 15:00 CET) a change resulted in ~1200 failed
> tests and generated a volume of core files that swamped the jenkins
> server. We spent far too much time tracking down the issue and
> recovering from it and identified some critical issues in the setup
> that we believe require changes.
>
> Given that:
> - we all are increasingly drafting and developing in gerrit, not even
> testing locally,
> - we don't have enough space for even the ~1200 core files that a
> single failed job can generate (let alone multiple iterations of a
> buggy change),
> - we're storing uncompressed core files that we rarely if ever look at,
> we need to take action and prevent such time-consuming failures.
>
> There are two options I see:
> - I'll disable archiving core files (right away so the aforementioned
> change won't bomb jenkins again ;)) -- after all, the dev time saved by
> having jenkins compile for them can now be spent on occasionally
> testing a bit more locally (or on the build slaves if necessary) when
> hard-to-track-down bugs cause crashes;
>
> - We have found a jenkins plugin that compresses artifacts; if this
> reduces the size of archived data enough, we could try to deploy it
> and re-enable core file archival. I have the suspicion that it won't
> work due to a bug that prevented us from using it to begin with, but
> I'll have to check.
Indeed, it is not compatible with the artifact copy plugin:
https://issues.jenkins-ci.org/browse/JENKINS-22637
Core file archival is now disabled for the master matrix, will do the
same for 2016.
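
To gauge whether compression would shrink the cores enough to make archival
viable again, something along these lines could be run on a build slave
after a failed job. This is only a rough sketch: the function name
compress_cores, the "core*" file pattern and the choice of gzip are
assumptions for illustration, not part of the current jenkins setup.

```python
import gzip
import shutil
from pathlib import Path


def compress_cores(directory, pattern="core*"):
    """Gzip every matching file under `directory`.

    Returns (bytes_before, bytes_after) so the saving can be judged.
    The "core*" pattern is an assumption; adjust it to the actual
    core file naming on the build slave.
    """
    before = after = 0
    # sorted() materializes the listing first, so the .gz files created
    # below are not picked up while iterating.
    for path in sorted(Path(directory).rglob(pattern)):
        if not path.is_file() or path.suffix == ".gz":
            continue
        target = path.with_name(path.name + ".gz")
        with open(path, "rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        before += path.stat().st_size
        after += target.stat().st_size
        path.unlink()  # keep only the compressed copy
    return before, after
```

Comparing the two returned totals for one failed job would tell us whether
compressed cores fit in the space we have.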
>
> Cheers,
> --
> Szilárd