[gmx-developers] Test Suite, CTest, CDash

Mark Abraham mark.abraham at anu.edu.au
Wed Aug 25 15:58:19 CEST 2010

----- Original Message -----
From: "Esztermann, Ansgar" <Ansgar.Esztermann at mpi-bpc.mpg.de>
Date: Wednesday, August 25, 2010 22:49
Subject: [gmx-developers] Test Suite, CTest, CDash
To: Discussion list for GROMACS development <gmx-developers at gromacs.org>

> Hello everyone,
> a few years ago, I have been briefly involved in the gcc 
> project. I was extremely impressed by the huge test suite and by 
> the way it could be used to quickly detect regressions. Putting 
> aside for the moment the fact that "correct behaviour" is much 
> easier to define for a compiler than for a simulation engine, I 
> think that gromacs would profit from a similar test suite.

Agreed. There are huge combinatorial and chaotic problems, though. I'd expect there must be some literature on dealing with the former.

Testing an MD implementation might use a non-trivial amount of computing resources. I've seen more than a few bugs that required a combination of three or more moderately unusual conditions to occur. Those are the ones that regression suites are really useful for finding. However, how do you store the reference results? If you don't store them, you have to recompute them, which requires having the reference version of the code compiled, and some compute time.

> So I've taken Mark Abrahams' Regression Tests and started to 

Actually, I think David van der Spoel wrote them originally. I just did some updates a year or more ago. Rossen has done some recent things - see threads on gmx-developers.

> convert them to CTest. The simple and complex tests are already 
> in my local git repository (although some of them are still 
> failing). There is CDash support as well: 
> http://my.cdash.org/index.php?project=Gromacs
> I'm planning to extend and maintain these tests. Offhand, these 
> points strike me as important:
> -Integrate more of Mark's tests (i.e. kernel tests, double 
> precision tests)
> -Fix the tests that are failing (or the software, if the failure 
> is genuine)

The failing tests mostly fail on the virial calculation, because that's sensitive to just about everything else. There ought to be a tolerance within which the calculation is acceptably accurate, however. How do we work out what that is?
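
To make the question concrete, here is a rough Python sketch of what a tolerance check on the virial could look like. The function names and the tolerance values are made up for illustration; nothing here is taken from the existing regression scripts, and choosing sensible tolerances is exactly the open question. It combines a relative and an absolute tolerance so that components close to zero are not held to an impossible relative accuracy.

```python
import math

def within_tolerance(value, reference, rel_tol=1e-3, abs_tol=1e-2):
    """True if value agrees with reference within a relative OR absolute tolerance.

    The default tolerances are placeholders, not recommended values.
    """
    return math.isclose(value, reference, rel_tol=rel_tol, abs_tol=abs_tol)

def compare_virial(virial, reference, rel_tol=1e-3, abs_tol=1e-2):
    """Compare two 3x3 virial tensors given as nested lists.

    Returns a list of (row, column, value, reference) tuples, one for each
    component that falls outside the tolerance. An empty list means a pass.
    """
    failures = []
    for i, (row, ref_row) in enumerate(zip(virial, reference)):
        for j, (value, ref) in enumerate(zip(row, ref_row)):
            if not within_tolerance(value, ref, rel_tol, abs_tol):
                failures.append((i, j, value, ref))
    return failures
```

Reporting the failing components, rather than a bare pass/fail, would at least show how far off each one is when deciding whether the tolerance or the code is at fault.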

> -Add more tests (unit tests, "bugzilla" tests triggering known bugs)
> -Add coverage support

One of my more frustrating experiences was a bug provoked when I did an "mdrun -rerun" on a 3.3.3 trajectory under 4.0.x. Trajectories written under each version have different properties, including whether broken molecules might be written, and the consequences of one such difference drove me up the wall for weeks. (Part of the problem was my being ignorant of the possibility that trajectory properties might have changed, of course!)

Now, suppose there'd been a regression suite during the 3.3.3 - 4.0 transition that was capable of flagging to the developers that, under some conditions, a 3.3.3 trajectory would rerun differently under 4.0. We'd have had to get lucky about the events in the 3.3.3 trajectory, and in the 4.0 rerun. Even so, there'd be no good solution, because the trajectory formats don't embed any version number. So we'd either have had to retrofit a magic number scheme (e.g. as used for testing endianness) in which the number changed with the version of the code that wrote the file, or have made 4.0 print a warning against rerunning 3.3.3 trajectories. Neither of those solutions is all that palatable. (Of course, prevention is better than cure: a file format that encodes the version number of the program writing it is better than one that doesn't, but that's not obvious when you go to design the format!)
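
For illustration only, a minimal Python sketch of a version-carrying file header. The magic value, the 12-byte layout and the function names are all invented here and do not describe any real GROMACS trajectory format. It is also a variant of the idea above: it stores the writer's version next to a fixed magic number, rather than changing the magic number itself with each version. The same magic number doubles as the byte-order check.

```python
import io
import struct

# Hypothetical magic number; NOT a value used by any real GROMACS file format.
MAGIC = 0x474D5831

def write_header(stream, major, minor):
    """Write a 12-byte big-endian header: magic, writer major, writer minor."""
    stream.write(struct.pack(">III", MAGIC, major, minor))

def read_header(stream):
    """Return (byte_order, (major, minor)) or raise ValueError.

    The magic number is tried in both byte orders, so it identifies the
    endianness of the file as well as confirming the file type.
    """
    raw = stream.read(12)
    if len(raw) != 12:
        raise ValueError("truncated header")
    for byte_order in (">", "<"):
        magic, major, minor = struct.unpack(byte_order + "III", raw)
        if magic == MAGIC:
            return byte_order, (major, minor)
    raise ValueError("magic number not found: not a versioned file")

def rerun_warning(writer_version, reader_version):
    """Return a warning string if the major versions differ, else None."""
    if writer_version[0] != reader_version[0]:
        return ("warning: trajectory written by version %d.%d, "
                "rerunning under version %d.%d"
                % (writer_version + reader_version))
    return None
```

With something like this in place, a rerun could at least warn when the writer and reader versions differ, instead of silently producing different results.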

So I'm going to sound a bit negative here, but I'm skeptical that it's possible to do a sound job of an MD regression suite. Who's got a computer science theoretician looking for a project? :-)

