[gmx-developers] Test Suite, CTest, CDash

Wed Aug 25 18:59:19 CEST 2010

On Wed, Aug 25, 2010 at 9:58 AM, Mark Abraham <mark.abraham at anu.edu.au> wrote:

>
>
> ----- Original Message -----
> From: "Esztermann, Ansgar" <Ansgar.Esztermann at mpi-bpc.mpg.de>
> Date: Wednesday, August 25, 2010 22:49
> Subject: [gmx-developers] Test Suite, CTest, CDash
> To: Discussion list for GROMACS development <gmx-developers at gromacs.org>
>
> > Hello everyone,
> >
> >
> > a few years ago, I was briefly involved in the gcc
> > project. I was extremely impressed by the huge test suite and by
> > the way it could be used to quickly detect regressions. Putting
> > aside for the moment the fact that "correct behaviour" is much
> > easier to define for a compiler than for a simulation engine, I
> > think that gromacs would profit from a similar test suite.
>
> Agreed. There are huge combinatorial and chaotic problems, though. I'd
> expect there must be some literature on dealing with the former.
>
I don't think there is any magic solution for it. One will never have
enough tests to check all possible combinations.
Suggestions (some I have read, some are my own opinion):
- Whenever a bug is found, add a test which would have found it. This is
easy, since one usually already has a simple test tpr, and experience shows
that the same or a very similar bug quite often reappears in the future.
- Have very simple tests. Not unit tests (those would not find integration
bugs), but tests which are simple (easy to understand) and fast to run.
- Have very many tests. I think it would be good to collect a large number
of tpr files which have shown problems in the past, and to make sure the
collection covers a reasonable number of combinations of input files,
methods, parameters and parallelization.
- Test for things which are not (very) dependent on the chaotic behavior.
Test e.g. for the system not exploding, a minimum level of energy
conservation, the simulation not crashing, performance, or the energy after
the first (few) steps.
- Make non-brittle tests. That is, make sure the tests don't fail when some
internal detail changes that is not relevant to the test. For example, don't
check the energies to the last digit.
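
To make the last two points concrete, here is a minimal sketch in Python of
what such checks could look like. The function names, the dictionary layout
and the tolerance values are all my own illustrative assumptions, not
anything that exists in GROMACS; sensible tolerances would have to be chosen
per test.

```python
import math


def energies_match(reference, observed, rel_tol=1e-3, abs_tol=1e-6):
    """Compare named energy terms with a tolerance instead of digit by digit.

    `reference` and `observed` are hypothetical dicts mapping an energy term
    name to its value after the first (few) steps.
    """
    if reference.keys() != observed.keys():
        return False
    return all(
        math.isclose(observed[name], reference[name],
                     rel_tol=rel_tol, abs_tol=abs_tol)
        for name in reference
    )


def is_stable(total_energies, max_rel_drift=0.01):
    """Check that a run did not explode and roughly conserved energy.

    `total_energies` is a hypothetical list of total energies, one per
    sampled step. The drift threshold is an assumed, illustrative value.
    """
    if not total_energies:
        return False
    if not all(math.isfinite(e) for e in total_energies):
        return False  # NaN or inf: the system exploded
    first = total_energies[0]
    scale = max(abs(first), 1.0)  # avoid dividing by a near-zero energy
    return abs(total_energies[-1] - first) / scale <= max_rel_drift
```

A check like `energies_match` survives a change in summation order or
compiler, while `is_stable` does not depend on the exact trajectory at all.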

> So I'm going to sound a bit negative here, but I'm skeptical that it's
> possible to do a sound job of a MD regression suite. Who's got a computer
> science theoretician looking for a project? :-)

I'm sure it is impossible to write a regression suite which would find all
possible errors for any software ;-). I have read that some projects have
more code for tests than for the software itself (and still are not bug
free). And while too many (brittle) tests are a maintainability problem, I
think it is possible to add some good tests to GROMACS which would have made
it likely that bugs were found earlier and more easily.

Specific suggestions:
- Discuss and develop guidelines for what good tests should look like.
- Collect TPRs which have shown bugs before or are otherwise useful (testing
a specific feature, e.g. PULL, ...) and which run fast.
- Test the results for all those TPRs in a way which is non-brittle.
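
One possible way to keep track of such a collection, sketched in Python. The
class, the field names and the file paths are hypothetical, and the two
dimensions (integrator and parallelization mode) are only examples of the
combinations one might want covered.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class RegressionCase:
    tpr_path: str     # hypothetical location of the collected tpr
    integrator: str   # method exercised by this case
    parallel: str     # parallelization mode exercised by this case
    note: str         # which bug or feature the case guards


def missing_combinations(cases, integrators, parallel_modes):
    """List the (integrator, parallel) pairs that no collected case covers."""
    covered = {(c.integrator, c.parallel) for c in cases}
    wanted = set(product(integrators, parallel_modes))
    return sorted(wanted - covered)


# Illustrative entries only; these files and bugs are made up.
cases = [
    RegressionCase("cases/pull_small.tpr", "md", "serial", "PULL feature"),
    RegressionCase("cases/old_bug.tpr", "sd", "mpi", "earlier crash"),
]
gaps = missing_combinations(cases, ["md", "sd"], ["serial", "mpi"])
```

Here `gaps` lists the pairs still lacking a test, which gives a cheap
overview of where the collection is thin.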

Roland

-- 
ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
865-241-1537, ORNL PO BOX 2008 MS6309