[gmx-developers] New Test Set
roland at utk.edu
Fri Feb 10 19:36:36 CET 2012
On Fri, Feb 10, 2012 at 12:45 PM, Erik Lindahl <erik at kth.se> wrote:
> It will certainly be easier to have tests with close-to-perfect energy conservation.
Yes. It will also help if the test set is always kept up to date. Jenkins
should guarantee that by forcing developers to update the expected results
whenever a changed algorithm no longer produces binary-identical results.
> However, it's too late in the cycle to decide that we're not going to
> release anything without tests. We should still do them, but I would like
> to see two separate parts:
But I think we have to give this a high priority and make sure that a
large fraction of us developers contribute to this task. I don't think it
can be accomplished by just a few people. By the time large parts of the
basic code are rewritten in C++, we should already have the tests in place
so that we know when we introduce regression bugs.
> 1) Low-level tests that specifically check the output for several sets of
> input for a *module*, i.e. calling routines rather than running a
> simulation. The point is that this will isolate errors either to a specific
> module, or to modules it depends on. However, when those modules too should
> have tests it will be a 5-min job to find what file+routine the bug is
> likely to be in.
What framework should be used to write these unit tests? Should they be
written with GoogleTest, like the tests Teemu has written? This would mean
that the tests only compile with C++, but I don't think this would be a
problem.
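The thread proposes GoogleTest (C++) for these module-level tests; as a
language-neutral sketch of the idea, here a stand-in pure function plays the
role of the GROMACS routine under test, and a few known analytic properties
serve as the reference checks (the function and its test values are purely
illustrative, not real GROMACS code):

```python
import math

def lennard_jones(r, sigma=1.0, epsilon=1.0):
    """Stand-in 'module under test': the 12-6 Lennard-Jones potential."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

# Analytic reference values: V(sigma) = 0 exactly, and the minimum has
# depth -epsilon at r = 2^(1/6) * sigma.
assert math.isclose(lennard_jones(1.0), 0.0, abs_tol=1e-12)
assert math.isclose(lennard_jones(2.0 ** (1.0 / 6.0)), -1.0, rel_tol=1e-12)
```

The point of such tests is exactly what Erik describes: a failure isolates
the bug to one routine (or its dependencies) without running a simulation.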
> 2) Higher-level tests that check whether this feature appears to work in
> practice in a simulation. The point of these tests is mostly to make sure
> other new features don't break our module.
How should we run these integration tests? Should we run them similarly to
how the current test set is run, i.e. have scripts which run pdb2gmx,
grompp, and mdrun and parse the output for results and errors? If so, do we
want to base them on the existing Perl scripts, use some existing external
framework, or write new scripts from scratch?
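A Python wrapper in the style of the current Perl scripts might look like
the sketch below. The command lines and the "name = value" energy layout are
assumptions for illustration (the real md.log energy output is columnar), so
only the parsing step is demonstrated, on an embedded sample:

```python
import re
import subprocess

def run_simulation(testdir):
    # Hypothetical driver step: each call aborts the test on a non-zero
    # exit code. Flags and file names are assumptions, not the real CLI.
    subprocess.run(["grompp", "-f", "grompp.mdp", "-c", "conf.gro",
                    "-p", "topol.top"], cwd=testdir, check=True)
    subprocess.run(["mdrun", "-s", "topol.tpr"], cwd=testdir, check=True)

def parse_energies(log_text):
    """Pull 'name = value' energy lines out of a log-style text."""
    energies = {}
    for name, value in re.findall(r"^\s*([A-Za-z. -]+?)\s*=\s*(-?\d+\.\d+)",
                                  log_text, re.MULTILINE):
        energies[name.strip()] = float(value)
    return energies

# Exercise the parser without a GROMACS installation:
sample = """\
Potential  =  -1234.5678
Kinetic En.  =   456.7890
"""
print(parse_energies(sample))
```

The parsed dictionary would then be compared term by term against stored
reference values, which is where the tolerance question below comes in.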
> On Feb 6, 2012, at 4:17 PM, Berk Hess wrote:
> I agree test sets are very important.
> Having good tests will make development and especially the process of
> accepting contributions much easier.
> Now that we have the new, by default, energy-conserving loops, I realize
> that energy conservation is extremely useful for validation. I think that
> having tests that check energy conservation and particular energy values
> of particular (combinations of) functionality will catch a lot of problems.
> The problem is that MD is chaotic, and with non-energy-conserving setups
> the divergence is extremely fast. With energy conservation, running 20
> steps with nstlist=10 and checking the conserved energy plus a few terms
> would be enough for testing most modules, I think.
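The conserved-energy check Berk describes reduces to a small drift
computation over the trace of conserved-energy values. A minimal sketch,
where the tolerance, the reference scale, and the sample values are my own
assumptions for illustration:

```python
def max_relative_drift(conserved, reference_scale):
    """Largest deviation of the conserved-energy trace from its initial
    value, relative to a characteristic energy scale (e.g. the kinetic
    energy), so that the tolerance is independent of system size."""
    e0 = conserved[0]
    return max(abs(e - e0) for e in conserved) / abs(reference_scale)

# A well-behaved 20-step run should only show tiny fluctuations around
# the initial conserved energy (these values are made up):
trace = [-5012.34 + 0.001 * ((i * 7) % 5 - 2) for i in range(21)]
drift = max_relative_drift(trace, reference_scale=1500.0)
assert drift < 1e-5, "conserved energy drifted beyond tolerance"
```

Because a non-energy-conserving setup diverges fast, even this 20-step
window should expose most broken force or integration code.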
> We still want some more extended tests, but that could be a separate set.
> So setting up a framework for the simple tests should not be too hard.
> Then we need to come up with a set of tests and reference values.
> On 02/05/2012 04:56 AM, Roland Schulz wrote:
> We agreed that we want to have a test set for 4.6, but so far we haven't
> made any progress on it (as far as I know). I want to try to get this
> work started by posting here a list of questions I have about the new
> test set. Please add your own questions and answer any you can (no need
> to try to answer all of them).
> - Why do the current tests fail? Is it only because of floating-point
> rounding differences, or are there other problems? What's the best
> procedure to find out why a test fails?
> - Which tests should be part of the new test set?
> - Should the current tests all be part of the new test set?
> - How should the new tests be implemented? Should the comparison with the
> reference values be done in C (within mdrun), a ctest script, Python, or Perl?
> - Should the new tests execute mdrun for each test? Or should we somehow
> (e.g. via a Python wrapper or within mdrun) load the binary only once and
> run many tests per execution?
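Whatever language is chosen for the comparison step, its core is a
tolerance check against stored reference values. A minimal Python sketch;
the function names and the default tolerances are my own assumptions (in
practice the relative tolerance would likely differ between single and
double precision):

```python
import math

def matches_reference(actual, reference, rel_tol=1e-6, abs_tol=1e-10):
    """Compare one computed value against its stored reference value."""
    return math.isclose(actual, reference, rel_tol=rel_tol, abs_tol=abs_tol)

def compare_energy_dicts(actual, reference, **tols):
    """Return the names of the energy terms that do not match, so a
    failing test can report exactly which terms diverged."""
    return [name for name, ref in reference.items()
            if not matches_reference(actual.get(name, float("nan")),
                                     ref, **tols)]

ref = {"Potential": -1234.5678, "Pressure": 1.013}
new = {"Potential": -1234.5679, "Pressure": 1.200}
print(compare_energy_dicts(new, ref, rel_tol=1e-6))  # only Pressure differs
```

Reporting the offending term names directly addresses the "how easy should
it be to see what's wrong" requirement raised below.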
> - What are the requirements for the new test set? E.g. how easy should it
> be to see what's wrong when a test fails? Should the tests support being
> run under valgrind? Anything else?
> - Do we have any other bugs which have to be solved before the tests can
> be implemented? E.g. is the problem with shared libraries solved? Are
> there any open Redmine issues related to the new test set?
> - Should we have a policy that everyone who adds a feature also has to
> provide tests covering it?
> - Should we have a conference call to discuss the test set? If yes when?
> - Should we agree that we won't release 4.6 without the test set to give
> it a high priority?
> Erik Lindahl <erik at kth.se>
> Professor of Theoretical & Computational Biophysics
> Department of Theoretical Physics & Swedish e-Science Research Center
> Royal Institute of Technology, Stockholm, Sweden
> Tel1: +46855378029 Tel2: +468164675 Cell: +46734618050
ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
865-241-1537, ORNL PO BOX 2008 MS6309