[gmx-users] Make check not passing tests on 2018.5

Szilárd Páll pall.szilard at gmail.com
Tue Feb 5 13:14:56 CET 2019


On Fri, Feb 1, 2019 at 5:01 AM David Lister <me at davidlister.ca> wrote:
>
> Hello,
>
> I've compiled gromacs 2018.5 in double precision a couple times now and it
> keeps on failing the same tests every time. This is on Ubuntu 18.04 with an
> i9 7900X.
>
> The cmake I used was:
> cmake .. -DREGRESSIONTEST_DOWNLOAD=ON -DGMX_GPU=on -DGMX_BUILD_OWN_FFTW=ON
> -DCMAKE_BUILD_TYPE=Release -DGMX_BUILD_UNITTESTS=ON
> -DCMAKE_INSTALL_PREFIX=/usr/local/gromacs-2018.5

That can't have been the cmake invocation you used, as it does not
configure a double-precision build (and -DGMX_GPU=on is not valid in a
double-precision build).
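
For comparison, a double-precision configure line needs -DGMX_DOUBLE=ON
and the GPU build switched off. The sketch below assumes the remaining
options from the quoted invocation are kept unchanged, which may not match
what was actually run. It only prints the command so it can be checked
before being run in the build directory:

```shell
#!/bin/sh
# Sketch only: GMX_DOUBLE=ON selects double precision and GMX_GPU=OFF
# disables the GPU build. The remaining flags are copied from the quoted
# invocation; whether they match the actual build is an assumption.
CMAKE_ARGS="-DGMX_DOUBLE=ON -DGMX_GPU=OFF \
-DREGRESSIONTEST_DOWNLOAD=ON -DGMX_BUILD_OWN_FFTW=ON \
-DCMAKE_BUILD_TYPE=Release -DGMX_BUILD_UNITTESTS=ON \
-DCMAKE_INSTALL_PREFIX=/usr/local/gromacs-2018.5"

# Print the command instead of executing it.
echo "cmake .. $CMAKE_ARGS"
```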

>
> The problem with make check is:
>
> 34/39 Test #34: regressiontests/simple ...........   Passed    3.01 sec
>       Start 35: regressiontests/complex
> 35/39 Test #35: regressiontests/complex ..........***Failed   34.69 sec
> Will test using executable suffix _d
>              :-) GROMACS - gmx mdrun, 2018.5 (double precision) (-:
>
>                             GROMACS is written by:
>      Emile Apol      Rossen Apostolov      Paul Bauer     Herman J.C. Berendsen
>     Par Bjelkmar    Aldert van Buuren   Rudi van Drunen     Anton Feenstra
>   Gerrit Groenhof    Aleksei Iupinov   Christoph Junghans   Anca Hamuraru
>  Vincent Hindriksen Dimitrios Karkoulis    Peter Kasson        Jiri Kraus
>   Carsten Kutzner      Per Larsson      Justin A. Lemkul    Viveca Lindahl
>   Magnus Lundborg   Pieter Meulenhoff    Erik Marklund      Teemu Murtola
>     Szilard Pall       Sander Pronk      Roland Schulz     Alexey Shvetsov
>    Michael Shirts     Alfons Sijbers     Peter Tieleman    Teemu Virolainen
>  Christian Wennberg    Maarten Wolf
>                            and the project leaders:
>         Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel
>
> Copyright (c) 1991-2000, University of Groningen, The Netherlands.
> Copyright (c) 2001-2017, The GROMACS development team at
> Uppsala University, Stockholm University and
> the Royal Institute of Technology, Sweden.
> check out http://www.gromacs.org for more information.
>
> GROMACS is free software; you can redistribute it and/or modify it
> under the terms of the GNU Lesser General Public License
> as published by the Free Software Foundation; either version 2.1
> of the License, or (at your option) any later version.
>
> GROMACS:      gmx mdrun, version 2018.5 (double precision)
> Executable:   /home/david/gromacs-2018.5/build/bin/gmx_d
> Data prefix:  /home/david/gromacs-2018.5 (source tree)
> Working dir:  /home/david/gromacs-2018.5/build/tests/regressiontests-2018.5
> Command line:
>   gmx_d mdrun -h
>
>
> Thanx for Using GROMACS - Have a Nice Day
>
> Mdrun cannot use the requested (or automatic) number of ranks, retrying with 8.
>
> Abnormal return value for ' gmx_d mdrun    -nb cpu   -notunepme
> >mdrun.out 2>&1' was 1
> Retrying mdrun with better settings...
> Mdrun cannot use the requested (or automatic) number of ranks, retrying with 8.
>
> Abnormal return value for ' gmx_d mdrun    -nb cpu   -notunepme
> >mdrun.out 2>&1' was 1
> Retrying mdrun with better settings...
> FAILED. Check checkpot.out (69 errors), checkforce.out (224 errors)
> file(s) in distance_restraints for distance_restraints
> Mdrun cannot use the requested (or automatic) number of ranks, retrying with 8.
>
> Abnormal return value for ' gmx_d mdrun    -nb cpu   -notunepme
> >mdrun.out 2>&1' was 1
> Retrying mdrun with better settings...
> Mdrun cannot use the requested (or automatic) number of ranks, retrying with 8.
>
> Abnormal return value for ' gmx_d mdrun       -notunepme >mdrun.out 2>&1' was 1
> Retrying mdrun with better settings...
> Mdrun cannot use the requested (or automatic) number of ranks, retrying with 8.
>
> Abnormal return value for ' gmx_d mdrun       -notunepme >mdrun.out 2>&1' was 1
> Retrying mdrun with better settings...
> FAILED. Check checkpot.out (16 errors), checkforce.out (1362 errors)
> file(s) in orientation-restraints for orientation-restraints
> 2 out of 51 complex tests FAILED
>
>       Start 36: regressiontests/kernel
> 36/39 Test #36: regressiontests/kernel ...........   Passed   55.12 sec
>       Start 37: regressiontests/freeenergy
> 37/39 Test #37: regressiontests/freeenergy .......   Passed    9.27 sec
>       Start 38: regressiontests/pdb2gmx
> 38/39 Test #38: regressiontests/pdb2gmx ..........   Passed   16.95 sec
>       Start 39: regressiontests/rotation
> 39/39 Test #39: regressiontests/rotation .........   Passed    3.60 sec
>
> 97% tests passed, 1 tests failed out of 39
>
> Label Time Summary:
> GTest              =   4.54 sec*proc (33 tests)
> IntegrationTest    =   2.03 sec*proc (3 tests)
> MpiTest            =   0.20 sec*proc (3 tests)
> UnitTest           =   2.51 sec*proc (30 tests)
>
> Total Test time (real) = 127.19 sec
>
> The following tests FAILED:
>          35 - regressiontests/complex (Failed)
> Errors while running CTest
> CMakeFiles/run-ctest-nophys.dir/build.make:57: recipe for target
> 'CMakeFiles/run-ctest-nophys' failed
> make[3]: *** [CMakeFiles/run-ctest-nophys] Error 8
> CMakeFiles/Makefile2:1385: recipe for target
> 'CMakeFiles/run-ctest-nophys.dir/all' failed
> make[2]: *** [CMakeFiles/run-ctest-nophys.dir/all] Error 2
> CMakeFiles/Makefile2:1165: recipe for target 'CMakeFiles/check.dir/rule' failed
> make[1]: *** [CMakeFiles/check.dir/rule] Error 2
> Makefile:626: recipe for target 'check' failed
> make: *** [check] Error 2
>
> Looking at the log file for the orientation-restraints test shows
> different energies calculated for some terms. The differing terms are
> bolded in my original message; in case that formatting doesn't carry
> over, they are LJ (SR), Coulomb (SR), Potential, Kinetic, Total,
> Conserved, Temperature, Pressure and Constr. rmsd. In most cases the
> difference is significant.

It looks like the short-range nonbonded interactions (LJ and Coulomb)
are quite a bit off. I cannot reproduce the issue on similar hardware
(although with vanilla gcc 7.3).
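
A quick arithmetic check is consistent with this reading. Taking the
LJ (SR), Coulomb (SR) and Potential values from the two log excerpts
quoted further down, the shift in the two short-range terms accounts for
essentially the whole shift in the potential energy. A small sketch using
awk (the function and variable names are illustrative, not from any
GROMACS tool):

```shell
#!/bin/sh
# Values in kJ/mol copied from the quoted reference_d.log excerpt and
# from the excerpt of the failing run.
energy_shifts() {
    awk 'BEGIN {
        ref_lj   = -1.21790e+03; got_lj   = -3.34280e+02   # LJ (SR)
        ref_coul = -2.01406e+04; got_coul = -3.70272e+03   # Coulomb (SR)
        ref_pot  = -1.39474e+03; got_pot  =  1.59268e+04   # Potential

        # Combined shift of the two short-range terms
        d_sr  = (got_lj - ref_lj) + (got_coul - ref_coul)
        # Shift of the total potential energy
        d_pot = got_pot - ref_pot

        printf "short-range shift: %.2f kJ/mol\n", d_sr
        printf "potential shift:   %.2f kJ/mol\n", d_pot
    }'
}

energy_shifts
```

Both shifts come out at roughly 1.73e+04 kJ/mol and agree to within the
rounding of the logged values, while the bonded terms are identical in
the two logs.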

Could you please run a few tests that might hint at where the issue is:
- check whether the error is reproducible across multiple runs
- rebuild with -DGMX_SIMD=AVX2_256
- rebuild with a different compiler (e.g. gcc 8 or clang)
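
The rebuild variants could be set up roughly as follows. This is a sketch
that assumes a double-precision, CPU-only configuration; the build
directory names and the compiler binary names (gcc-8, g++-8, clang,
clang++) are assumptions to adjust to what is installed. It prints one
command line per variant rather than running anything:

```shell
#!/bin/sh
# Sketch only: base flags assume a double-precision CPU build; the
# options other than GMX_DOUBLE/GMX_GPU are taken from the quoted post.
BASE="-DGMX_DOUBLE=ON -DGMX_GPU=OFF -DREGRESSIONTEST_DOWNLOAD=ON -DGMX_BUILD_OWN_FFTW=ON -DCMAKE_BUILD_TYPE=Release"

# Map a (hypothetical) variant name to the extra cmake flags it needs.
variant_flags() {
    case "$1" in
        avx2)  echo "-DGMX_SIMD=AVX2_256" ;;
        gcc8)  echo "-DCMAKE_C_COMPILER=gcc-8 -DCMAKE_CXX_COMPILER=g++-8" ;;
        clang) echo "-DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++" ;;
        *)     return 1 ;;
    esac
}

for v in avx2 gcc8 clang; do
    # A fresh build directory per variant avoids stale CMake cache entries.
    echo "mkdir build-$v && cd build-$v && cmake .. $BASE $(variant_flags "$v") && make && make check"
done
```

For the reproducibility check, re-running make check a few times in the
existing build directory should be enough.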

Cheers,
--
Szilárd

>
> From reference_d.log:
>    Energies (kJ/mol)
>            Bond            U-B    Proper Dih.  Improper Dih.      CMAP Dih.
>     8.49937e+02    2.39137e+03    1.86371e+03    1.57563e+02   -3.65191e+02
>           LJ-14     Coulomb-14        LJ (SR)   Coulomb (SR)   Coul. recip.
>     9.32112e+02    1.40820e+04   -1.21790e+03   -2.01406e+04    4.96236e+01
>   Orient. Rest.   Ori. R. RMSD      Potential    Kinetic En.   Total Energy
>     2.68612e+00    1.95988e+00   -1.39474e+03    3.19608e+03    1.80134e+03
>   Conserved En.    Temperature Pressure (bar)   Constr. rmsd
>     1.80134e+03    3.03394e+02   -4.01423e+01    6.77807e-09
>
> And from my system:
>    Energies (kJ/mol)
>            Bond            U-B    Proper Dih.  Improper Dih.      CMAP Dih.
>     8.49937e+02    2.39137e+03    1.86371e+03    1.57563e+02   -3.65191e+02
>           LJ-14     Coulomb-14        LJ (SR)   Coulomb (SR)   Coul. recip.
>     9.32112e+02    1.40820e+04   -3.34280e+02   -3.70272e+03    4.96236e+01
>   Orient. Rest.   Ori. R. RMSD      Potential    Kinetic En.   Total Energy
>     2.68612e+00    1.95988e+00    1.59268e+04    3.21311e+03    1.91399e+04
>   Conserved En.    Temperature Pressure (bar)   Constr. rmsd
>     1.91399e+04    3.05010e+02    1.11267e+02    7.18413e-09
>
> Any help on how to resolve this would be greatly appreciated.
>
> Cheers,
>
> David

