[gmx-users] Help with a failing test - gromacs 2019.4 - Test 42

Szilárd Páll pall.szilard at gmail.com
Wed Oct 16 14:43:33 CEST 2019


Hi,

The issue is an error triggered by the domain decomposition not liking
the 14 cores in your CPU, which lead to a rank count of 14 that contains
the large prime factor 7.
To make the tests pass, I suggest forcing make check to use only one
device, e.g. CUDA_VISIBLE_DEVICES=0 make check; alternatively, you can
run the regressiontests manually.
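
To illustrate the "large prime factor" part of the message, here is a
minimal Python sketch. It is not GROMACS code, the function name
prime_factors is made up for this example, and it does not reproduce the
exact rule mdrun uses to decide that a factor is too large. It only lists
the prime factors of a few rank counts, including the 14 from your log and
the 8 that the test harness retried with.

```python
def prime_factors(n):
    """Return the prime factors of n in ascending order, with repeats."""
    factors = []
    d = 2
    while d * d <= n:
        # Divide out each factor d as often as it occurs.
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # Whatever remains is itself a prime factor.
        factors.append(n)
    return factors


# Rank counts to compare (illustrative choice, not taken from GROMACS).
for ranks in (8, 12, 14, 16):
    print(ranks, prime_factors(ranks))
```

For 14 this gives the factors 2 and 7, while 8, 12 and 16 factor into 2s
and 3s only. For a manual thread-MPI run, the rank count can be set
explicitly with the -ntmpi option of gmx mdrun, e.g. -ntmpi 8.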

Cheers,
--
Szilárd


On Thu, Oct 10, 2019 at 6:01 PM Raymond Arter <raymondarter at gmail.com>
wrote:

> Hi,
>
> When performing a "make check" on Gromacs 2019.4, I'm getting test 42
> failing. It gives the error:
>
>         Mdrun cannot use the requested (or automatic) number of ranks, retrying with 8
>
> And the mdrun.out and md.log of swap_x report:
>
>         The number of ranks you selected (14) contains a large prime factor 7.
>
> I've included the necessary parts of the logs below. Any help would be
> appreciated since I haven't come across this error before.
>
> Regards,
>
> T.
>
>
> CentOS Linux release 7.6.1810 (Core)
> CPU: Intel Xeon Gold 6132
> Tesla V100
> Cuda: 10.1
> Driver: 418.40.04
>
> Output of "make check"
>
> 42/46 Test #42: regressiontests/complex .............***Failed  145.88 sec
>
> GROMACS:      gmx mdrun, version 2019.4
> Executable:   /gromacs/2019.4/gromacs-2019.4/build/bin/gmx
> Data prefix:  /gromacs/2019.4/gromacs-2019.4 (source tree)
> Working dir:  /gromacs/2019.4/gromacs-2019.4/build/tests/regressiontests-2019.4
> Command line:
>   gmx mdrun -h
>
> Thanx for Using GROMACS - Have a Nice Day
>
> Mdrun cannot use the requested (or automatic) number of ranks, retrying with 8.
>
> Abnormal return value for ' gmx mdrun    -nb cpu   -notunepme >mdrun.out 2>&1' was 1
> Retrying mdrun with better settings...
> Re-running orientation-restraints using CPU-based PME
> Re-running pull_geometry_angle using CPU-based PME
> Re-running pull_geometry_angle-axis using CPU-based PME
> Re-running pull_geometry_dihedral using CPU-based PME
>
> Abnormal return value for ' gmx mdrun       -notunepme >mdrun.out 2>&1' was -1
> FAILED. Check mdrun.out, md.log file(s) in swap_x for swap_x
>
> Abnormal return value for ' gmx mdrun       -notunepme >mdrun.out 2>&1' was -1
> FAILED. Check mdrun.out, md.log file(s) in swap_y for swap_y
>
> Abnormal return value for ' gmx mdrun       -notunepme >mdrun.out 2>&1' was -1
> FAILED. Check mdrun.out, md.log file(s) in swap_z for swap_z
> 3 out of 55 complex tests FAILED
>
> From the following directory:
>
> /gromacs/2019.4/gromacs-2019.4/build/tests/regressiontests-2019.4/complex/swap_x
> and I get the same errors for swap_y and swap_z
>
> == mdrun.out ==
>
> GROMACS:      gmx mdrun, version 2019.4
> Executable:   /gromacs/2019.4/gromacs-2019.4/build/bin/gmx
> Data prefix:  /gromacs/2019.4/gromacs-2019.4 (source tree)
> Working dir:  /gromacs/2019.4/gromacs-2019.4/build/tests/regressiontests-2019.4/complex/swap_x
> Command line:
>   gmx mdrun -notunepme
>
> Reading file topol.tpr, VERSION 2019.4 (single precision)
> Changing nstlist from 10 to 50, rlist from 1.011 to 1.137
>
> -------------------------------------------------------
> Program:     gmx mdrun, version 2019.4
> Source file: src/gromacs/domdec/domdec_setup.cpp (line 764)
> MPI rank:    0 (out of 14)
>
> Fatal error:
> The number of ranks you selected (14) contains a large prime factor 7. In most
> cases this will lead to bad performance. Choose a number with smaller prime
> factors or set the decomposition (option -dd) manually.
>
> For more information and tips for troubleshooting, please check the GROMACS
> website at http://www.gromacs.org/Documentation/Errors
> -------------------------------------------------------
>
> == md.log ==
>
> Changing nstlist from 10 to 50, rlist from 1.011 to 1.137
>
> Initializing Domain Decomposition on 14 ranks
> Dynamic load balancing: locked
> Minimum cell size due to atom displacement: 0.692 nm
> Initial maximum distances in bonded interactions:
>     two-body bonded interactions: 0.403 nm, Exclusion, atoms 184 187
>   multi-body bonded interactions: 0.403 nm, Ryckaert-Bell., atoms 184 187
> Minimum cell size due to bonded interactions: 0.443 nm
> Maximum distance for 3 constraints, at 120 deg. angles, all-trans: 0.459 nm
> Estimated maximum distance required for P-LINCS: 0.459 nm
>
> -------------------------------------------------------
> Program:     gmx mdrun, version 2019.4
> Source file: src/gromacs/domdec/domdec_setup.cpp (line 764)
> MPI rank:    0 (out of 14)
>
> Fatal error:
> The number of ranks you selected (14) contains a large prime factor 7. In most
> cases this will lead to bad performance. Choose a number with smaller prime
> factors or set the decomposition (option -dd) manually.
>
> For more information and tips for troubleshooting, please check the GROMACS
> website at http://www.gromacs.org/Documentation/Errors
> -------------------------------------------------------