[gmx-users] Issue with regression test.

Szilárd Páll pall.szilard at gmail.com
Tue Aug 7 14:42:25 CEST 2018


Hi,

Can you share the directory of the failed test, i.e.
regressiontests/complex/nbnxn_vsite?
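One way to share it, assuming the path above is relative to the directory the
tests were run from, is to pack it up first, e.g.

    # example archive name; adjust the path to wherever the tests actually ran
    tar czf nbnxn_vsite.tar.gz regressiontests/complex/nbnxn_vsite
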
Can you try running the regression tests manually using 1/2/4 ranks, e.g.
perl gmxtest.pl complex -nt 1
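For example, run from the top of the regressiontests directory (assuming the
installed gmx is in your PATH), trying each rank count in turn:

    # same test set, run with 1, 2 and 4 ranks
    perl gmxtest.pl complex -nt 1
    perl gmxtest.pl complex -nt 2
    perl gmxtest.pl complex -nt 4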

--
Szilárd


On Wed, Aug 1, 2018 at 12:49 PM Raymond Arter <raymondarter at gmail.com>
wrote:

> Dear All,
>
> I'm building Gromacs 2018.2 and I've run into a small issue that I would
> like feedback on.
>
> One of the regression tests fails, with the following written to the log
> file.
>
> ----
> Mdrun cannot use the requested (or automatic) number of ranks, retrying
> with 8.
>
> Abnormal return value for ' gmx mdrun    -nb cpu   -notunepme >mdrun.out
> 2>&1' was 1
> Retrying mdrun with better settings...
>
> Abnormal return value for ' gmx mdrun -ntmpi 12      -notunepme >mdrun.out
> 2>&1' was 1
> Retrying mdrun with better settings...
>
> Abnormal return value for ' gmx mdrun -ntmpi 6      -notunepme >mdrun.out
> 2>&1' was -1
> FAILED. Check mdrun.out, md.log file(s) in nbnxn_vsite for nbnxn_vsite
> Re-running orientation-restraints using CPU-based PME
> Re-running pull_geometry_angle using CPU-based PME
> Re-running pull_geometry_angle-axis using CPU-based PME
> Re-running pull_geometry_dihedral using CPU-based PME
> ----
>
> However, on the advice of a colleague, I ran the following command in the
> test directory:
>
>     gmx mdrun -ntmpi 4 -notunepme
>
> And got the following result:
>
> ----
> Reading file topol.tpr, VERSION 2018.2 (single precision)
> Non-default thread affinity set, disabling internal thread affinity
> Can not increase nstlist because verlet-buffer-tolerance is not set or used
>
> Using 4 MPI threads
> Using 3 OpenMP threads per tMPI thread
>
> On host ****** 4 GPUs auto-selected for this run.
> Mapping of GPU IDs to the 4 GPU tasks in the 4 ranks on this node:
>   PP:0,PP:1,PP:2,PP:3
>
> Back Off! I just backed up traj.trr to ./#traj.trr.1#
>
> Back Off! I just backed up ener.edr to ./#ener.edr.1#
> starting mdrun 'Protein'
> 20 steps,      0.1 ps.
>
> step 20 Turning on dynamic load balancing, because the performance loss due
> to load imbalance is 3.3 %.
>
> Writing final coordinates.
>
> Back Off! I just backed up confout.gro to ./#confout.gro.1#
>
>  Dynamic load balancing report:
>  DLB was turned on during the run due to measured imbalance.
>  Average load imbalance: 13.0%.
>  The balanceable part of the MD step is 25%, load imbalance is computed
> from this.
>  Part of the total run time spent waiting due to load imbalance: 3.3%.
>  Steps where the load balancing was limited by -rdd, -rcon and/or -dds: X 0
> % Y 0 %
>
> NOTE: 6 % of the run time was spent in domain decomposition,
>       17 % of the run time was spent in pair search,
>       you might want to increase nstlist (this has no effect on accuracy)
>
>                Core t (s)   Wall t (s)        (%)
>        Time:        1.932        0.161     1200.0
>                  (ns/day)    (hour/ns)
> Performance:       56.337        0.426
> ----
>
> Is this just a problem with the regression test (meaning the Gromacs build
> itself is fine), or is there a problem with the build I have done?
>
> Thanks in advance.
>
> R.

