[gmx-users] Issue with regression test.
Szilárd Páll
pall.szilard at gmail.com
Tue Aug 7 14:42:25 CEST 2018
Hi,
Can you share the directory of the failed test, i.e.
regressiontests/complex/nbnxn_vsite?
Can you also try running the regression tests manually with 1, 2 and 4 ranks, e.g.
perl gmxtest.pl complex -nt 1
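
To cover all three counts in one go, something like the following should work.
This is only a rough sketch: it assumes you run it from the top of the
regressiontests directory, and rank_commands and RUN_TESTS are names made up
for this example, not part of gmxtest.pl. By default it only prints the
commands; set RUN_TESTS=1 to actually run them.

```shell
#!/bin/sh
# Sketch: run the "complex" regression tests with 1, 2 and 4 ranks.
# Assumption: the current directory is the regressiontests directory,
# where gmxtest.pl lives.
# rank_commands and RUN_TESTS are hypothetical names used only here.

# Print one gmxtest.pl command line per count.
rank_commands() {
    for n in 1 2 4; do
        echo "perl gmxtest.pl complex -nt $n"
    done
}

if [ "${RUN_TESTS:-0}" = "1" ]; then
    # Run each command in turn and report which count fails, if any.
    rank_commands | while read -r cmd; do
        echo "== $cmd"
        # stdin is redirected so the command cannot consume the command list.
        $cmd </dev/null || echo "FAILED: $cmd"
    done
else
    # Default: dry run, just show what would be executed.
    rank_commands
fi
```

If only some of the counts fail, please mention which ones when you send
the nbnxn_vsite directory.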
--
Szilárd
On Wed, Aug 1, 2018 at 12:49 PM Raymond Arter <raymondarter at gmail.com>
wrote:
> Dear All,
>
> I'm building Gromacs 2018.2 and I've run into a small issue that I would
> like feedback on.
>
> One of the regression tests fails with the following written to the log
> file:
>
> ----
> Mdrun cannot use the requested (or automatic) number of ranks, retrying
> with 8.
>
> Abnormal return value for ' gmx mdrun -nb cpu -notunepme >mdrun.out
> 2>&1' was 1
> Retrying mdrun with better settings...
>
> Abnormal return value for ' gmx mdrun -ntmpi 12 -notunepme >mdrun.out
> 2>&1' was 1
> Retrying mdrun with better settings...
>
> Abnormal return value for ' gmx mdrun -ntmpi 6 -notunepme >mdrun.out
> 2>&1' was -1
> FAILED. Check mdrun.out, md.log file(s) in nbnxn_vsite for nbnxn_vsite
> Re-running orientation-restraints using CPU-based PME
> Re-running pull_geometry_angle using CPU-based PME
> Re-running pull_geometry_angle-axis using CPU-based PME
> Re-running pull_geometry_dihedral using CPU-based PME
> ----
>
> However, on the advice of a colleague I ran the following command in the
> directory:
>
> gmx mdrun -ntmpi 4 -notunepme
>
> And got the following result:
>
> ----
> Reading file topol.tpr, VERSION 2018.2 (single precision)
> Non-default thread affinity set, disabling internal thread affinity
> Can not increase nstlist because verlet-buffer-tolerance is not set or used
>
> Using 4 MPI threads
> Using 3 OpenMP threads per tMPI thread
>
> On host ****** 4 GPUs auto-selected for this run.
> Mapping of GPU IDs to the 4 GPU tasks in the 4 ranks on this node:
> PP:0,PP:1,PP:2,PP:3
>
> Back Off! I just backed up traj.trr to ./#traj.trr.1#
>
> Back Off! I just backed up ener.edr to ./#ener.edr.1#
> starting mdrun 'Protein'
> 20 steps, 0.1 ps.
>
> step 20 Turning on dynamic load balancing, because the performance loss due
> to load imbalance is 3.3 %.
>
> Writing final coordinates.
>
> Back Off! I just backed up confout.gro to ./#confout.gro.1#
>
> Dynamic load balancing report:
> DLB was turned on during the run due to measured imbalance.
> Average load imbalance: 13.0%.
> The balanceable part of the MD step is 25%, load imbalance is computed
> from this.
> Part of the total run time spent waiting due to load imbalance: 3.3%.
> Steps where the load balancing was limited by -rdd, -rcon and/or -dds: X 0
> % Y 0 %
>
> NOTE: 6 % of the run time was spent in domain decomposition,
> 17 % of the run time was spent in pair search,
> you might want to increase nstlist (this has no effect on accuracy)
>
>                Core t (s)   Wall t (s)        (%)
>        Time:        1.932        0.161     1200.0
>                  (ns/day)    (hour/ns)
> Performance:       56.337        0.426
> ----
>
> Is this just a problem with the regression test, with the Gromacs build
> itself being fine, or is there a problem with the build I have done?
>
> Thanks in advance.
>
> R.