[gmx-developers] Software inconsistency error: pme_loadbal_do called at an interval != nstlist
Berk Hess
hess at kth.se
Sun May 29 13:54:53 CEST 2016
Hi,
This issue is triggered when restarting from a checkpoint file that was
not written at a step with step % nstlist == 0.
A one-line fix was merged into release-5.1 in March:
https://gerrit.gromacs.org/#/c/5721/
You can apply this fix, or run with -notunepme. Alternatively, run with
-notunepme and press Ctrl-C once to get a new checkpoint file at a step
with step % nstlist == 0, and continue from that.
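
As a rough illustration of the step % nstlist condition, here is a minimal
sketch that uses the numbers from the quoted log below: the run continues
from step 268830925, and mdrun reports changing nstlist from 25 to 40. The
helper name is_aligned is hypothetical and not part of GROMACS. That the
change from 25 to 40 is what caused the misalignment in this run is an
assumption drawn from the log.

```python
def is_aligned(step: int, nstlist: int) -> bool:
    """Return True if `step` is a multiple of `nstlist`.

    Hypothetical helper for illustration only; not a GROMACS function.
    """
    return step % nstlist == 0


# "continuing from step 268830925" in the quoted log
checkpoint_step = 268830925

# 25 is the nstlist read from the .tpr file, 40 is the value mdrun
# switched to ("Changing nstlist from 25 to 40" in the quoted log).
for nstlist in (25, 40):
    remainder = checkpoint_step % nstlist
    print(f"nstlist={nstlist}: remainder={remainder}, "
          f"aligned={is_aligned(checkpoint_step, nstlist)}")
```

With these numbers the checkpoint step is a multiple of 25 but leaves a
remainder of 5 when divided by 40, so it is not aligned with the nstlist
that mdrun actually uses.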
Cheers,
Berk
On 05/29/2016 01:17 AM, Yorquant Wang wrote:
> Dear All,
>
> I recently ran into an error, "Software inconsistency error: pme_loadbal_do
> called at an interval != nstlist", when I tried to restart one of my jobs.
> I used GMX 5.1.2 on K80 GPU nodes with Haswell CPUs.
>
> Does anyone know how to solve this error?
> Thank you very much!
>
> cheers
> Yukun Wang
>
> Below is the output log file:
>
> :-) GROMACS - gmx mdrun, VERSION 5.1.2 (-:
>
> GROMACS is written by:
> Emile Apol Rossen Apostolov Herman J.C. Berendsen Par Bjelkmar
> Aldert van Buuren Rudi van Drunen Anton Feenstra Sebastian Fritsch
> Gerrit Groenhof Christoph Junghans Anca Hamuraru Vincent Hindriksen
> Dimitrios Karkoulis Peter Kasson Jiri Kraus Carsten Kutzner
> Per Larsson Justin A. Lemkul Magnus Lundborg Pieter Meulenhoff
> Erik Marklund Teemu Murtola Szilard Pall Sander Pronk
> Roland Schulz Alexey Shvetsov Michael Shirts Alfons Sijbers
> Peter Tieleman Teemu Virolainen Christian Wennberg Maarten Wolf
> and the project leaders:
> Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel
>
> Copyright (c) 1991-2000, University of Groningen, The Netherlands.
> Copyright (c) 2001-2015, The GROMACS development team at
> Uppsala University, Stockholm University and
> the Royal Institute of Technology, Sweden.
> check out http://www.gromacs.org for more information.
>
> GROMACS is free software; you can redistribute it and/or modify it
> under the terms of the GNU Lesser General Public License
> as published by the Free Software Foundation; either version 2.1
> of the License, or (at your option) any later version.
>
> GROMACS: gmx mdrun, VERSION 5.1.2
> Executable: /lustre/usr/gromacs/5.1-icc15-impi5-gpu/bin/gmx_mpi
> Data prefix: /lustre/usr/gromacs/5.1-icc15-impi5-gpu
> Command line:
> gmx_mpi mdrun -deffnm 200DPPC-16-WT_COO-_150mV_2 -append -cpi -v
> -resethway -noconfout
>
> No previous checkpoint file present, assuming this is a new run.
>
> Number of logical cores detected (16) does not match the number
> reported by OpenMP (8).
> Consider setting the launch configuration manually!
>
> Running on 1 node with total 16 cores, 16 logical cores, 2 compatible GPUs
> Hardware detected on host gpu35 (the node of MPI rank 0):
> CPU info:
> Vendor: GenuineIntel
> Brand: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
> SIMD instructions most likely to fit this hardware: AVX_256
> SIMD instructions selected at GROMACS compile time: AVX_256
> GPU info:
> Number of GPUs detected: 2
> #0: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible
> #1: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible
>
> Reading file 200DPPC-16-WT_COO-_150mV_2.tpr, VERSION 5.0.4 (single
> precision)
> Note: file tpx version 100, software tpx version 103
> Changing nstlist from 25 to 40, rlist from 1.038 to 1.08
>
> The number of OpenMP threads was set by environment variable
> OMP_NUM_THREADS to 8
> Using 2 MPI processes
> Using 8 OpenMP threads per MPI process
>
> On host gpu35 2 compatible GPUs are present, with IDs 0,1
> On host gpu35 2 GPUs auto-selected for this run.
> Mapping of GPU IDs to the 2 PP ranks in this node: 0,1
>
>
> NOTE: GROMACS was configured without NVML support hence it can not exploit
> application clocks of the detected Tesla K20m GPU to improve
> performance.
> Recompile with the NVML library (compatible with the driver used)
> or set application clocks manually.
>
>
> Non-default thread affinity set probably by the OpenMP library,
> disabling internal thread affinity
>
> WARNING: This run will generate roughly 38667 Mb of data
>
>
> NOTE: DLB will not turn on during the first phase of PME tuning
>
> starting mdrun '200dppc_membrane_16Maculatin-WT-COO-'
> 10000000000 steps, 20000000.0 ps (continuing from step 268830925,
> 537661.8 ps).
>
> -------------------------------------------------------
> Program gmx mdrun, VERSION 5.1.2
> Source code file:
> /home/rpm/rpmbuild/BUILD/gromacs-5.1.2/src/gromacs/ewald/pme-load-balancing.cpp,
> line: 947
>
> Software inconsistency error:
> pme_loadbal_do called at an interval != nstlist
> For more information and tips for troubleshooting, please check the
> GROMACS
> website at http://www.gromacs.org/Documentation/Errors
> -------------------------------------------------------
>
>
> -------------------------------------------------------
> Program gmx mdrun, VERSION 5.1.2
> Source code file:
> /home/rpm/rpmbuild/BUILD/gromacs-5.1.2/src/gromacs/ewald/pme-load-balancing.cpp,
> line: 947
>
> Software inconsistency error:
> pme_loadbal_do called at an interval != nstlist
> For more information and tips for troubleshooting, please check the
> GROMACS
> website at http://www.gromacs.org/Documentation/Errors
> -------------------------------------------------------
>
> Halting parallel program gmx mdrun on rank 1 out of 2
> Halting parallel program gmx mdrun on rank 0 out of 2
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> srun: error: gpu35: tasks 0-1: Exited with exit code 1
>
>
> --
> Yukun Wang
> Postdoc
> Institute for NanoBioTechnology at The Johns Hopkins University
> Cell phone: +1 (443) 509 2191
> Baltimore, MD USA
>
>