[gmx-developers] Software inconsistency error: pme_loadbal_do called at an interval != nstlist

Berk Hess hess at kth.se
Sun May 29 13:54:53 CEST 2016


Hi,

This issue is triggered when restarting from a checkpoint file that was not 
written at a step where step % nstlist == 0.
A one-line fix was merged into release-5.1 in March:
https://gerrit.gromacs.org/#/c/5721/
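
For reference, below is a minimal, self-contained C++ sketch of the kind of
interval check that fires here. This is not the actual GROMACS source; the
struct and function names are made up for illustration. The idea is simply
that PME load balancing expects to be called exactly every nstlist steps, so
a checkpoint written at a step that is not a multiple of nstlist breaks that
assumption after a restart.

#include <cstdint>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for the real load-balancing bookkeeping.
struct LoadBalState
{
    std::int64_t step_prev; // step of the previous load-balancing call
};

// Illustrative consistency check: the call interval must be exactly nstlist,
// otherwise we hit the "software inconsistency" condition from the log below.
void loadbal_check(LoadBalState* lb, std::int64_t step, int nstlist)
{
    if (step - lb->step_prev != nstlist)
    {
        std::fprintf(stderr,
                     "Software inconsistency error: "
                     "pme_loadbal_do called at an interval != nstlist\n");
        std::exit(1);
    }
    lb->step_prev = step;
}

int main()
{
    const int nstlist = 40; // the value mdrun chose in the log below

    // Aligned case: calls exactly nstlist steps apart pass the check.
    LoadBalState ok{268830920};
    loadbal_check(&ok, 268830960, nstlist);

    // Restart case: the checkpoint step 268830925 is not a multiple of 40
    // (268830925 % 40 == 5), so the first tuning call after the restart can
    // land at an interval != nstlist; here the interval is 35 and we abort.
    LoadBalState bad{268830925};
    loadbal_check(&bad, 268830960, nstlist);
    return 0;
}

The real check lives in src/gromacs/ewald/pme-load-balancing.cpp (line 947 in
5.1.2, as the log below shows); the exact one-line change is in the Gerrit
review linked above.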

You can apply this fix, or run with -notunepme. Alternatively, run briefly 
with -notunepme and press Ctrl-C once to get a new checkpoint file written at 
a step where step % nstlist == 0, and continue from that.

Cheers,

Berk

On 05/29/2016 01:17 AM, Yorquant Wang wrote:
> Dear All,
>
> I recently encountered the error "Software inconsistency error: 
> pme_loadbal_do called at an interval != nstlist" when trying to restart 
> one of my jobs.
> I used GROMACS 5.1.2 on K80 GPU nodes with Haswell CPUs.
>
> Does anyone know how to solve this error?
> Thank you very much!
>
> cheers
> Yukun Wang
>
> Below is the output log file:
>
>                  :-) GROMACS - gmx mdrun, VERSION 5.1.2 (-:
>
>                           GROMACS is written by:
>    Emile Apol      Rossen Apostolov    Herman J.C. Berendsen    Par Bjelkmar
>  Aldert van Buuren    Rudi van Drunen    Anton Feenstra    Sebastian Fritsch
>  Gerrit Groenhof    Christoph Junghans    Anca Hamuraru    Vincent Hindriksen
>  Dimitrios Karkoulis    Peter Kasson    Jiri Kraus    Carsten Kutzner
>   Per Larsson    Justin A. Lemkul    Magnus Lundborg    Pieter Meulenhoff
>  Erik Marklund    Teemu Murtola    Szilard Pall    Sander Pronk
>  Roland Schulz    Alexey Shvetsov    Michael Shirts    Alfons Sijbers
>  Peter Tieleman    Teemu Virolainen    Christian Wennberg    Maarten Wolf
>                          and the project leaders:
>       Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel
>
> Copyright (c) 1991-2000, University of Groningen, The Netherlands.
> Copyright (c) 2001-2015, The GROMACS development team at
> Uppsala University, Stockholm University and
> the Royal Institute of Technology, Sweden.
> check out http://www.gromacs.org for more information.
>
> GROMACS is free software; you can redistribute it and/or modify it
> under the terms of the GNU Lesser General Public License
> as published by the Free Software Foundation; either version 2.1
> of the License, or (at your option) any later version.
>
> GROMACS:      gmx mdrun, VERSION 5.1.2
> Executable:   /lustre/usr/gromacs/5.1-icc15-impi5-gpu/bin/gmx_mpi
> Data prefix:  /lustre/usr/gromacs/5.1-icc15-impi5-gpu
> Command line:
> gmx_mpi mdrun -deffnm 200DPPC-16-WT_COO-_150mV_2 -append -cpi -v 
> -resethway -noconfout
>
> No previous checkpoint file present, assuming this is a new run.
>
> Number of logical cores detected (16) does not match the number reported by OpenMP (8).
> Consider setting the launch configuration manually!
>
> Running on 1 node with total 16 cores, 16 logical cores, 2 compatible GPUs
> Hardware detected on host gpu35 (the node of MPI rank 0):
> CPU info:
>   Vendor: GenuineIntel
>   Brand:  Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
>   SIMD instructions most likely to fit this hardware: AVX_256
>   SIMD instructions selected at GROMACS compile time: AVX_256
> GPU info:
>   Number of GPUs detected: 2
>   #0: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible
>   #1: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible
>
> Reading file 200DPPC-16-WT_COO-_150mV_2.tpr, VERSION 5.0.4 (single precision)
> Note: file tpx version 100, software tpx version 103
> Changing nstlist from 25 to 40, rlist from 1.038 to 1.08
>
> The number of OpenMP threads was set by environment variable OMP_NUM_THREADS to 8
> Using 2 MPI processes
> Using 8 OpenMP threads per MPI process
>
> On host gpu35 2 compatible GPUs are present, with IDs 0,1
> On host gpu35 2 GPUs auto-selected for this run.
> Mapping of GPU IDs to the 2 PP ranks in this node: 0,1
>
>
> NOTE: GROMACS was configured without NVML support hence it can not exploit
>     application clocks of the detected Tesla K20m GPU to improve performance.
>     Recompile with the NVML library (compatible with the driver used) or set
>     application clocks manually.
>
>
> Non-default thread affinity set probably by the OpenMP library,
> disabling internal thread affinity
>
> WARNING: This run will generate roughly 38667 Mb of data
>
>
> NOTE: DLB will not turn on during the first phase of PME tuning
>
> starting mdrun '200dppc_membrane_16Maculatin-WT-COO-'
> 10000000000 steps, 20000000.0 ps (continuing from step 268830925, 537661.8 ps).
>
> -------------------------------------------------------
> Program gmx mdrun, VERSION 5.1.2
> Source code file: /home/rpm/rpmbuild/BUILD/gromacs-5.1.2/src/gromacs/ewald/pme-load-balancing.cpp, line: 947
>
> Software inconsistency error:
> pme_loadbal_do called at an interval != nstlist
> For more information and tips for troubleshooting, please check the GROMACS
> website at http://www.gromacs.org/Documentation/Errors
> -------------------------------------------------------
>
>
> -------------------------------------------------------
> Program gmx mdrun, VERSION 5.1.2
> Source code file: /home/rpm/rpmbuild/BUILD/gromacs-5.1.2/src/gromacs/ewald/pme-load-balancing.cpp, line: 947
>
> Software inconsistency error:
> pme_loadbal_do called at an interval != nstlist
> For more information and tips for troubleshooting, please check the GROMACS
> website at http://www.gromacs.org/Documentation/Errors
> -------------------------------------------------------
>
> Halting parallel program gmx mdrun on rank 1 out of 2
> Halting parallel program gmx mdrun on rank 0 out of 2
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> srun: error: gpu35: tasks 0-1: Exited with exit code 1
>
>
> -- 
> Yukun Wang
> Postdoc
> Institute for NanoBioTechnology at The Johns Hopkins University
> Cell phone: +1 (443) 509 2191
> Baltimore, MD USA
>
>


