[gmx-users] Assertion failed with single precision only
Prithwish Nandi
prithwish.nandi at ichec.ie
Fri Feb 1 13:41:01 CET 2019
Hi,
I was trying to run an MD simulation in both single and double precision. The double-precision (dp) run is okay, but the single-precision (sp) run terminated with an ‘Assertion failed’ message.
Any help resolving this is welcome.
Best regards,
Prithwish
------------------------------------------------------------
The screen output is:
------------------------------------------------------------
comm-mode angular will give incorrect results when the comm group partially crosses a periodic boundary
Using 40 MPI processes
Using 1 OpenMP thread per MPI process
Non-default thread affinity set probably by the OpenMP library,
disabling internal thread affinity
WARNING: This run will generate roughly 11567 Mb of data
Program: gmx mdrun, version 2018.4
Source file: src/gromacs/mdlib/vcm.cpp (line 394)
Function: void do_stopcm_grp(const t_mdatoms &, float (*)[3], float (*)[3], const t_vcm &)
MPI rank: 5 (out of 40)
Assertion failed:
Condition: x
Need x to compute angular momentum correction
------------------------------------------------------------
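For context, the failing condition looks like a plain null check on the coordinate array passed to do_stopcm_grp() in vcm.cpp: with comm-mode = Angular, the COM-removal code needs the positions x (not only the velocities) to compute the angular momentum correction, and in this single-precision build it apparently receives a null pointer. Below is a minimal sketch of the kind of guard that seems to be firing, assuming GROMACS's GMX_RELEASE_ASSERT macro; this is an illustration, not the actual vcm.cpp source:

#include "gromacs/utility/gmxassert.h"

/* Illustration only: inside do_stopcm_grp(), before the angular-momentum
   branch, the coordinates must be present. A null x would trip a release
   assertion with exactly the condition and message shown above. */
static void requireCoordinatesForAngularCom(const float (*x)[3])
{
    GMX_RELEASE_ASSERT(x, "Need x to compute angular momentum correction");
}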
The log file is as follows:
------------------------------------------------------------
C compiler flags: -xCORE-AVX512 -qopt-zmm-usage=high -mkl=sequential -std=gnu99 -ip -funroll-all-loops -alias-const -ansi-alias -no-prec-div -fimf-domain-exclusion=14 -qoverride-limits
C++ compiler flags: -xCORE-AVX512 -qopt-zmm-usage=high -mkl=sequential -std=c++11 -ip -funroll-all-loops -alias-const -ansi-alias -no-prec-div -fimf-domain-exclusion=14 -qoverride-limits
Running on 1 node with total 40 cores, 40 logical cores
Hardware detected on host n12 (the node of MPI rank 0):
CPU info:
Vendor: Intel
Brand: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
Family: 6 Model: 85 Stepping: 4
Features: aes apic avx avx2 avx512f avx512cd avx512bw avx512vl clfsh cmov cx8 cx16 f16c fma hle htt intel lahf mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp rtm sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
Number of AVX-512 FMA units: 2
Hardware topology: Full, with devices
Sockets, cores, and logical processors:
Socket 0: [ 0] [ 1] [ 2] [ 3] [ 4] [ 5] [ 6] [ 7] [ 8] [ 9] [ 10] [ 11] [ 12] [ 13] [ 14] [ 15] [ 16] [ 17] [ 18] [ 19]
Socket 1: [ 20] [ 21] [ 22] [ 23] [ 24] [ 25] [ 26] [ 27] [ 28] [ 29] [ 30] [ 31] [ 32] [ 33] [ 34] [ 35] [ 36] [ 37] [ 38] [ 39]
Numa nodes:
Node 0 (101696126976 bytes mem): 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
Node 1 (103079215104 bytes mem): 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
Latency:
0 1
0 1.00 2.10
1 2.10 1.00
Caches:
L1: 32768 bytes, linesize 64 bytes, assoc. 8, shared 1 ways
L2: 1048576 bytes, linesize 64 bytes, assoc. 16, shared 1 ways
L3: 28835840 bytes, linesize 64 bytes, assoc. 11, shared 20 ways
PCI devices:
0000:00:11.5 Id: 8086:a1d2 Class: 0x0106 Numa: 0
0000:00:17.0 Id: 8086:a182 Class: 0x0106 Numa: 0
0000:02:00.0 Id: 1a03:2000 Class: 0x0300 Numa: 0
0000:18:00.0 Id: 8086:1563 Class: 0x0200 Numa: 0
0000:18:00.1 Id: 8086:1563 Class: 0x0200 Numa: 0
0000:5e:00.0 Id: 8086:24f0 Class: 0x0208 Numa: 0
Input Parameters:
integrator = md
tinit = 0
dt = 0.00333333
nsteps = 300000000
init-step = 0
simulation-part = 1
comm-mode = Angular
nstcomm = 3000
bd-fric = 0
ld-seed = 3420309074
emtol = 10
emstep = 0.01
niter = 20
fcstep = 0
nstcgsteep = 1000
nbfgscorr = 10
rtpi = 0.05
nstxout = 3000
nstvout = 0
nstfout = 0
nstlog = 3000
nstcalcenergy = 3000
nstenergy = 3000
nstxout-compressed = 0
compressed-x-precision = 1000
cutoff-scheme = Verlet
nstlist = 10
ns-type = Grid
pbc = xyz
periodic-molecules = false
verlet-buffer-tolerance = 0.005
rlist = 0.8
coulombtype = PME
coulomb-modifier = Potential-shift
rcoulomb-switch = 0
rcoulomb = 0.8
epsilon-r = 1
epsilon-rf = inf
vdw-type = Cut-off
vdw-modifier = Potential-shift
rvdw-switch = 0
rvdw = 0.8
DispCorr = No
table-extension = 1
fourierspacing = 0
fourier-nx = 8
fourier-ny = 8
fourier-nz = 8
pme-order = 3
ewald-rtol = 1e-05
ewald-rtol-lj = 0.001
lj-pme-comb-rule = Geometric
ewald-geometry = 0
epsilon-surface = 0
implicit-solvent = No
gb-algorithm = Still
nstgbradii = 1
rgbradii = 1
gb-epsilon-solvent = 80
gb-saltconc = 0
gb-obc-alpha = 1
gb-obc-beta = 0.8
gb-obc-gamma = 4.85
gb-dielectric-offset = 0.009
sa-algorithm = Ace-approximation
sa-surface-tension = 2.05016
tcoupl = Nose-Hoover
nsttcouple = 10
nh-chain-length = 1
print-nose-hoover-chain-variables = false
pcoupl = No
pcoupltype = Isotropic
nstpcouple = -1
tau-p = 1
compressibility (3x3):
compressibility[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
compressibility[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
compressibility[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
ref-p (3x3):
ref-p[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
ref-p[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
ref-p[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
refcoord-scaling = No
posres-com (3):
posres-com[0]= 0.00000e+00
posres-com[1]= 0.00000e+00
posres-com[2]= 0.00000e+00
posres-comB (3):
posres-comB[0]= 0.00000e+00
posres-comB[1]= 0.00000e+00
posres-comB[2]= 0.00000e+00
QMMM = false
QMconstraints = 0
QMMMscheme = 0
MMChargeScaleFactor = 1
qm-opts:
ngQM = 0
constraint-algorithm = Lincs
continuation = false
Shake-SOR = false
shake-tol = 0.0001
lincs-order = 4
lincs-iter = 2
lincs-warnangle = 30
nwall = 0
wall-type = 9-3
wall-r-linpot = -1
wall-atomtype[0] = -1
wall-atomtype[1] = -1
wall-density[0] = 0
wall-density[1] = 0
wall-ewald-zfac = 3
pull = false
awh = false
rotation = false
interactiveMD = false
disre = No
disre-weighting = Conservative
disre-mixed = false
dr-fc = 1000
dr-tau = 0
nstdisreout = 100
orire-fc = 0
orire-tau = 0
nstorireout = 100
free-energy = no
cos-acceleration = 0
deform (3x3):
deform[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
simulated-tempering = false
swapcoords = no
userint1 = 0
userint2 = 0
userint3 = 0
userint4 = 0
userreal1 = 0
userreal2 = 0
userreal3 = 0
userreal4 = 0
applied-forces:
electric-field:
grpopts:
nrdf: 14994
ref-t: 150
tau-t: 1
annealing: No
annealing-npoints: 0
acc: 0 0 0
nfreeze: N N N
energygrp-flags[ 0]: 0
Changing nstlist from 10 to 100, rlist from 0.8 to 0.8
Initializing Domain Decomposition on 40 ranks
Dynamic load balancing: off
Minimum cell size due to atom displacement: 0.761 nm
Initial maximum inter charge-group distances:
two-body bonded interactions: 0.153 nm, Exclusion, atoms 8854 8855
Minimum cell size due to bonded interactions: 0.000 nm
Guess for relative PME load: 0.81
Using 0 separate PME ranks, as guessed by mdrun
Scaling the initial minimum size with 1/0.8 (option -dds) = 1.25
Optimizing the DD grid for 40 cells with a minimum initial size of 0.951 nm
The maximum allowed number of cells is: X 25 Y 25 Z 25
Domain decomposition grid 4 x 5 x 2, separate PME ranks 0
PME domain decomposition: 4 x 10 x 1
comm-mode angular will give incorrect results when the comm group partially crosses a periodic boundary
Domain decomposition rank 0, coordinates 0 0 0
The initial number of communication pulses is: X 1 Y 1 Z 1
The initial domain decomposition cell size is: X 6.12 nm Y 4.90 nm Z 12.25 nm
The maximum allowed distance for charge groups involved in interactions is:
non-bonded interactions 0.800 nm
two-body bonded interactions (-rdd) 0.800 nm
multi-body bonded interactions (-rdd) 0.800 nm
virtual site constructions (-rcon) 4.900 nm
atoms separated by up to 5 constraints (-rcon) 4.900 nm
When dynamic load balancing gets turned on, these settings will change to:
The maximum number of communication pulses is: X 1 Y 1 Z 1
The minimum size for domain decomposition cells is 0.800 nm
The requested allowed shrink of DD cells (option -dds) is: 0.80
The allowed shrink of domain decomposition cells is: X 0.13 Y 0.16 Z 0.07
The maximum allowed distance for charge groups involved in interactions is:
non-bonded interactions 0.800 nm
two-body bonded interactions (-rdd) 0.800 nm
multi-body bonded interactions (-rdd) 0.800 nm
virtual site constructions (-rcon) 0.800 nm
Using 40 MPI processes
Using 1 OpenMP thread per MPI process
Non-default thread affinity set probably by the OpenMP library,
disabling internal thread affinity
System total charge: 0.000
Will do PME sum in reciprocal space for electrostatic interactions.
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L. G. Pedersen
A smooth particle mesh Ewald method
J. Chem. Phys. 103 (1995) pp. 8577-8592
-------- -------- --- Thank You --- -------- --------
Using a Gaussian width (1/beta) of 0.25613 nm for Ewald
Potential shift: LJ r^-12: -1.455e+01 r^-6: -3.815e+00, Ewald -1.250e-05
Initialized non-bonded Ewald correction tables, spacing: 8.35e-04 size: 960
Using SIMD 4x8 nonbonded short-range kernels
Using a 4x8 pair-list setup:
updated every 100 steps, buffer 0.000 nm, rlist 0.800 nm
At tolerance 0.005 kJ/mol/ps per atom, equivalent classical 1x1 list would be:
updated every 100 steps, buffer 0.000 nm, rlist 0.800 nm
Using geometric Lennard-Jones combination rule
Removing pbc first time
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
S. Miyamoto and P. A. Kollman
SETTLE: An Analytical Version of the SHAKE and RATTLE Algorithms for Rigid
Water Models
J. Comp. Chem. 13 (1992) pp. 952-962
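For reference, these are the COM-removal settings that route the run into the failing code path, as they would look in the .mdp file (reconstructed from the parameter dump above; the comm-grps line is my assumption, it does not appear in the log):

; COM motion removal as used in the failing run (reconstructed sketch)
comm-mode = Angular   ; remove angular as well as linear COM motion
nstcomm   = 3000      ; remove COM motion every 3000 steps
; comm-grps           ; assumed to default to the whole system

If the angular removal turns out not to be essential for this system, I could presumably switch to comm-mode = Linear, which should avoid the angular-momentum correction (and hence this assertion) entirely, though it would no longer remove overall rotation.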