[gmx-users] Trying to use cutoff electrostatics with MARTINI
Graham J.A.
J.A.Graham at soton.ac.uk
Tue Apr 11 12:09:17 CEST 2017
I'm preparing to run some benchmarks and want to compare the performance
of reaction-field and plain cut-off electrostatics with the MARTINI
force field.
I've downloaded the recommended MDP files from the MARTINI website, but
when I set coulombtype to cut-off in the MDP file, the performance tables
at the end of the log still list "NxN RF Elec. + LJ". The input parameter
section at the top of the log shows that GROMACS has recognised the
cut-off setting (coulombtype = Cut-off).
This happens with both GROMACS 2016.3 and 5.1.4. Am I
misunderstanding the log output, or is something going wrong?
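The mismatch described above can be spotted without reading the whole log by hand. The following is a minimal Python sketch, not a GROMACS tool: the function name summarize_electrostatics and both regular expressions are assumptions for illustration, based only on the line formats visible in the log in this post ("coulombtype = ..." in the input parameters and "NxN ... Elec. + LJ [..]" rows in the flops accounting).

```python
import re


def summarize_electrostatics(log_text):
    """Return (coulombtype, kernel_names) found in the text of an mdrun log.

    Hypothetical helper: the patterns are guesses based on the log format
    shown in this post and may need adjusting for other GROMACS versions.
    """
    coulombtype = None
    kernels = []
    for line in log_text.splitlines():
        # Input parameter section, e.g. "coulombtype = Cut-off"
        m = re.match(r"\s*coulombtype\s*=\s*(\S+)", line)
        if m:
            coulombtype = m.group(1)
            continue
        # Flops accounting rows, e.g. "NxN RF Elec. + LJ [F] 18066.7 ..."
        m = re.match(r"\s*(NxN .*?Elec\..*?)\s+\[", line)
        if m and m.group(1) not in kernels:
            kernels.append(m.group(1))
    return coulombtype, kernels


# Three lines copied from the log in this post.
sample = """\
coulombtype = Cut-off
 NxN RF Elec. + LJ [F] 18066.703168 686534.720 97.6
 NxN RF Elec. + LJ [V&F] 200.750272 10840.515 1.5
"""
ctype, kernels = summarize_electrostatics(sample)
print("requested:", ctype)
print("kernels reported:", kernels)
```

To run it on a real log, pass the file contents instead of the sample string, for example the npt_cut.log written by the -deffnm npt_cut run shown in this post.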
Example log file:
Log file opened on Tue Apr 11 10:59:20 2017
Host: smaug pid: 4670 rank ID: 0 number of ranks: 1
:-) GROMACS - gmx mdrun, 2016.3 (-:
GROMACS is written by:
Emile Apol Rossen Apostolov Herman J.C. Berendsen Par Bjelkmar
Aldert van Buuren Rudi van Drunen Anton Feenstra Gerrit Groenhof
Christoph Junghans Anca Hamuraru Vincent Hindriksen Dimitrios Karkoulis
Peter Kasson Jiri Kraus Carsten Kutzner Per Larsson
Justin A. Lemkul Magnus Lundborg Pieter Meulenhoff Erik Marklund
Teemu Murtola Szilard Pall Sander Pronk Roland Schulz
Alexey Shvetsov Michael Shirts Alfons Sijbers Peter Tieleman
Teemu Virolainen Christian Wennberg Maarten Wolf
and the project leaders:
Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2017, The GROMACS development team at
Uppsala University, Stockholm University and
the Royal Institute of Technology, Sweden.
check out http://www.gromacs.org for more information.
GROMACS is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation; either version 2.1
of the License, or (at your option) any later version.
GROMACS: gmx mdrun, version 2016.3
Executable: /usr/local/gromacs/2016.3-mpich/bin/gmx_mpi
Data prefix: /usr/local/gromacs/2016.3-mpich
Working dir: /home/james/gromacs/membranes/preequil/martini/popc/2048/test
Command line:
gmx_mpi mdrun -ntomp 2 -pin on -v -deffnm npt_cut -nsteps 1000
GROMACS version: 2016.3
Precision: single
Memory model: 64 bit
MPI library: MPI
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 32)
GPU support: CUDA
SIMD instructions: AVX2_256
FFT library: fftw-3.3.5
RDTSCP usage: enabled
TNG support: enabled
Hwloc support: hwloc-1.11.0
Tracing support: disabled
Built on: Mon 20 Mar 15:50:33 GMT 2017
Built by: james at smaug [CMAKE]
Build OS/arch: Linux 4.4.0-66-generic x86_64
Build CPU vendor: Intel
Build CPU brand: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
Build CPU family: 6 Model: 60 Stepping: 3
Build CPU features: aes apic avx avx2 clfsh cmov cx8 cx16 f16c fma hle htt lahf mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp rtm sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
C compiler: /usr/bin/mpicc.mpich GNU 5.4.0
C compiler flags: -march=core-avx2 -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
C++ compiler: /usr/bin/mpicxx.mpich GNU 5.4.0
C++ compiler flags: -march=core-avx2 -std=c++0x -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
CUDA compiler: /usr/local/cuda-8.0/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2016 NVIDIA Corporation;Built on Tue_Jan_10_13:22:03_CST_2017;Cuda compilation tools, release 8.0, V8.0.61
CUDA compiler flags:-gencode;arch=compute_20,code=sm_20;-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_60,code=compute_60;-gencode;arch=compute_61,code=compute_61;-use_fast_math;-D_FORCE_INLINES;;-Xcompiler;,-march=core-avx2,,,,,,;-Xcompiler;-O3,-DNDEBUG,-funroll-all-loops,-fexcess-precision=fast,,;
CUDA driver: 8.0
CUDA runtime: 8.0
Running on 1 node with total 4 cores, 8 logical cores, 1 compatible GPU
Hardware detected on host smaug (the node of MPI rank 0):
CPU info:
Vendor: Intel
Brand: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
Family: 6 Model: 60 Stepping: 3
Features: aes apic avx avx2 clfsh cmov cx8 cx16 f16c fma hle htt lahf mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp rtm sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
SIMD instructions most likely to fit this hardware: AVX2_256
SIMD instructions selected at GROMACS compile time: AVX2_256
Hardware topology: Full, with devices
Sockets, cores, and logical processors:
Socket 0: [ 0 4] [ 1 5] [ 2 6] [ 3 7]
Numa nodes:
Node 0 (33572290560 bytes mem): 0 1 2 3 4 5 6 7
0 1.00
L1: 32768 bytes, linesize 64 bytes, assoc. 8, shared 2 ways
L2: 262144 bytes, linesize 64 bytes, assoc. 8, shared 2 ways
L3: 8388608 bytes, linesize 64 bytes, assoc. 16, shared 8 ways
PCI devices:
0000:01:00.0 Id: 10de:1380 Class: 0x0300 Numa: 0
0000:00:02.0 Id: 8086:0412 Class: 0x0380 Numa: 0
0000:00:19.0 Id: 8086:153a Class: 0x0200 Numa: 0
0000:00:1f.2 Id: 8086:8c02 Class: 0x0106 Numa: 0
GPU info:
Number of GPUs detected: 1
#0: NVIDIA GeForce GTX 750 Ti, compute cap.: 5.0, ECC: no, stat: compatible
M. J. Abraham, T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess, E.
GROMACS: High performance molecular simulations through multi-level
parallelism from laptops to supercomputers
SoftwareX 1 (2015) pp. 19-25
-------- -------- --- Thank You --- -------- --------
S. Páll, M. J. Abraham, C. Kutzner, B. Hess, E. Lindahl
Tackling Exascale Software Challenges in Molecular Dynamics Simulations with
In S. Markidis & E. Laure (Eds.), Solving Software Challenges for Exascale 8759 (2015) pp. 3-27
-------- -------- --- Thank You --- -------- --------
S. Pronk, S. Páll, R. Schulz, P. Larsson, P. Bjelkmar, R. Apostolov, M. R.
Shirts, J. C. Smith, P. M. Kasson, D. van der Spoel, B. Hess, and E. Lindahl
GROMACS 4.5: a high-throughput and highly parallel open source molecular
simulation toolkit
Bioinformatics 29 (2013) pp. 845-54
-------- -------- --- Thank You --- -------- --------
B. Hess and C. Kutzner and D. van der Spoel and E. Lindahl
GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable
molecular simulation
J. Chem. Theory Comput. 4 (2008) pp. 435-447
-------- -------- --- Thank You --- -------- --------
D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and H. J. C.
GROMACS: Fast, Flexible and Free
J. Comp. Chem. 26 (2005) pp. 1701-1719
-------- -------- --- Thank You --- -------- --------
E. Lindahl and B. Hess and D. van der Spoel
GROMACS 3.0: A package for molecular simulation and trajectory analysis
J. Mol. Mod. 7 (2001) pp. 306-317
-------- -------- --- Thank You --- -------- --------
H. J. C. Berendsen, D. van der Spoel and R. van Drunen
GROMACS: A message-passing parallel molecular dynamics implementation
Comp. Phys. Comm. 91 (1995) pp. 43-56
-------- -------- --- Thank You --- -------- --------
Input Parameters:
integrator = md
tinit = 0
dt = 0.03
nsteps = 50000
init-step = 0
simulation-part = 1
comm-mode = Linear
nstcomm = 100
bd-fric = 0
ld-seed = -2007382892
emtol = 10
emstep = 0.01
niter = 20
fcstep = 0
nstcgsteep = 1000
nbfgscorr = 10
rtpi = 0.05
nstxout = 0
nstvout = 0
nstfout = 0
nstlog = 1000
nstcalcenergy = 100
nstenergy = 100
nstxout-compressed = 1000
compressed-x-precision = 100
cutoff-scheme = Verlet
nstlist = 20
ns-type = Grid
pbc = xyz
periodic-molecules = false
verlet-buffer-tolerance = 0.005
rlist = 1.307
coulombtype = Cut-off
coulomb-modifier = Potential-shift
rcoulomb-switch = 0
rcoulomb = 1.1
epsilon-r = 15
epsilon-rf = inf
vdw-type = Cut-off
vdw-modifier = Potential-shift
rvdw-switch = 0
rvdw = 1.1
DispCorr = No
table-extension = 1
fourierspacing = 0.12
fourier-nx = 0
fourier-ny = 0
fourier-nz = 0
pme-order = 4
ewald-rtol = 1e-05
ewald-rtol-lj = 0.001
lj-pme-comb-rule = Geometric
ewald-geometry = 0
epsilon-surface = 0
implicit-solvent = No
gb-algorithm = Still
nstgbradii = 1
rgbradii = 1
gb-epsilon-solvent = 80
gb-saltconc = 0
gb-obc-alpha = 1
gb-obc-beta = 0.8
gb-obc-gamma = 4.85
gb-dielectric-offset = 0.009
sa-algorithm = Ace-approximation
sa-surface-tension = 2.05016
tcoupl = V-rescale
nsttcouple = 20
nh-chain-length = 0
print-nose-hoover-chain-variables = false
pcoupl = Parrinello-Rahman
pcoupltype = Semiisotropic
nstpcouple = 20
tau-p = 12
compressibility (3x3):
compressibility[ 0]={ 3.00000e-04, 0.00000e+00, 0.00000e+00}
compressibility[ 1]={ 0.00000e+00, 3.00000e-04, 0.00000e+00}
compressibility[ 2]={ 0.00000e+00, 0.00000e+00, 3.00000e-04}
ref-p (3x3):
ref-p[ 0]={ 1.00000e+00, 0.00000e+00, 0.00000e+00}
ref-p[ 1]={ 0.00000e+00, 1.00000e+00, 0.00000e+00}
ref-p[ 2]={ 0.00000e+00, 0.00000e+00, 1.00000e+00}
refcoord-scaling = No
posres-com (3):
posres-com[0]= 0.00000e+00
posres-com[1]= 0.00000e+00
posres-com[2]= 0.00000e+00
posres-comB (3):
posres-comB[0]= 0.00000e+00
posres-comB[1]= 0.00000e+00
posres-comB[2]= 0.00000e+00
QMMM = false
QMconstraints = 0
QMMMscheme = 0
MMChargeScaleFactor = 1
ngQM = 0
constraint-algorithm = Lincs
continuation = false
Shake-SOR = false
shake-tol = 0.0001
lincs-order = 4
lincs-iter = 1
lincs-warnangle = 30
nwall = 0
wall-type = 9-3
wall-r-linpot = -1
wall-atomtype[0] = -1
wall-atomtype[1] = -1
wall-density[0] = 0
wall-density[1] = 0
wall-ewald-zfac = 3
pull = false
rotation = false
interactiveMD = false
disre = No
disre-weighting = Conservative
disre-mixed = false
dr-fc = 1000
dr-tau = 0
nstdisreout = 100
orire-fc = 0
orire-tau = 0
nstorireout = 100
free-energy = no
cos-acceleration = 0
deform (3x3):
deform[ 0]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 1]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
deform[ 2]={ 0.00000e+00, 0.00000e+00, 0.00000e+00}
simulated-tempering = false
n = 0
n = 0
n = 0
n = 0
n = 0
n = 0
swapcoords = no
userint1 = 0
userint2 = 0
userint3 = 0
userint4 = 0
userreal1 = 0
userreal2 = 0
userreal3 = 0
userreal4 = 0
nrdf: 73726.9 129790
ref-t: 320 320
tau-t: 1 1
annealing: No No
annealing-npoints: 0 0
acc: 0 0 0
nfreeze: N N N
energygrp-flags[ 0]: 0 0
energygrp-flags[ 1]: 0 0
Overriding nsteps with value passed on the command line: 1000 steps, 30 ps
Using 1 MPI process
Using 2 OpenMP threads
1 compatible GPU is present, with ID 0
1 GPU auto-selected for this run.
Mapping of GPU ID to the 1 PP rank in this node: 0
Cut-off's: NS: 1.307 Coulomb: 1.1 LJ: 1.1
System total charge: 0.000
Potential shift: LJ r^-12: -3.186e-01 r^-6: -5.645e-01, Coulomb -9e-01
Using GPU 8x8 non-bonded kernels
Using full Lennard-Jones parameter combination matrix
NOTE: With GPUs, reporting energy group contributions is not supported
Removing pbc first time
Pinning threads with an auto-selected logical core stride of 2
Intra-simulation communication will occur every 20 steps.
Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
0: rest
G. Bussi, D. Donadio and M. Parrinello
Canonical sampling through velocity rescaling
J. Chem. Phys. 126 (2007) pp. 014101
-------- -------- --- Thank You --- -------- --------
There are: 67840 Atoms
Initial temperature: 310.331 K
Started mdrun on rank 0 Tue Apr 11 10:59:20 2017
Step Time
0 0.00000
Energies (kJ/mol)
Bond G96Angle LJ (SR) Coulomb (SR) Potential
3.71293e+04 1.70154e+04 -1.39456e+06 -8.15311e+03 -1.34857e+06
Kinetic En. Total Energy Temperature Pressure (bar)
2.69029e+05 -1.07954e+06 3.17976e+02 -5.26154e+02
Step 1 Warning: Pressure scaling more than 1%. This may mean your system
is not yet equilibrated. Use of Parrinello-Rahman pressure coupling during
equilibration can lead to simulation instability, and is discouraged.
Step 21 Warning: Pressure scaling more than 1%. This may mean your system
is not yet equilibrated. Use of Parrinello-Rahman pressure coupling during
equilibration can lead to simulation instability, and is discouraged.
Step 41 Warning: Pressure scaling more than 1%. This may mean your system
is not yet equilibrated. Use of Parrinello-Rahman pressure coupling during
equilibration can lead to simulation instability, and is discouraged.
Step 201 Warning: Pressure scaling more than 1%. This may mean your system
is not yet equilibrated. Use of Parrinello-Rahman pressure coupling during
equilibration can lead to simulation instability, and is discouraged.
Step Time
1000 30.00000
Writing checkpoint, step 1000 at Tue Apr 11 10:59:24 2017
Energies (kJ/mol)
Bond G96Angle LJ (SR) Coulomb (SR) Potential
3.83384e+04 1.70240e+04 -1.59578e+06 -8.78781e+03 -1.54920e+06
Kinetic En. Total Energy Temperature Pressure (bar)
2.73324e+05 -1.27588e+06 3.23052e+02 2.74106e+01
<====== ############### ==>
<==== A V E R A G E S ====>
<== ############### ======>
Statistics over 1001 steps using 11 frames
Energies (kJ/mol)
Bond G96Angle LJ (SR) Coulomb (SR) Potential
3.83040e+04 1.71072e+04 -1.54225e+06 -8.59811e+03 -1.49543e+06
Kinetic En. Total Energy Temperature Pressure (bar)
2.84298e+05 -1.21114e+06 3.36022e+02 -3.56540e+01
Box-X Box-Y Box-Z
2.55722e+01 2.55722e+01 1.24057e+01
Total Virial (kJ/mol)
1.03901e+05 -1.74106e+02 -3.63408e+02
-1.73936e+02 1.02492e+05 9.74937e+02
-3.63537e+02 9.74952e+02 1.07254e+05
Pressure (bar)
-3.37355e+01 1.92901e-01 1.12682e+00
1.92200e-01 -2.74952e+01 -3.45536e+00
1.12736e+00 -3.45542e+00 -4.57314e+01
Epot (kJ/mol) Coul-SR LJ-SR
POPC-POPC -8.59811e+03 -1.54225e+06
POPC-W 0.00000e+00 0.00000e+00
W-W 0.00000e+00 0.00000e+00
3.19227e+02 3.45563e+02
M E G A - F L O P S A C C O U N T I N G
NB=Group-cutoff nonbonded kernels NxN=N-by-N cluster Verlet kernels
RF=Reaction-Field VdW=Van der Waals QSTab=quadratic-spline table
W3=SPC/TIP3p W4=TIP4p (single or pairs)
V&F=Potential and force V=Potential only F=Force only
Computing: M-Number M-Flops % Flops
Pair Search distance check 186.766128 1680.895 0.2
NxN RF Elec. + LJ [F] 18066.703168 686534.720 97.6
NxN RF Elec. + LJ [V&F] 200.750272 10840.515 1.5
Shift-X 3.459840 20.759 0.0
Bonds 22.550528 1330.481 0.2
Angles 16.400384 2755.265 0.4
Virial 3.462135 62.318 0.0
Stop-CM 0.814080 8.141 0.0
Calc-Ekin 6.919680 186.831 0.0
Total 703419.926 100.0
On 1 MPI rank, each using 2 OpenMP threads
Computing: Num Num Call Wall time Giga-Cycles
Ranks Threads Count (s) total sum %
Neighbor search 1 2 51 0.367 2.637 8.8
Launch GPU ops. 1 2 1001 0.051 0.369 1.2
Force 1 2 1001 0.883 6.345 21.1
Wait GPU local 1 2 1001 1.219 8.755 29.1
NB X/F buffer ops. 1 2 1951 0.308 2.216 7.4
Write traj. 1 2 2 0.289 2.079 6.9
Update 1 2 1001 0.920 6.608 22.0
Rest 0.146 1.051 3.5
Total 4.185 30.060 100.0
GPU timings
Computing: Count Wall t (s) ms/step %
Pair list H2D 51 0.014 0.265 0.6
X / q H2D 1001 0.117 0.117 5.5
Nonbonded F kernel 950 1.779 1.873 83.6
Nonbonded F+prune k. 40 0.104 2.593 4.9
Nonbonded F+ene+prune k. 11 0.033 2.962 1.5
F D2H 1001 0.082 0.082 3.9
Total 2.128 2.126 100.0
Average per-step force GPU/CPU evaluation time ratio: 2.126 ms/0.882 ms = 2.409
Core t (s) Wall t (s) (%)
Time: 8.369 4.185 200.0
(ns/day) (hour/ns)
Performance: 620.039 0.039
Finished mdrun on rank 0 Tue Apr 11 10:59:24 2017
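A few of the numbers in the log above can be cross-checked by hand. The sketch below is only arithmetic on values copied from the log. It assumes that the printed "Potential shift" constants are the unshifted r^-12, r^-6 and 1/r terms evaluated at the 1.1 nm cut-off in reduced form (no prefactors), and that the ns/day figure is based on the 1001 steps in the statistics. Both are assumptions that the numbers appear consistent with, not statements about the GROMACS source.

```python
# Values copied from the log above.
rc = 1.1            # rcoulomb and rvdw in nm
dt_ps = 0.03        # time step in ps
steps_done = 1001   # "Statistics over 1001 steps"
wall_s = 4.185      # wall time in s, as printed (rounded)

# Assumed form of the potential-shift constants: minus the bare term at rc.
shift_r12 = -1.0 / rc**12   # log prints -3.186e-01
shift_r6 = -1.0 / rc**6     # log prints -5.645e-01
shift_coul = -1.0 / rc      # log prints -9e-01 (one significant digit)

# Simulated time per day of wall time, in ns/day.
ns_per_day = steps_done * dt_ps * 1e-3 * 86400.0 / wall_s

print(f"LJ r^-12 shift: {shift_r12:.3e}")
print(f"LJ r^-6 shift:  {shift_r6:.3e}")
print(f"Coulomb shift:  {shift_coul:.3e}")
print(f"Performance:    {ns_per_day:.1f} ns/day")
```

The small difference between the recomputed performance and the 620.039 ns/day in the log is within what the rounding of the printed wall time would allow. The Coulomb value shows that a potential shift of about 1/rc is being applied to the electrostatics, which may be relevant when interpreting the kernel name in the flops table.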