[gmx-users] Trying to use cutoff electrostatics with MARTINI

Graham J.A. J.A.Graham at soton.ac.uk
Tue Apr 11 12:09:17 CEST 2017


Hi,

I'm preparing to do some benchmarking and wanted to compare the
performance of reaction-field and plain cut-off electrostatics with
the MARTINI force field.

I've downloaded the recommended MDP files from the MARTINI website, but
when I specify coulombtype = Cut-off in the MDP I still get "NxN RF
Elec. + LJ" in the flop-accounting table at the end of the log, even
though the parameter dump at the top of the log shows that GROMACS has
recognised that I'm asking for cut-offs.
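
For reference, these are the relevant electrostatics and VdW settings.
I've copied them from the parameter dump in the log below rather than
from the MDP file itself, but the option names and values are the same:

   cutoff-scheme     = Verlet
   coulombtype       = Cut-off
   coulomb-modifier  = Potential-shift
   rcoulomb          = 1.1
   epsilon-r         = 15
   vdw-type          = Cut-off
   vdw-modifier      = Potential-shift
   rvdw              = 1.1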

This is happening with both GROMACS 2016.3 and 5.1.4.  Am I
misunderstanding the log output, or is something going wrong?
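
In case it helps, the line I'm looking at can be pulled straight out of
the flop-accounting table, e.g. (the log name follows from the -deffnm
on the command line below):

   grep 'NxN' npt_cut.log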

Thanks,
James


Example log file
================================

Log file opened on Tue Apr 11 10:59:20 2017
Host: smaug  pid: 4670  rank ID: 0  number of ranks:  1
                      :-) GROMACS - gmx mdrun, 2016.3 (-:

GROMACS:      gmx mdrun, version 2016.3
Executable:   /usr/local/gromacs/2016.3-mpich/bin/gmx_mpi
Data prefix:  /usr/local/gromacs/2016.3-mpich
Working dir:  /home/james/gromacs/membranes/preequil/martini/popc/2048/test
Command line:
  gmx_mpi mdrun -ntomp 2 -pin on -v -deffnm npt_cut -nsteps 1000

GROMACS version:    2016.3
Precision:          single
Memory model:       64 bit
MPI library:        MPI
OpenMP support:     enabled (GMX_OPENMP_MAX_THREADS = 32)
GPU support:        CUDA
SIMD instructions:  AVX2_256
FFT library:        fftw-3.3.5
RDTSCP usage:       enabled
TNG support:        enabled
Hwloc support:      hwloc-1.11.0
Tracing support:    disabled
Built on:           Mon 20 Mar 15:50:33 GMT 2017
Built by:           james at smaug [CMAKE]
Build OS/arch:      Linux 4.4.0-66-generic x86_64
Build CPU vendor:   Intel
Build CPU brand:    Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
Build CPU family:   6   Model: 60   Stepping: 3
Build CPU features: aes apic avx avx2 clfsh cmov cx8 cx16 f16c fma hle htt lahf mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp rtm sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
C compiler:         /usr/bin/mpicc.mpich GNU 5.4.0
C compiler flags:    -march=core-avx2     -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast  
C++ compiler:       /usr/bin/mpicxx.mpich GNU 5.4.0
C++ compiler flags:  -march=core-avx2    -std=c++0x   -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast  
CUDA compiler:      /usr/local/cuda-8.0/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2016 NVIDIA Corporation;Built on Tue_Jan_10_13:22:03_CST_2017;Cuda compilation tools, release 8.0, V8.0.61
CUDA compiler flags:-gencode;arch=compute_20,code=sm_20;-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_60,code=compute_60;-gencode;arch=compute_61,code=compute_61;-use_fast_math;-D_FORCE_INLINES;;-Xcompiler;,-march=core-avx2,,,,,,;-Xcompiler;-O3,-DNDEBUG,-funroll-all-loops,-fexcess-precision=fast,,; 
CUDA driver:        8.0
CUDA runtime:       8.0


Running on 1 node with total 4 cores, 8 logical cores, 1 compatible GPU
Hardware detected on host smaug (the node of MPI rank 0):
  CPU info:
    Vendor: Intel
    Brand:  Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
    Family: 6   Model: 60   Stepping: 3
    Features: aes apic avx avx2 clfsh cmov cx8 cx16 f16c fma hle htt lahf mmx msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp rtm sse2 sse3 sse4.1 sse4.2 ssse3 tdt x2apic
    SIMD instructions most likely to fit this hardware: AVX2_256
    SIMD instructions selected at GROMACS compile time: AVX2_256

  Hardware topology: Full, with devices
    Sockets, cores, and logical processors:
      Socket  0: [   0   4] [   1   5] [   2   6] [   3   7]
    Numa nodes:
      Node  0 (33572290560 bytes mem):   0   1   2   3   4   5   6   7
      Latency:
               0
         0  1.00
    Caches:
      L1: 32768 bytes, linesize 64 bytes, assoc. 8, shared 2 ways
      L2: 262144 bytes, linesize 64 bytes, assoc. 8, shared 2 ways
      L3: 8388608 bytes, linesize 64 bytes, assoc. 16, shared 8 ways
    PCI devices:
      0000:01:00.0  Id: 10de:1380  Class: 0x0300  Numa: 0
      0000:00:02.0  Id: 8086:0412  Class: 0x0380  Numa: 0
      0000:00:19.0  Id: 8086:153a  Class: 0x0200  Numa: 0
      0000:00:1f.2  Id: 8086:8c02  Class: 0x0106  Numa: 0
  GPU info:
    Number of GPUs detected: 1
    #0: NVIDIA GeForce GTX 750 Ti, compute cap.: 5.0, ECC:  no, stat: compatible


Input Parameters:
   integrator                     = md
   tinit                          = 0
   dt                             = 0.03
   nsteps                         = 50000
   init-step                      = 0
   simulation-part                = 1
   comm-mode                      = Linear
   nstcomm                        = 100
   bd-fric                        = 0
   ld-seed                        = -2007382892
   emtol                          = 10
   emstep                         = 0.01
   niter                          = 20
   fcstep                         = 0
   nstcgsteep                     = 1000
   nbfgscorr                      = 10
   rtpi                           = 0.05
   nstxout                        = 0
   nstvout                        = 0
   nstfout                        = 0
   nstlog                         = 1000
   nstcalcenergy                  = 100
   nstenergy                      = 100
   nstxout-compressed             = 1000
   compressed-x-precision         = 100
   cutoff-scheme                  = Verlet
   nstlist                        = 20
   ns-type                        = Grid
   pbc                            = xyz
   periodic-molecules             = false
   verlet-buffer-tolerance        = 0.005
   rlist                          = 1.307
   coulombtype                    = Cut-off
   coulomb-modifier               = Potential-shift
   rcoulomb-switch                = 0
   rcoulomb                       = 1.1
   epsilon-r                      = 15
   epsilon-rf                     = inf
   vdw-type                       = Cut-off
   vdw-modifier                   = Potential-shift
   rvdw-switch                    = 0
   rvdw                           = 1.1
   DispCorr                       = No
   table-extension                = 1
   fourierspacing                 = 0.12
   fourier-nx                     = 0
   fourier-ny                     = 0
   fourier-nz                     = 0
   pme-order                      = 4
   ewald-rtol                     = 1e-05
   ewald-rtol-lj                  = 0.001
   lj-pme-comb-rule               = Geometric
   ewald-geometry                 = 0
   epsilon-surface                = 0
   implicit-solvent               = No
   gb-algorithm                   = Still
   nstgbradii                     = 1
   rgbradii                       = 1
   gb-epsilon-solvent             = 80
   gb-saltconc                    = 0
   gb-obc-alpha                   = 1
   gb-obc-beta                    = 0.8
   gb-obc-gamma                   = 4.85
   gb-dielectric-offset           = 0.009
   sa-algorithm                   = Ace-approximation
   sa-surface-tension             = 2.05016
   tcoupl                         = V-rescale
   nsttcouple                     = 20
   nh-chain-length                = 0
   print-nose-hoover-chain-variables = false
   pcoupl                         = Parrinello-Rahman
   pcoupltype                     = Semiisotropic
   nstpcouple                     = 20
   tau-p                          = 12
   compressibility (3x3):
      compressibility[    0]={ 3.00000e-04,  0.00000e+00,  0.00000e+00}
      compressibility[    1]={ 0.00000e+00,  3.00000e-04,  0.00000e+00}
      compressibility[    2]={ 0.00000e+00,  0.00000e+00,  3.00000e-04}
   ref-p (3x3):
      ref-p[    0]={ 1.00000e+00,  0.00000e+00,  0.00000e+00}
      ref-p[    1]={ 0.00000e+00,  1.00000e+00,  0.00000e+00}
      ref-p[    2]={ 0.00000e+00,  0.00000e+00,  1.00000e+00}
   refcoord-scaling               = No
   posres-com (3):
      posres-com[0]= 0.00000e+00
      posres-com[1]= 0.00000e+00
      posres-com[2]= 0.00000e+00
   posres-comB (3):
      posres-comB[0]= 0.00000e+00
      posres-comB[1]= 0.00000e+00
      posres-comB[2]= 0.00000e+00
   QMMM                           = false
   QMconstraints                  = 0
   QMMMscheme                     = 0
   MMChargeScaleFactor            = 1
qm-opts:
   ngQM                           = 0
   constraint-algorithm           = Lincs
   continuation                   = false
   Shake-SOR                      = false
   shake-tol                      = 0.0001
   lincs-order                    = 4
   lincs-iter                     = 1
   lincs-warnangle                = 30
   nwall                          = 0
   wall-type                      = 9-3
   wall-r-linpot                  = -1
   wall-atomtype[0]               = -1
   wall-atomtype[1]               = -1
   wall-density[0]                = 0
   wall-density[1]                = 0
   wall-ewald-zfac                = 3
   pull                           = false
   rotation                       = false
   interactiveMD                  = false
   disre                          = No
   disre-weighting                = Conservative
   disre-mixed                    = false
   dr-fc                          = 1000
   dr-tau                         = 0
   nstdisreout                    = 100
   orire-fc                       = 0
   orire-tau                      = 0
   nstorireout                    = 100
   free-energy                    = no
   cos-acceleration               = 0
   deform (3x3):
      deform[    0]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
      deform[    1]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
      deform[    2]={ 0.00000e+00,  0.00000e+00,  0.00000e+00}
   simulated-tempering            = false
   E-x:
      n = 0
   E-xt:
      n = 0
   E-y:
      n = 0
   E-yt:
      n = 0
   E-z:
      n = 0
   E-zt:
      n = 0
   swapcoords                     = no
   userint1                       = 0
   userint2                       = 0
   userint3                       = 0
   userint4                       = 0
   userreal1                      = 0
   userreal2                      = 0
   userreal3                      = 0
   userreal4                      = 0
grpopts:
   nrdf:     73726.9      129790
   ref-t:         320         320
   tau-t:           1           1
annealing:          No          No
annealing-npoints:           0           0
   acc:	           0           0           0
   nfreeze:           N           N           N
   energygrp-flags[  0]: 0 0
   energygrp-flags[  1]: 0 0


Overriding nsteps with value passed on the command line: 1000 steps, 30 ps

Using 1 MPI process
Using 2 OpenMP threads 

1 compatible GPU is present, with ID 0
1 GPU auto-selected for this run.
Mapping of GPU ID to the 1 PP rank in this node: 0

Cut-off's:   NS: 1.307   Coulomb: 1.1   LJ: 1.1
System total charge: 0.000
Potential shift: LJ r^-12: -3.186e-01 r^-6: -5.645e-01, Coulomb -9e-01

Using GPU 8x8 non-bonded kernels

Using full Lennard-Jones parameter combination matrix


NOTE: With GPUs, reporting energy group contributions is not supported

Removing pbc first time
Pinning threads with an auto-selected logical core stride of 2
Intra-simulation communication will occur every 20 steps.
Center of mass motion removal mode is Linear
We have the following groups for center of mass motion removal:
  0:  rest

There are: 67840 Atoms
Initial temperature: 310.331 K

Started mdrun on rank 0 Tue Apr 11 10:59:20 2017
           Step           Time
              0        0.00000

   Energies (kJ/mol)
           Bond       G96Angle        LJ (SR)   Coulomb (SR)      Potential
    3.71293e+04    1.70154e+04   -1.39456e+06   -8.15311e+03   -1.34857e+06
    Kinetic En.   Total Energy    Temperature Pressure (bar)
    2.69029e+05   -1.07954e+06    3.17976e+02   -5.26154e+02


Step 1  Warning: Pressure scaling more than 1%. This may mean your system
 is not yet equilibrated. Use of Parrinello-Rahman pressure coupling during
equilibration can lead to simulation instability, and is discouraged.

Step 21  Warning: Pressure scaling more than 1%. This may mean your system
 is not yet equilibrated. Use of Parrinello-Rahman pressure coupling during
equilibration can lead to simulation instability, and is discouraged.

Step 41  Warning: Pressure scaling more than 1%. This may mean your system
 is not yet equilibrated. Use of Parrinello-Rahman pressure coupling during
equilibration can lead to simulation instability, and is discouraged.

Step 201  Warning: Pressure scaling more than 1%. This may mean your system
 is not yet equilibrated. Use of Parrinello-Rahman pressure coupling during
equilibration can lead to simulation instability, and is discouraged.
           Step           Time
           1000       30.00000

Writing checkpoint, step 1000 at Tue Apr 11 10:59:24 2017


   Energies (kJ/mol)
           Bond       G96Angle        LJ (SR)   Coulomb (SR)      Potential
    3.83384e+04    1.70240e+04   -1.59578e+06   -8.78781e+03   -1.54920e+06
    Kinetic En.   Total Energy    Temperature Pressure (bar)
    2.73324e+05   -1.27588e+06    3.23052e+02    2.74106e+01

	<======  ###############  ==>
	<====  A V E R A G E S  ====>
	<==  ###############  ======>

	Statistics over 1001 steps using 11 frames

   Energies (kJ/mol)
           Bond       G96Angle        LJ (SR)   Coulomb (SR)      Potential
    3.83040e+04    1.71072e+04   -1.54225e+06   -8.59811e+03   -1.49543e+06
    Kinetic En.   Total Energy    Temperature Pressure (bar)
    2.84298e+05   -1.21114e+06    3.36022e+02   -3.56540e+01

          Box-X          Box-Y          Box-Z
    2.55722e+01    2.55722e+01    1.24057e+01

   Total Virial (kJ/mol)
    1.03901e+05   -1.74106e+02   -3.63408e+02
   -1.73936e+02    1.02492e+05    9.74937e+02
   -3.63537e+02    9.74952e+02    1.07254e+05

   Pressure (bar)
   -3.37355e+01    1.92901e-01    1.12682e+00
    1.92200e-01   -2.74952e+01   -3.45536e+00
    1.12736e+00   -3.45542e+00   -4.57314e+01

  Epot (kJ/mol)        Coul-SR          LJ-SR   
      POPC-POPC   -8.59811e+03   -1.54225e+06
         POPC-W    0.00000e+00    0.00000e+00
            W-W    0.00000e+00    0.00000e+00

         T-POPC            T-W
    3.19227e+02    3.45563e+02


	M E G A - F L O P S   A C C O U N T I N G

 NB=Group-cutoff nonbonded kernels    NxN=N-by-N cluster Verlet kernels
 RF=Reaction-Field  VdW=Van der Waals  QSTab=quadratic-spline table
 W3=SPC/TIP3p  W4=TIP4p (single or pairs)
 V&F=Potential and force  V=Potential only  F=Force only

 Computing:                               M-Number         M-Flops  % Flops
-----------------------------------------------------------------------------
 Pair Search distance check             186.766128        1680.895     0.2
 NxN RF Elec. + LJ [F]                18066.703168      686534.720    97.6
 NxN RF Elec. + LJ [V&F]                200.750272       10840.515     1.5
 Shift-X                                  3.459840          20.759     0.0
 Bonds                                   22.550528        1330.481     0.2
 Angles                                  16.400384        2755.265     0.4
 Virial                                   3.462135          62.318     0.0
 Stop-CM                                  0.814080           8.141     0.0
 Calc-Ekin                                6.919680         186.831     0.0
-----------------------------------------------------------------------------
 Total                                                  703419.926   100.0
-----------------------------------------------------------------------------


     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G

On 1 MPI rank, each using 2 OpenMP threads

 Computing:          Num   Num      Call    Wall time         Giga-Cycles
                     Ranks Threads  Count      (s)         total sum    %
-----------------------------------------------------------------------------
 Neighbor search        1    2         51       0.367          2.637   8.8
 Launch GPU ops.        1    2       1001       0.051          0.369   1.2
 Force                  1    2       1001       0.883          6.345  21.1
 Wait GPU local         1    2       1001       1.219          8.755  29.1
 NB X/F buffer ops.     1    2       1951       0.308          2.216   7.4
 Write traj.            1    2          2       0.289          2.079   6.9
 Update                 1    2       1001       0.920          6.608  22.0
 Rest                                           0.146          1.051   3.5
-----------------------------------------------------------------------------
 Total                                          4.185         30.060 100.0
-----------------------------------------------------------------------------

 GPU timings
-----------------------------------------------------------------------------
 Computing:                         Count  Wall t (s)      ms/step       %
-----------------------------------------------------------------------------
 Pair list H2D                         51       0.014        0.265     0.6
 X / q H2D                           1001       0.117        0.117     5.5
 Nonbonded F kernel                   950       1.779        1.873    83.6
 Nonbonded F+prune k.                  40       0.104        2.593     4.9
 Nonbonded F+ene+prune k.              11       0.033        2.962     1.5
 F D2H                               1001       0.082        0.082     3.9
-----------------------------------------------------------------------------
 Total                                          2.128        2.126   100.0
-----------------------------------------------------------------------------

Average per-step force GPU/CPU evaluation time ratio: 2.126 ms/0.882 ms = 2.409

               Core t (s)   Wall t (s)        (%)
       Time:        8.369        4.185      200.0
                 (ns/day)    (hour/ns)
Performance:      620.039        0.039
Finished mdrun on rank 0 Tue Apr 11 10:59:24 2017

