[gmx-users] Results of villin headpiece with AMD 8 core

Fri Jan 11 19:55:02 CET 2019

Dear Users,
For those of you considering a workstation build and wondering about AMD processors, here are my results for the villin headpiece in SPC/E water (~8000 atoms), using the NPT .mdp and the log header included below. The NPT run was started from a similar NVT run (100000 steps). The best results were achieved with the simplest command line, letting GROMACS choose the thread counts.
The system became unstable at dt = 0.005 ps. Note the close correspondence between rcoulomb, rvdw and the switching distance (rvdw-switch). Results compare favorably with the E5-2690 + GTX Titan demo:
http://on-demand.gputechconf.com/gtc/2013/webinar/gromacs-kepler-gpus-gtc-express-webinar.pdf

               Core t (s)   Wall t (s)        (%)
       Time:      112.643       14.080      800.0
                 (ns/day)    (hour/ns)
Performance:     1288.622        0.019
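
As a sanity check on the figures above, here is a minimal Python sketch that recomputes ns/day and hour/ns from the run length and the wall time. It assumes dt is in ps (the GROMACS time unit) and takes nsteps and dt from the .mdp below; the variable names are mine, not anything GROMACS defines.

```python
# Recompute the performance figures from run length and wall-clock time.
nsteps = 50000      # from the .mdp
dt_ps = 0.0042      # from the .mdp; assumed to be in ps
wall_s = 14.080     # "Wall t (s)" from the log

simulated_ns = nsteps * dt_ps / 1000.0        # ps -> ns
ns_per_day = simulated_ns * 86400.0 / wall_s  # 86400 s in a day
hours_per_ns = 24.0 / ns_per_day

print(f"simulated: {simulated_ns:.3f} ns")
print(f"ns/day:    {ns_per_day:.1f}")
print(f"hour/ns:   {hours_per_ns:.3f}")
```

The result agrees with the logged 1288.622 ns/day to within the rounding of the reported wall time.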

define = -DPOSRES ; position restrain the protein
; Run parameters
integrator = md ; leap-frog integrator
nsteps = 50000 ; 50000 * 0.0042 ps = 210 ps
dt = 0.0042 ; ps (4.2 fs)
; Output control
nstenergy = 500 ; save energies every 2.1 ps
nstlog = 500 ; update log file every 2.1 ps
nstxout-compressed = 500 ; save coordinates every 2.1 ps
; Bond parameters
continuation = yes ; continuing from NVT
constraint_algorithm = lincs ; holonomic constraints
constraints = h-bonds
lincs_iter = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to accuracy
lincs-warnangle = 35

; Neighbor searching and vdW
cutoff-scheme = Verlet
ns_type = grid ; search neighboring grid cells
nstlist = 20 ; largely irrelevant with Verlet
rlist = 1.51
vdwtype = cutoff
vdw-modifier = force-switch
rvdw-switch = 1.0
rvdw = 1.1 ; short-range van der Waals cutoff (in nm)

; Electrostatics
coulombtype = PME ; Particle Mesh Ewald for long-range electrostatics
rcoulomb = 1.11
pme_order = 4 ; cubic interpolation
fourierspacing = .12 ; grid spacing for FFT

; Temperature coupling
tcoupl = V-rescale ; modified Berendsen thermostat
tc-grps = Protein Water_and_ions ; two coupling groups - more accurate
tau_t = 0.1 0.1 ; time constant, in ps
ref_t = 300 300 ; reference temperature, one for each group, in K
; Pressure coupling
pcoupl = Berendsen ; pressure coupling is on for NPT
pcoupltype = isotropic ; uniform scaling of box vectors
tau_p = 2.0 ; time constant, in ps
ref_p = 1.0 ; reference pressure, in bar
compressibility = 4.5e-5 ; isothermal compressibility of water, bar^-1
refcoord_scaling = com
; Periodic boundary conditions
pbc = xyz ; 3-D PBC
; Dispersion correction is not used for proteins with the C36 additive FF
DispCorr = no
; Velocity generation
gen_vel = no ; velocity generation off after NVT
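
For anyone who wants to check the run length and output intervals implied by an .mdp like the one above, a rough Python sketch follows. The parse_mdp helper is hypothetical (it is not part of GROMACS) and only handles simple "key = value ; comment" lines. It normalizes case, '-' and '_' in parameter names, which, to my understanding, grompp treats as equivalent.

```python
def parse_mdp(text):
    """Parse simple 'key = value ; comment' lines into a dict (hypothetical helper)."""
    params = {}
    for line in text.splitlines():
        line = line.split(";", 1)[0].strip()   # drop trailing comments
        if "=" not in line:
            continue                            # skip blank and comment-only lines
        key, value = line.split("=", 1)
        # Normalize the key so that tau_t and tau-t map to the same entry.
        params[key.strip().lower().replace("_", "-")] = value.strip()
    return params


mdp_text = """
nsteps = 50000
dt = 0.0042 ; ps
nstenergy = 500
nstlog = 500
nstxout-compressed = 500
"""

p = parse_mdp(mdp_text)
dt = float(p["dt"])
total_ps = int(p["nsteps"]) * dt
intervals_ps = {k: int(p[k]) * dt
                for k in ("nstenergy", "nstlog", "nstxout-compressed")}

print(f"total simulated time: {total_ps:.1f} ps")
for name, interval in intervals_ps.items():
    print(f"{name}: every {interval:.1f} ps")
```

With dt = 0.0042 ps this gives 210 ps in total and one output frame every 2.1 ps.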

=========================== log ===================================
GROMACS: gmx mdrun, version 2018.3
Executable: /usr/local/gromacs/bin/gmx
Data prefix: /usr/local/gromacs
Working dir: /home/pb/Desktop/villin
Command line: gmx mdrun -deffnm villin.md5

GROMACS version: 2018.3
Precision: single
Memory model: 64 bit
MPI library: thread_mpi
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
GPU support: CUDA
SIMD instructions: AVX2_128
FFT library: fftw-3.3.8-sse2-avx-avx2-avx2_128-avx512
RDTSCP usage: enabled
TNG support: enabled
Hwloc support: disabled
Tracing support: disabled
Built on: 2018-11-01 17:44:10
Built by: pb at Ryzen [CMAKE]
Build OS/arch: Linux 4.15.0-20-generic x86_64
Build CPU vendor: AMD
Build CPU brand: AMD Ryzen 7 2700X Eight-Core Processor
Build CPU family: 23 Model: 8 Stepping: 2
Build CPU features: aes amd apic avx avx2 clfsh cmov cx8 cx16 f16c fma htt lahf misalignsse mmx msr nonstop_tsc pclmuldq pdpe1gb popcnt pse rdrnd rdtscp sha sse2 sse3 sse4a sse4.1 sse4.2 ssse3
C compiler: /usr/bin/gcc-6 GNU 6.4.0
C compiler flags: -march=core-avx2 -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
C++ compiler: /usr/bin/g++-6 GNU 6.4.0
C++ compiler flags: -march=core-avx2 -std=c++11 -O3 -DNDEBUG -funroll-all-loops -fexcess-precision=fast
CUDA compiler: /usr/bin/nvcc nvcc: NVIDIA (R) Cuda compiler driver;Copyright (c) 2005-2017 NVIDIA Corporation;Built on Fri_Nov__3_21:07:56_CDT_2017;Cuda compilation tools, release 9.1, V9.1.85
CUDA compiler flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_70,code=compute_70;-use_fast_math;-D_FORCE_INLINES;; ;-march=core-avx2;-std=c++11;-O3;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;
CUDA driver: 10.0
CUDA runtime: 9.10

Running on 1 node with total 8 cores, 16 logical cores, 1 compatible GPU
Hardware detected:
CPU info:
Vendor: AMD
Brand: AMD Ryzen 7 2700X Eight-Core Processor
Family: 23 Model: 8 Stepping: 2
Features: aes amd apic avx avx2 clfsh cmov cx8 cx16 f16c fma htt lahf misalignsse mmx msr nonstop_tsc pclmuldq pdpe1gb popcnt pse rdrnd rdtscp sha sse2 sse3 sse4a sse4.1 sse4.2 ssse3
Hardware topology: Basic
Sockets, cores, and logical processors:
Socket 0: [ 0 1] [ 2 3] [ 4 5] [ 6 7] [ 8 9] [ 10 11] [ 12 13] [ 14 15]
GPU info:
Number of GPUs detected: 1
#0: NVIDIA GeForce GTX 1080 Ti, compute cap.: 6.1, ECC: no, stat: compatible
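
In case it helps when comparing logs from different machines, here is a small Python sketch that counts physical cores and logical processors from the "Socket 0:" line above. It assumes each bracketed group is one physical core listing its hardware threads, which is how I read the mdrun hardware report; the regex and variable names are my own.

```python
import re

# The "Sockets, cores, and logical processors" line copied from the log.
line = "Socket 0: [ 0 1] [ 2 3] [ 4 5] [ 6 7] [ 8 9] [ 10 11] [ 12 13] [ 14 15]"

# Each [...] group is taken to be one physical core; the numbers inside
# are the logical processors (hardware threads) on that core.
groups = re.findall(r"\[([^\]]*)\]", line)
cores = [[int(x) for x in g.split()] for g in groups]

n_cores = len(cores)
n_logical = sum(len(c) for c in cores)

print(f"{n_cores} cores, {n_logical} logical processors")
```

This matches the "8 cores, 16 logical cores" summary line in the log.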