[gmx-developers] Branches w/working OpenCL support
Mirco Wahab
mirco.wahab at chemie.tu-freiberg.de
Thu Jun 18 23:18:12 CEST 2015
Hi Mark,
On 17.06.2015 23:31, Mark Abraham wrote:
> I've solved three correctness issues with the OpenCL implementation now,
> and NPT on AMD looks quite nice. Can you please try the code at
> https://gerrit.gromacs.org/#/c/4314/26 and let us know if the situation
> is improved?
I tried, but could not compile it as-is under VS-2013/MSVC-18. The
problem is, as already stated in a comment, the macro expansion into an
alignment declaration, which is not supported by MSVC:
[basedefinitions.h, line 244]
#if defined(_MSC_VER) && (_MSC_VER >= 1700 || defined(__ICL))
# define GMX_ALIGNMENT 1
# define GMX_ALIGNED(type, alignment) __declspec(align(alignment*sizeof(type))) type
#elif defined(__GNUC__) || defined(__clang__)
# define GMX_ALIGNMENT 1
...
I then replaced the "(alignment*sizeof(type))" expression
with "32", which is probably the value it would have
expanded to:
# define GMX_ALIGNED(type, alignment) __declspec(align(32)) type
I also tested other values (16, 64).
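
For reference, the arithmetic behind these values can be checked in
isolation. The sketch below is not GROMACS code: SKETCH_ALIGNED,
alignedBytes, isAligned and sketchBufferIsAligned are hypothetical names,
the element count of 8 is only an assumption for illustration, and it
relies on C++11 alignas, which as far as I know VS-2013 does not provide.
It therefore only shows the intended expansion on a newer compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for GMX_ALIGNED; this is not the GROMACS definition.
// Assumes 'alignment' is an element count, so the byte alignment is
// alignment * sizeof(type), as in the expression quoted above.
#define SKETCH_ALIGNED(type, alignment) alignas((alignment) * sizeof(type)) type

// Byte alignment that the expression evaluates to for element type T.
// With a 4-byte float, 4, 8 and 16 elements give 16, 32 and 64 bytes.
template <typename T>
constexpr std::size_t alignedBytes(std::size_t elements)
{
    return elements * sizeof(T);
}

// True if the address of p is a multiple of 'bytes'.
inline bool isAligned(const void* p, std::size_t bytes)
{
    // An alignment must be a non-zero power of two.
    assert(bytes != 0 && (bytes & (bytes - 1)) == 0);
    return reinterpret_cast<std::uintptr_t>(p) % bytes == 0;
}

// Declares a buffer through the macro and reports whether its address
// honours the requested byte alignment.
inline bool sketchBufferIsAligned()
{
    SKETCH_ALIGNED(float, 8) buffer[8] = {};
    return isAligned(buffer, alignedBytes<float>(8));
}
```

A check like isAligned() on the buffers declared through the real macro
would be one way to see whether the hard-coded value actually matches
what the code expects.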
I'm not sure this is the show-stopper, but every attempt yielded the
same result: mdrun segfaults immediately after initialization,
independent of the input file (but it works fine in nb=cpu mode).
I'm attaching the log file here.
regards,
M.
--------------------- 8< [crashed md.log w/gpu] ------------------------
Log file opened on Thu Jun 18 23:09:02 2015
Host: DENEB pid: 4884 rank ID: 0 number of ranks: 1
:-) GROMACS - gmx mdrun, VERSION 5.1-beta1-dev (-:
GROMACS is written by:
Emile Apol           Rossen Apostolov     Herman J.C. Berendsen  Par Bjelkmar
Aldert van Buuren    Rudi van Drunen      Anton Feenstra         Sebastian Fritsch
Gerrit Groenhof      Christoph Junghans   Anca Hamuraru          Vincent Hindriksen
Dimitrios Karkoulis  Peter Kasson         Carsten Kutzner        Per Larsson
Justin A. Lemkul     Magnus Lundborg      Pieter Meulenhoff      Erik Marklund
Teemu Murtola        Szilard Pall         Sander Pronk           Roland Schulz
Alexey Shvetsov      Michael Shirts       Alfons Sijbers         Peter Tieleman
Teemu Virolainen     Christian Wennberg   Maarten Wolf
and the project leaders:
Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel
Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2015, The GROMACS development team at
Uppsala University, Stockholm University and
the Royal Institute of Technology, Sweden.
check out http://www.gromacs.org for more information.
GROMACS is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation; either version 2.1
of the License, or (at your option) any later version.
GROMACS: gmx mdrun, VERSION 5.1-beta1-dev
Executable: D:\GromacsCL\bin\gmx.exe
Data prefix: D:\GromacsCL
Command line:
gmx mdrun -v
GROMACS version: VERSION 5.1-beta1-dev
Precision: single
Memory model: 64 bit
MPI library: thread_mpi
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 32)
GPU support: enabled
OpenCL support: enabled
invsqrt routine: gmx_software_invsqrt(x)
SIMD instructions: SSE2
FFT library: fftw3
RDTSCP usage: enabled
C++11 compilation: disabled
TNG support: enabled
Tracing support: disabled
Built on: Unknown date
Built by: Anonymous at unknown [CMAKE]
Build OS/arch: Windows-6.2 AMD64
Build CPU vendor: AuthenticAMD
Build CPU brand: AMD Phenom(tm) II X6 1090T Processor
Build CPU family: 16 Model: 10 Stepping: 0
Build CPU features: apic clfsh cmov cx8 cx16 htt lahf_lm misalignsse mmx msr nonstop_tsc pdpe1gb popcnt pse rdtscp sse2 sse3 sse4a
C compiler: C:/Program Files (x86)/Microsoft Visual Studio 12.0/VC/bin/x86_amd64/cl.exe MSVC 18.0.31101.0
C compiler flags: /DWIN32 /D_WINDOWS /W3 /MD /O2 /Ob2 /D NDEBUG
C++ compiler: C:/Program Files (x86)/Microsoft Visual Studio 12.0/VC/bin/x86_amd64/cl.exe MSVC 18.0.31101.0
C++ compiler flags: /DWIN32 /D_WINDOWS /W3 /GR /EHsc /wd4800 /wd4355 /wd4996 /wd4305 /wd4244 /wd4101 /wd4267 /wd4090 /MD /O2 /Ob2 /D NDEBUG
Boost version: 1.57.0 (external)
OpenCL include dir: C:/Program Files (x86)/AMD APP SDK/3.0-0-Beta/include
OpenCL library: C:/Program Files (x86)/AMD APP SDK/3.0-0-Beta/lib/x86_64/OpenCL.lib
OpenCL version: 2.0
Running on 1 node with total 6 cores, 6 hardware threads, 1 compatible GPU
Hardware detected:
CPU info:
Vendor: AuthenticAMD
Brand: AMD Phenom(tm) II X6 1090T Processor
Family: 16 model: 10 stepping: 0
CPU features: apic clfsh cmov cx8 cx16 htt lahf_lm misalignsse mmx msr nonstop_tsc pdpe1gb popcnt pse rdtscp sse2 sse3 sse4a
SIMD instructions most likely to fit this hardware: SSE2
SIMD instructions selected at GROMACS compile time: SSE2
GPU info:
Number of GPUs detected: 1
#0: name: Pitcairn, vendor: Advanced Micro Devices, Inc., device version: OpenCL 1.2 AMD-APP (1642.5), stat: compatible
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
B. Hess and C. Kutzner and D. van der Spoel and E. Lindahl
GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable
molecular simulation
J. Chem. Theory Comput. 4 (2008) pp. 435-447
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and H. J. C. Berendsen
GROMACS: Fast, Flexible and Free
J. Comp. Chem. 26 (2005) pp. 1701-1719
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
E. Lindahl and B. Hess and D. van der Spoel
GROMACS 3.0: A package for molecular simulation and trajectory analysis
J. Mol. Mod. 7 (2001) pp. 306-317
-------- -------- --- Thank You --- -------- --------
++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
H. J. C. Berendsen, D. van der Spoel and R. van Drunen
GROMACS: A message-passing parallel molecular dynamics implementation
Comp. Phys. Comm. 91 (1995) pp. 43-56
-------- -------- --- Thank You --- -------- --------
Input Parameters:
integrator = md
tinit = 0
dt = 0.001
nsteps = 100000
init-step = 0
simulation-part = 1
comm-mode = Linear
nstcomm = 100
bd-fric = 0
ld-seed = 665864
emtol = 10
emstep = 0.01
niter = 20
fcstep = 0
nstcgsteep = 1000
nbfgscorr = 10
rtpi = 0.05
nstxout = 0
nstvout = 0
nstfout = 0
nstlog = 100
nstcalcenergy = 100
nstenergy = 1000
nstxout-compressed = 100
compressed-x-precision = 1000
cutoff-scheme = Verlet
nstlist = 25
ns-type = Grid
pbc = xyz
periodic-molecules = FALSE
verlet-buffer-tolerance = 0.005
rlist = 1.097
rlistlong = 1.097
nstcalclr = 25
coulombtype = Reaction-Field
coulomb-modifier = Potential-shift
rcoulomb-switch = 0
rcoulomb = 1
epsilon-r = 1
epsilon-rf = inf
vdw-type = Cut-off
vdw-modifier = Potential-shift
rvdw-switch = 0
rvdw = 1
DispCorr = No
table-extension = 1
fourierspacing = 0.16
fourier-nx = 0
fourier-ny = 0
fourier-nz = 0
pme-order = 4
ewald-rtol = 1e-005
ewald-rtol-lj = 0.001
lj-pme-comb-rule = Geometric
ewald-geometry = 0
epsilon-surface = 0
implicit-solvent = No
gb-algorithm = Still
nstgbradii = 1
rgbradii = 1
gb-epsilon-solvent = 80
gb-saltconc = 0
gb-obc-alpha = 1
gb-obc-beta = 0.8
gb-obc-gamma = 4.85
gb-dielectric-offset = 0.009
sa-algorithm = Ace-approximation
sa-surface-tension = 2.05016
tcoupl = V-rescale
nsttcouple = 25
nh-chain-length = 0
print-nose-hoover-chain-variables = FALSE
pcoupl = No
pcoupltype = Isotropic
nstpcouple = -1
tau-p = 2
compressibility (3x3):
compressibility[ 0]={0.00000e+000, 0.00000e+000, 0.00000e+000}
compressibility[ 1]={0.00000e+000, 0.00000e+000, 0.00000e+000}
compressibility[ 2]={0.00000e+000, 0.00000e+000, 0.00000e+000}
ref-p (3x3):
ref-p[ 0]={0.00000e+000, 0.00000e+000, 0.00000e+000}
ref-p[ 1]={0.00000e+000, 0.00000e+000, 0.00000e+000}
ref-p[ 2]={0.00000e+000, 0.00000e+000, 0.00000e+000}
refcoord-scaling = No
posres-com (3):
posres-com[0]=0.00000e+000
posres-com[1]=0.00000e+000
posres-com[2]=0.00000e+000
posres-comB (3):
posres-comB[0]=0.00000e+000
posres-comB[1]=0.00000e+000
posres-comB[2]=0.00000e+000
QMMM = FALSE
QMconstraints = 0
QMMMscheme = 0
MMChargeScaleFactor = 1
qm-opts:
ngQM = 0
constraint-algorithm = Lincs
continuation = FALSE
Shake-SOR = FALSE
shake-tol = 0.0001
lincs-order = 4
lincs-iter = 1
lincs-warnangle = 30
nwall = 0
wall-type = 9-3
wall-r-linpot = -1
wall-atomtype[0] = -1
wall-atomtype[1] = -1
wall-density[0] = 0
wall-density[1] = 0
wall-ewald-zfac = 3
pull = FALSE
rotation = FALSE
interactiveMD = FALSE
disre = No
disre-weighting = Conservative
disre-mixed = FALSE
dr-fc = 1000
dr-tau = 0
nstdisreout = 100
orire-fc = 0
orire-tau = 0
nstorireout = 100
free-energy = no
cos-acceleration = 0
deform (3x3):
deform[ 0]={0.00000e+000, 0.00000e+000, 0.00000e+000}
deform[ 1]={0.00000e+000, 0.00000e+000, 0.00000e+000}
deform[ 2]={0.00000e+000, 0.00000e+000, 0.00000e+000}
simulated-tempering = FALSE
E-x:
n = 0
E-xt:
n = 0
E-y:
n = 0
E-yt:
n = 0
E-z:
n = 0
E-zt:
n = 0
swapcoords = no
adress = FALSE
userint1 = 0
userint2 = 0
userint3 = 0
userint4 = 0
userreal1 = 0
userreal2 = 0
userreal3 = 0
userreal4 = 0
grpopts:
nrdf: 17493
ref-t: 300
tau-t: 0.2
annealing: No
annealing-npoints: 0
acc: 0 0 0
nfreeze: N N N
energygrp-flags[ 0]: 0