[gmx-developers] Branches w/working OpenCL support

Roland Schulz roland at utk.edu
Thu Jun 18 23:31:58 CEST 2015


On Thu, Jun 18, 2015 at 5:12 PM, Mirco Wahab <
mirco.wahab at chemie.tu-freiberg.de> wrote:

> Hi Mark,
>
> On 17.06.2015 23:31, Mark Abraham wrote:
> > I've solved three correctness issues with the OpenCL implementation now,
> > and NPT on AMD looks quite nice. Can you please try the code at
> > https://gerrit.gromacs.org/#/c/4314/26 and let us know if the situation
> > is improved?
>
> I tried, but could not really get it to compile under VS-2013/MSVC-18.
> The problem, as already noted in a comment, is the macro expansion into
> an alignment declaration, which MSVC does not support:
>

This is weird, because it works for me with VS-2013. Do you have the latest
update (2013 Update 4)?
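
One quick way to confirm which compiler a build really uses is to print the
predefined version macros. The following is only a minimal sketch, not
GROMACS code, and the function name demoCompilerId is made up for
illustration. _MSC_VER is 1700 for VS-2012 and 1800 for VS-2013, so the
"_MSC_VER >= 1700" guard in the quoted macro covers both; update levels show
up in _MSC_FULL_VER rather than in _MSC_VER.

```cpp
#include <string>

// Hypothetical helper (not part of GROMACS): describes the compiler using
// only predefined macros. _MSC_VER is tested first, mirroring the order of
// the #if chain in the quoted header.
inline std::string demoCompilerId()
{
#if defined(_MSC_VER)
    return "MSVC _MSC_VER=" + std::to_string(_MSC_VER)
           + " _MSC_FULL_VER=" + std::to_string(_MSC_FULL_VER);
#elif defined(__clang__)
    return "clang " + std::to_string(__clang_major__) + "."
           + std::to_string(__clang_minor__);
#elif defined(__GNUC__)
    return "gcc " + std::to_string(__GNUC__) + "."
           + std::to_string(__GNUC_MINOR__);
#else
    return "unknown compiler";
#endif
}
```

Printing the returned string from a small test program built with the same
toolchain should show whether the expected branch of the #if chain is taken.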

Roland

>
> [basedefinitions.h, line 244]
>    #if defined(_MSC_VER) && (_MSC_VER >= 1700 || defined(__ICL))
>    #  define GMX_ALIGNMENT 1
>    #  define GMX_ALIGNED(type, alignment) __declspec(align(alignment*sizeof(type))) type
>    #elif defined(__GNUC__) || defined(__clang__)
>    #  define GMX_ALIGNMENT 1
>    ...
>
>
> I then changed the "(alignment*sizeof(type))" expression
> to "32", which is probably the value it expands to:
>   # define GMX_ALIGNED(type, alignment) __declspec(align(32)) type
> and also tested other values (16, 64).
>
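
For reference, the argument of align() in the macro is just the element
count times the element size, so its value depends on the type and count
passed in. Below is a minimal sketch of that arithmetic, assuming standard
C++11 alignas as a stand-in for __declspec(align(...)); it therefore needs a
compiler with alignas and constexpr support and is not what the GROMACS
macro does on MSVC. DEMO_ALIGNED, demoAlignmentBytes, demoIsAligned and
demoCheckAlignment are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for GMX_ALIGNED, using C++11 alignas instead of
// __declspec(align(...)). Not the actual GROMACS definition.
#define DEMO_ALIGNED(type, alignment) alignas((alignment) * sizeof(type)) type

// Byte alignment that the expression "alignment*sizeof(type)" evaluates to.
template <typename T>
constexpr std::size_t demoAlignmentBytes(std::size_t count)
{
    return count * sizeof(T);
}

// True if the address p is a multiple of 'bytes'.
inline bool demoIsAligned(const void* p, std::size_t bytes)
{
    return reinterpret_cast<std::uintptr_t>(p) % bytes == 0;
}

// Declares a buffer through the macro and checks its address at run time.
inline void demoCheckAlignment()
{
    DEMO_ALIGNED(float, 8) buffer[8] = {};
    assert(demoIsAligned(buffer, demoAlignmentBytes<float>(8)));
    (void)buffer; // silence unused-variable warnings when NDEBUG is set
}
```

With 4-byte floats, a count of 4 gives 16 bytes and a count of 8 gives 32,
so a fixed value of 32 matches only some uses of the macro.
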
> I'm not sure this is the show-stopper, but every attempt gave the
> same result: mdrun segfaults immediately after initialization,
> independent of the input file (but works fine in nb=cpu mode).
>
> I'm attaching the log file here.
>
> regards,
>
> M.
>
>
>
> --------------------- 8< [crashed md.log w/gpu] ------------------------
> Log file opened on Thu Jun 18 23:09:02 2015
> Host: DENEB  pid: 4884  rank ID: 0  number of ranks:  1
>                 :-) GROMACS - gmx mdrun, VERSION 5.1-beta1-dev (-:
>
>                              GROMACS is written by:
>       Emile Apol      Rossen Apostolov  Herman J.C. Berendsen    Par Bjelkmar
>   Aldert van Buuren   Rudi van Drunen     Anton Feenstra   Sebastian Fritsch
>    Gerrit Groenhof   Christoph Junghans   Anca Hamuraru    Vincent Hindriksen
>   Dimitrios Karkoulis    Peter Kasson     Carsten Kutzner      Per Larsson
>    Justin A. Lemkul   Magnus Lundborg   Pieter Meulenhoff    Erik Marklund
>     Teemu Murtola       Szilard Pall       Sander Pronk      Roland Schulz
>    Alexey Shvetsov     Michael Shirts     Alfons Sijbers     Peter Tieleman
>    Teemu Virolainen  Christian Wennberg    Maarten Wolf
>                             and the project leaders:
>          Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel
>
> Copyright (c) 1991-2000, University of Groningen, The Netherlands.
> Copyright (c) 2001-2015, The GROMACS development team at
> Uppsala University, Stockholm University and
> the Royal Institute of Technology, Sweden.
> check out http://www.gromacs.org for more information.
>
> GROMACS is free software; you can redistribute it and/or modify it
> under the terms of the GNU Lesser General Public License
> as published by the Free Software Foundation; either version 2.1
> of the License, or (at your option) any later version.
>
> GROMACS:      gmx mdrun, VERSION 5.1-beta1-dev
> Executable:   D:\GromacsCL\bin\gmx.exe
> Data prefix:  D:\GromacsCL
> Command line:
>    gmx mdrun -v
>
> GROMACS version:    VERSION 5.1-beta1-dev
> Precision:          single
> Memory model:       64 bit
> MPI library:        thread_mpi
> OpenMP support:     enabled (GMX_OPENMP_MAX_THREADS = 32)
> GPU support:        enabled
> OpenCL support:     enabled
> invsqrt routine:    gmx_software_invsqrt(x)
> SIMD instructions:  SSE2
> FFT library:        fftw3
> RDTSCP usage:       enabled
> C++11 compilation:  disabled
> TNG support:        enabled
> Tracing support:    disabled
> Built on:           Unknown date
> Built by:           Anonymous at unknown [CMAKE]
> Build OS/arch:      Windows-6.2 AMD64
> Build CPU vendor:   AuthenticAMD
> Build CPU brand:    AMD Phenom(tm) II X6 1090T Processor
> Build CPU family:   16   Model: 10   Stepping: 0
> Build CPU features: apic clfsh cmov cx8 cx16 htt lahf_lm misalignsse mmx msr nonstop_tsc pdpe1gb popcnt pse rdtscp sse2 sse3 sse4a
> C compiler:         C:/Program Files (x86)/Microsoft Visual Studio 12.0/VC/bin/x86_amd64/cl.exe MSVC 18.0.31101.0
> C compiler flags:        /DWIN32 /D_WINDOWS /W3  /MD /O2 /Ob2 /D NDEBUG
> C++ compiler:       C:/Program Files (x86)/Microsoft Visual Studio 12.0/VC/bin/x86_amd64/cl.exe MSVC 18.0.31101.0
> C++ compiler flags:      /DWIN32 /D_WINDOWS /W3 /GR /EHsc /wd4800 /wd4355 /wd4996 /wd4305 /wd4244 /wd4101 /wd4267 /wd4090  /MD /O2 /Ob2 /D NDEBUG
> Boost version:      1.57.0 (external)
> OpenCL include dir: C:/Program Files (x86)/AMD APP SDK/3.0-0-Beta/include
> OpenCL library:     C:/Program Files (x86)/AMD APP SDK/3.0-0-Beta/lib/x86_64/OpenCL.lib
> OpenCL version:     2.0
>
>
> Running on 1 node with total 6 cores, 6 hardware threads, 1 compatible GPU
> Hardware detected:
>    CPU info:
>      Vendor: AuthenticAMD
>      Brand:  AMD Phenom(tm) II X6 1090T Processor
>      Family: 16  model: 10  stepping:  0
>      CPU features: apic clfsh cmov cx8 cx16 htt lahf_lm misalignsse mmx msr nonstop_tsc pdpe1gb popcnt pse rdtscp sse2 sse3 sse4a
>      SIMD instructions most likely to fit this hardware: SSE2
>      SIMD instructions selected at GROMACS compile time: SSE2
>    GPU info:
>      Number of GPUs detected: 1
>      #0: name: Pitcairn, vendor: Advanced Micro Devices, Inc., device version: OpenCL 1.2 AMD-APP (1642.5), stat: compatible
>
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> B. Hess and C. Kutzner and D. van der Spoel and E. Lindahl
> GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable
> molecular simulation
> J. Chem. Theory Comput. 4 (2008) pp. 435-447
> -------- -------- --- Thank You --- -------- --------
>
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and H. J. C. Berendsen
> GROMACS: Fast, Flexible and Free
> J. Comp. Chem. 26 (2005) pp. 1701-1719
> -------- -------- --- Thank You --- -------- --------
>
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> E. Lindahl and B. Hess and D. van der Spoel
> GROMACS 3.0: A package for molecular simulation and trajectory analysis
> J. Mol. Mod. 7 (2001) pp. 306-317
> -------- -------- --- Thank You --- -------- --------
>
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> H. J. C. Berendsen, D. van der Spoel and R. van Drunen
> GROMACS: A message-passing parallel molecular dynamics implementation
> Comp. Phys. Comm. 91 (1995) pp. 43-56
> -------- -------- --- Thank You --- -------- --------
>
> Input Parameters:
>     integrator                     = md
>     tinit                          = 0
>     dt                             = 0.001
>     nsteps                         = 100000
>     init-step                      = 0
>     simulation-part                = 1
>     comm-mode                      = Linear
>     nstcomm                        = 100
>     bd-fric                        = 0
>     ld-seed                        = 665864
>     emtol                          = 10
>     emstep                         = 0.01
>     niter                          = 20
>     fcstep                         = 0
>     nstcgsteep                     = 1000
>     nbfgscorr                      = 10
>     rtpi                           = 0.05
>     nstxout                        = 0
>     nstvout                        = 0
>     nstfout                        = 0
>     nstlog                         = 100
>     nstcalcenergy                  = 100
>     nstenergy                      = 1000
>     nstxout-compressed             = 100
>     compressed-x-precision         = 1000
>     cutoff-scheme                  = Verlet
>     nstlist                        = 25
>     ns-type                        = Grid
>     pbc                            = xyz
>     periodic-molecules             = FALSE
>     verlet-buffer-tolerance        = 0.005
>     rlist                          = 1.097
>     rlistlong                      = 1.097
>     nstcalclr                      = 25
>     coulombtype                    = Reaction-Field
>     coulomb-modifier               = Potential-shift
>     rcoulomb-switch                = 0
>     rcoulomb                       = 1
>     epsilon-r                      = 1
>     epsilon-rf                     = inf
>     vdw-type                       = Cut-off
>     vdw-modifier                   = Potential-shift
>     rvdw-switch                    = 0
>     rvdw                           = 1
>     DispCorr                       = No
>     table-extension                = 1
>     fourierspacing                 = 0.16
>     fourier-nx                     = 0
>     fourier-ny                     = 0
>     fourier-nz                     = 0
>     pme-order                      = 4
>     ewald-rtol                     = 1e-005
>     ewald-rtol-lj                  = 0.001
>     lj-pme-comb-rule               = Geometric
>     ewald-geometry                 = 0
>     epsilon-surface                = 0
>     implicit-solvent               = No
>     gb-algorithm                   = Still
>     nstgbradii                     = 1
>     rgbradii                       = 1
>     gb-epsilon-solvent             = 80
>     gb-saltconc                    = 0
>     gb-obc-alpha                   = 1
>     gb-obc-beta                    = 0.8
>     gb-obc-gamma                   = 4.85
>     gb-dielectric-offset           = 0.009
>     sa-algorithm                   = Ace-approximation
>     sa-surface-tension             = 2.05016
>     tcoupl                         = V-rescale
>     nsttcouple                     = 25
>     nh-chain-length                = 0
>     print-nose-hoover-chain-variables = FALSE
>     pcoupl                         = No
>     pcoupltype                     = Isotropic
>     nstpcouple                     = -1
>     tau-p                          = 2
>     compressibility (3x3):
>        compressibility[    0]={0.00000e+000, 0.00000e+000, 0.00000e+000}
>        compressibility[    1]={0.00000e+000, 0.00000e+000, 0.00000e+000}
>        compressibility[    2]={0.00000e+000, 0.00000e+000, 0.00000e+000}
>     ref-p (3x3):
>        ref-p[    0]={0.00000e+000, 0.00000e+000, 0.00000e+000}
>        ref-p[    1]={0.00000e+000, 0.00000e+000, 0.00000e+000}
>        ref-p[    2]={0.00000e+000, 0.00000e+000, 0.00000e+000}
>     refcoord-scaling               = No
>     posres-com (3):
>        posres-com[0]=0.00000e+000
>        posres-com[1]=0.00000e+000
>        posres-com[2]=0.00000e+000
>     posres-comB (3):
>        posres-comB[0]=0.00000e+000
>        posres-comB[1]=0.00000e+000
>        posres-comB[2]=0.00000e+000
>     QMMM                           = FALSE
>     QMconstraints                  = 0
>     QMMMscheme                     = 0
>     MMChargeScaleFactor            = 1
> qm-opts:
>     ngQM                           = 0
>     constraint-algorithm           = Lincs
>     continuation                   = FALSE
>     Shake-SOR                      = FALSE
>     shake-tol                      = 0.0001
>     lincs-order                    = 4
>     lincs-iter                     = 1
>     lincs-warnangle                = 30
>     nwall                          = 0
>     wall-type                      = 9-3
>     wall-r-linpot                  = -1
>     wall-atomtype[0]               = -1
>     wall-atomtype[1]               = -1
>     wall-density[0]                = 0
>     wall-density[1]                = 0
>     wall-ewald-zfac                = 3
>     pull                           = FALSE
>     rotation                       = FALSE
>     interactiveMD                  = FALSE
>     disre                          = No
>     disre-weighting                = Conservative
>     disre-mixed                    = FALSE
>     dr-fc                          = 1000
>     dr-tau                         = 0
>     nstdisreout                    = 100
>     orire-fc                       = 0
>     orire-tau                      = 0
>     nstorireout                    = 100
>     free-energy                    = no
>     cos-acceleration               = 0
>     deform (3x3):
>        deform[    0]={0.00000e+000, 0.00000e+000, 0.00000e+000}
>        deform[    1]={0.00000e+000, 0.00000e+000, 0.00000e+000}
>        deform[    2]={0.00000e+000, 0.00000e+000, 0.00000e+000}
>     simulated-tempering            = FALSE
>     E-x:
>        n = 0
>     E-xt:
>        n = 0
>     E-y:
>        n = 0
>     E-yt:
>        n = 0
>     E-z:
>        n = 0
>     E-zt:
>        n = 0
>     swapcoords                     = no
>     adress                         = FALSE
>     userint1                       = 0
>     userint2                       = 0
>     userint3                       = 0
>     userint4                       = 0
>     userreal1                      = 0
>     userreal2                      = 0
>     userreal3                      = 0
>     userreal4                      = 0
> grpopts:
>     nrdf:       17493
>     ref-t:         300
>     tau-t:         0.2
> annealing:          No
> annealing-npoints:           0
>     acc:                   0           0           0
>     nfreeze:           N           N           N
>     energygrp-flags[  0]: 0
>



-- 
ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
865-241-1537, ORNL PO BOX 2008 MS6309