[gmx-users] GROMACS 4.6 with GPU acceleration (double precision)
szilard.pall at cbr.su.se
Mon Apr 22 14:31:17 CEST 2013
On Tue, Apr 9, 2013 at 6:52 PM, David van der Spoel <spoel at xray.bmc.uu.se> wrote:
> On 2013-04-09 18:06, Mikhail Stukan wrote:
>> Dear experts,
>> I have the following question. I am trying to compile GROMACS 4.6.1 with
>> GPU acceleration and have the following diagnostics:
>> # cmake .. -DGMX_DOUBLE=ON -DGMX_BUILD_OWN_FFTW=ON -DGMX_GPU=ON
>> -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda -DCUDA_HOST_COMPILER=/usr/bin/gcc
>> -DCUDA_PROPAGATE_HOST_FLAGS=OFF
>> CMake Error at cmake/gmxManageGPU.cmake:46 (message):
>> GPU acceleration is not available in double precision!
>> Call Stack (most recent call first):
>> CMakeLists.txt:143 (include)
>> Are there any plans to have double precision with GPU acceleration in the
>> coming version of GROMACS, or will this not happen in the near future?
>> The hardware does not support it yet AFAIK.
It does; NVIDIA GPUs do have double precision support. The error message
seems clear enough to me, but as this question has come up multiple times,
let me elaborate on the reasons for not including double-precision
CUDA kernels in the 4.6 release.
The peak performance of double precision arithmetic on NVIDIA hardware can
be anywhere from 2x to 24x lower than that of single precision. While most
professional cards (Tesla, Quadro) have double precision performance in the
reasonable/expected range of a half to a third of single precision,
consumer GPUs (GeForce) are crippled in double precision. Most notably,
the latest (Kepler) GeForce took the differentiation to a new level by
providing 24x lower performance in double than in single precision. What
makes things even more confusing from a user's perspective is that while
the Tesla K10 is a professional card, it also suffers from the reduced
double precision performance (as do some of the Quadros as well like the
Additionally, the GPU-accelerated kernels rely on highly efficient atomic
operations, which are available only in single precision. Emulating
these in double precision is possible (using atomicCAS), but it results in
an additional, considerable performance hit.
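
To make the atomicCAS emulation concrete, here is a minimal sketch of the
well-known compare-and-swap retry loop for a double precision atomic add.
This is not the GROMACS kernel code; the names atomicAddDouble and
accumulate are made up for illustration, and error checking is omitted for
brevity. Every failed swap forces a retry, which is where the extra cost
under contention comes from.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Double precision atomic add emulated with a 64-bit compare-and-swap loop.
// atomicAddDouble is an illustrative name, not a CUDA or GROMACS function.
__device__ double atomicAddDouble(double* address, double val)
{
    unsigned long long int* address_as_ull = (unsigned long long int*)address;
    unsigned long long int old = *address_as_ull;
    unsigned long long int assumed;
    do {
        assumed = old;
        // Swap in (assumed + val); this succeeds only if no other thread
        // modified the value since we read it.
        old = atomicCAS(address_as_ull, assumed,
                        __double_as_longlong(val + __longlong_as_double(assumed)));
    } while (assumed != old);  // compare bit patterns, so NaN cannot hang the loop
    return __longlong_as_double(old);
}

// Every thread adds 1.0 to the same address: a worst case for contention.
__global__ void accumulate(double* sum, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        atomicAddDouble(sum, 1.0);
    }
}

int main()
{
    const int n = 1 << 16;
    double* d_sum = nullptr;
    double h_sum = 0.0;

    // Error checking of the CUDA calls is omitted to keep the sketch short.
    cudaMalloc(&d_sum, sizeof(double));
    cudaMemcpy(d_sum, &h_sum, sizeof(double), cudaMemcpyHostToDevice);

    accumulate<<<(n + 255) / 256, 256>>>(d_sum, n);

    cudaMemcpy(&h_sum, d_sum, sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(d_sum);

    printf("sum = %.1f, number of additions = %d\n", h_sum, n);
    return h_sum == (double)n ? 0 : 1;
}
```

The single precision counterpart is one hardware atomicAdd on a float, with
no loop and no retries.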
I guess it is clear that had we included double precision GPU acceleration
support in the 4.6 release, a large number of users could/would have ended
up disappointed by the GPU acceleration performance simply because their
GPU (consumer or professional) happens to have reduced double precision
performance. Of course, detecting such cases and warning about them or
automatically switching off GPU acceleration is possible, but with the
amount of (human) resources available and considering that double precision
is, while sometimes wanted, rarely *needed*, we decided to not provide GPU
acceleration in double precision.
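
For illustration only, such a check could look roughly like the sketch
below. It assumes a CUDA toolkit that exposes the
cudaDevAttrSingleToDoublePrecisionPerfRatio device attribute, which is
newer than the toolkits contemporary with GROMACS 4.6, so please verify it
against your toolkit's documentation. The cutoff kMaxAcceptableRatio is a
hypothetical value derived from the "half to third" range mentioned above,
not anything GROMACS uses.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical cutoff: professional cards are described above as running
// double precision at a half to a third of single precision speed, so
// anything with a ratio above 3 is flagged.
static const int kMaxAcceptableRatio = 3;

int main()
{
    int deviceCount = 0;
    if (cudaGetDeviceCount(&deviceCount) != cudaSuccess || deviceCount == 0) {
        printf("No CUDA device found.\n");
        return 1;
    }

    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaDeviceProp prop;
        int ratio = 0;  // single precision throughput / double precision throughput

        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess ||
            cudaDeviceGetAttribute(&ratio,
                                   cudaDevAttrSingleToDoublePrecisionPerfRatio,
                                   dev) != cudaSuccess) {
            printf("Device %d: could not query properties.\n", dev);
            continue;
        }

        if (ratio > kMaxAcceptableRatio) {
            printf("Device %d (%s): double precision is about %dx slower than "
                   "single precision; GPU acceleration in double precision is "
                   "unlikely to pay off.\n", dev, prop.name, ratio);
        } else {
            printf("Device %d (%s): single-to-double performance ratio is %d.\n",
                   dev, prop.name, ratio);
        }
    }
    return 0;
}
```

Whether to merely warn or to switch GPU acceleration off entirely in the
flagged case is a policy decision the sketch leaves open.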
We do intend to implement other precision modes in the future (including
double precision and probably an additional mixed precision), but such
features will come in the next major release at the earliest.
If anyone thinks that double precision GPU kernels are a priority, feel
free to make yourself heard *and* provide your use case in the comments
section of the feature request page I just opened:
> Thanks and regards,
> David van der Spoel, Ph.D., Professor of Biology
> Dept. of Cell & Molec. Biol., Uppsala University.
> Box 596, 75124 Uppsala, Sweden. Phone: +46184714205.
> spoel at xray.bmc.uu.se http://folding.bmc.uu.se