[gmx-users] Short how-to for installing GROMACS with CUDA ...
pall.szilard at gmail.com
Thu Dec 17 18:06:29 CET 2015
PS: One more thing. If the CUDA SDK samples linked against the CUDA runtime
library (libcudart) really did work and gmx/mdrun did not (assuming the
same driver/kernel module), the only reasonable explanation I can think of
is that the two were using different runtimes. Note that GROMACS sets
RPATH, so it neither needs nor is affected by LD_LIBRARY_PATH tinkering,
while the SDK samples need LD_LIBRARY_PATH to point to the correct runtime
library.
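To illustrate the difference, a minimal sketch; /usr/local/cuda/lib64 is only an assumed toolkit location, adjust it to where your CUDA toolkit is actually installed:

```shell
# GROMACS binaries carry an RPATH to libcudart set at build time, so they
# need no environment help. SDK samples do not, so they typically need
# LD_LIBRARY_PATH to point at the toolkit's library directory.
# /usr/local/cuda/lib64 is an assumed location; adjust as needed.
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
echo "$LD_LIBRARY_PATH"
```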
On Thu, Dec 17, 2015 at 6:03 PM, Szilárd Páll <pall.szilard at gmail.com> wrote:
> On Thu, Dec 17, 2015 at 4:40 PM, Stéphane Téletchéa <
> stephane.teletchea at univ-nantes.fr> wrote:
>> On 17/12/2015 12:16, Szilárd Páll wrote:
>> Dear Szilárd,
>>> On Wed, Dec 16, 2015 at 6:21 PM, Téletchéa Stéphane <
>>> stephane.teletchea at univ-nantes.fr> wrote:
>>>> Dear all,
>>>> I have struggled recently in getting GROMACS aware of CUDA.
>>>> After searching for a "while" (one afternoon), I removed the
>>>> drivers and packages (which worked in the past) and installed
>>>> "from scratch".
>>> From your article:
>>> "It seems that with very recent revisions of gromacs it is not possible
>>> anymore to use the bundled NVIDIA packages from the official Ubuntu
>>> repositories. "
>>> The "incompatibility" can only come from a driver-runtime mismatch or a
>>> broken installation, IMO. There is no incompatibility, at least none that
>>> I know of.
>> Well, this is what I thought also, but nvidia-smi worked correctly, and
>> the cuda examples too, so it seems gromacs is more sensitive to the minor
>> driver version (352.63 in the ubuntu repositories, 352.39 in the driver
>> bundled with the cuda package).
> I have never encountered cases where GROMACS would be "more sensitive"
> than other codes, e.g. an SDK sample.
> Secondly, normally there should not be any explicit dependence of GROMACS
> binaries on NVIDIA driver components. Binaries get linked against
> libcudart, the CUDA runtime part of the toolkit installation (I suggest you
> verify that with ldd, IIRC some older cmake did link against libcuda.so
> too). Hence, any dependence on the driver version _should_ only be indirect.
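One way to do that verification; the gmx install prefix below is an assumption, adjust it or set GMX_BIN to your own path:

```shell
# Show the RPATH/RUNPATH baked into the binary and the CUDA libraries it
# actually resolves at run time. The install prefix below is an example.
GMX_BIN=${GMX_BIN:-/usr/local/gromacs/bin/gmx}
if [ -x "$GMX_BIN" ]; then
    readelf -d "$GMX_BIN" | grep -E 'RPATH|RUNPATH' || true
    ldd "$GMX_BIN" | grep -E 'libcudart|libcuda\.so' || true
else
    echo "no gmx binary at $GMX_BIN; set GMX_BIN to your install path"
fi
```

A dependence on libcudart.so only (not libcuda.so) in the ldd output confirms that the binary depends on the driver only indirectly.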
>> If you wish I can try to reproduce with more debug information, but this
>> is what led me to
>> the article, because other cuda-aware codes seemed to not be affected.
> That would be useful, both to satisfy my/our curiosity and to let the
> community know where the pitfalls are.
> I suggest you purge your NVIDIA driver and toolkit and start with a fresh
> copy. Note that if you do not purge an NVIDIA driver installed through
> the binary blob before installing the packages, you will get some
> components overwritten and potentially end up with a big mess. The same
> can happen with the toolkit.
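On Ubuntu, the cleanup could look roughly like the sketch below; the package patterns are assumptions, so first check what is actually installed with `dpkg -l | grep -i nvidia`:

```shell
# Remove distro-packaged driver/toolkit components (patterns are examples):
sudo apt-get purge 'nvidia-*' 'cuda*'
sudo apt-get autoremove
# If the driver came from the NVIDIA .run "binary blob", remove it with
# the uninstaller that installer created (if present):
sudo nvidia-uninstall
```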
>>> If your installation works, i.e. if i) nvidia-smi works and lists your
>>> GPU(s) (meaning that the driver works) and ii) you can compile and run
>>> CUDA SDK examples (meaning that the CUDA runtime works and is compatible
>>> with the driver), GROMACS should work too!
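The two checks above could be scripted roughly as follows; the samples path assumes a CUDA 7.5 toolkit with the SDK samples copied into the home directory, which may not match your setup:

```shell
# i) Driver check: nvidia-smi should list the installed GPU(s).
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi
else
    echo "nvidia-smi not found: driver missing or not on PATH"
fi

# ii) Runtime check: build and run an SDK sample (path is an assumption).
SAMPLES=$HOME/NVIDIA_CUDA-7.5_Samples/1_Utilities/deviceQuery
if [ -d "$SAMPLES" ]; then
    (cd "$SAMPLES" && make && ./deviceQuery)
else
    echo "SDK samples not found at $SAMPLES"
fi
```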
>> Well, no. We can dig into this if you wish; GROMACS seemed more sensitive
>> than other cuda codes, but I can provide details if you wish (and open a
>> bug report).
>>> I suggest you add these two steps to your article because they are quite
>>> essential in diagnosing whether there is something wrong with the driver
>>> or the runtime (/runtime-driver combo).
>> Well, this seemed weird to me also, but the point is that previously I
>> could use the binaries from the ubuntu repositories and get a proper
>> CUDA-aware build, and with the recent packages this is no longer the case.
>> I'm building gromacs with the same scripts I have been using since 2007,
>> just adjusting paths or options as they evolve, but this is the first time
>> I have had to dig this much. It may be a side effect (from the ubuntu
>> packages), but the files from their repository were properly packaged, I
>> think. Again, I can go into details in a bug report if needed.
>>>> It seems this is both coming from NVIDIA requiring the
>>>> "GPU deployment kit" in addition to the cuda toolkit, and from GROMACS
>>>> only warning about the missing NVML (but not failing while asking for it).
>>> Note that NVML is *recommended* but not mandatory; it is used to control
>>> the application clocks on the GPU. Note that you can set the application
>>> clocks manually with nvidia-smi.
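Setting the application clocks by hand could look like this; the GPU index 0 and the 3004,875 MHz (memory,graphics) pair are examples only, so pick a pair that the query actually lists for your card:

```shell
# Query the application clock pairs the GPU supports, then set one.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -i 0 -q -d SUPPORTED_CLOCKS
    sudo nvidia-smi -i 0 -ac 3004,875   # set application clocks (example pair)
    # sudo nvidia-smi -i 0 -rac         # reset to driver defaults
else
    echo "nvidia-smi not available"
fi
```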
>>> (For more details see this blog post by Jiri Kraus:
>> Many thanks for the link. I only read the NVML description page, and got
>> more or less to the conclusion that it *should* be helpful. It may also
>> lead to more lock-ups, I think, in real scenarios where dust enters the
>> fan (on personal workstations like in my case, for instance), but I'll
>> see from experience.
>> I'll add a note about this, thanks for the feedback.
>> Gromacs Users mailing list
>> * Please search the archive at
>> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>> * For (un)subscribe requests visit
>> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
>> send a mail to gmx-users-request at gromacs.org.
More information about the gromacs.org_gmx-users mailing list