[gmx-users] WG: WG: Issue with CUDA and gromacs

Szilárd Páll pall.szilard at gmail.com
Fri Mar 15 16:27:29 CET 2019


Hi,

Please share log files via an external service; attachments are not
accepted on the list.

Also, when checking the error with the patch supplied, please run the
following cases -- no long runs are needed, I just want to know which of
these run and which don't:
- -ntmpi 1 -ntomp 22 -pin on
- -ntmpi 1 -ntomp 22 -pin off
- -ntmpi 1 -ntomp 23 -pin off
- -ntmpi 1 -ntomp 23 -pinstride 1 -pin on
- -ntmpi 1 -ntomp 23 -pinstride 2 -pin on
- -ntmpi 23 -ntomp 1 -pinstride 1 -pin on
- -ntmpi 23 -ntomp 1 -pinstride 2 -pin on
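
If it helps, the cases above could be scripted roughly as follows. This is
only a sketch under assumptions: it expects a `gmx` binary on the PATH and a
run input file called topol.tpr, and the variable names (TPR, NSTEPS,
DRY_RUN) and the case*.out file names are made up for illustration. By
default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: try each thread/pinning combination with a very short run and
# report which ones finish without error. TPR, NSTEPS and DRY_RUN are
# illustrative assumptions, not GROMACS settings.
TPR="${TPR:-topol.tpr}"      # hypothetical run input file name
NSTEPS="${NSTEPS:-1000}"     # short run; only pass/fail matters here
DRY_RUN="${DRY_RUN:-1}"      # 1 = only print the commands

# One set of mdrun options per line, as listed in the message above.
list_cases() {
cat <<'EOF'
-ntmpi 1 -ntomp 22 -pin on
-ntmpi 1 -ntomp 22 -pin off
-ntmpi 1 -ntomp 23 -pin off
-ntmpi 1 -ntomp 23 -pinstride 1 -pin on
-ntmpi 1 -ntomp 23 -pinstride 2 -pin on
-ntmpi 23 -ntomp 1 -pinstride 1 -pin on
-ntmpi 23 -ntomp 1 -pinstride 2 -pin on
EOF
}

run_cases() {
    i=0
    list_cases | while read -r opts; do
        i=$((i + 1))
        # -deffnm gives every case its own output files (case1.log, ...).
        cmd="gmx mdrun -s $TPR -deffnm case$i -nsteps $NSTEPS $opts"
        if [ "$DRY_RUN" = 1 ]; then
            echo "case $i: $cmd"
        # stdin is redirected so mdrun cannot swallow the remaining cases.
        elif $cmd > "case$i.out" 2>&1 < /dev/null; then
            echo "case $i: OK      ($opts)"
        else
            echo "case $i: FAILED  ($opts)"
        fi
    done
}

run_cases
```

Running it with DRY_RUN=0 launches the runs for real; the resulting
case*.log files are the ones worth sharing via an external service.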

Thanks,
--
Szilárd


On Fri, Mar 15, 2019 at 4:04 PM Tafelmeier, Stefanie <
Stefanie.Tafelmeier at zae-bayern.de> wrote:

> Hi Szilárd,
>
> thanks for the quick reply.
> About the first suggestion, I'll try and give feedback soon.
>
> Regarding the second, I attached the log file for the case of
> mdrun -v -nt 25
> which ends in the known error message.
>
> Again, thanks a lot for your information and help.
>
> Best wishes,
> Steffi
>
>
>
> -----Original Message-----
> From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se [mailto:
> gromacs.org_gmx-users-bounces at maillist.sys.kth.se] On behalf of Szilárd
> Páll
> Sent: Friday, 15 March 2019 15:30
> To: Discussion list for GROMACS users
> Subject: Re: [gmx-users] WG: WG: Issue with CUDA and gromacs
>
> Hi Stefanie,
>
> Unless and until the error and performance-related concerns prove to be
> related, let's keep those separate.
>
> I'd first focus on the former. To be honest, I've never encountered an
> issue where the run aborts with that error once you use more than a
> certain number of threads. To investigate further, can you please apply
> the following patch file, which will hopefully give more context to the error:
> https://termbin.com/uhgp
> (e.g. you can execute the following to accomplish that:
> curl https://termbin.com/uhgp > devicebuffer.cuh.patch && patch -p0 <
> devicebuffer.cuh.patch)
>
> Regarding the performance-related questions, can you please share a full
> log file of the runs so we can see the machine config, simulation
> system/settings, etc.? Without that it is hard to judge what's best for your
> case. However, if you only have a single GPU (which seems to be the case
> based on the log excerpts) alongside those two rather beefy CPUs, then you
> will likely not get much benefit from using all cores, and it is normal that
> you see little to no improvement from using cores of a second CPU socket.
>
> Cheers,
> --
> Szilárd
>
>
> On Thu, Mar 14, 2019 at 12:47 PM Tafelmeier, Stefanie <
> Stefanie.Tafelmeier at zae-bayern.de> wrote:
>
> > Dear all,
> >
> > I was not sure if my previous email reached you, but again many thanks
> > for your reply, Szilárd.
> >
> > As written below, we are still facing a problem with the performance of
> > our workstation.
> > I wrote before because of the error message that keeps occurring for
> > mdrun simulations:
> >
> > Assertion failed:
> > Condition: stat == cudaSuccess
> > Asynchronous H2D copy failed
> >
> > As I mentioned, all installed versions (GROMACS, CUDA, nvcc, gcc) are
> > now the newest ones.
> >
> > If I run mdrun without further settings, it leads to this error
> > message. If I run it and set the number of threads directly, mdrun
> > performs well, but only for -nt values between 1 and 22. Higher values
> > again lead to the aforementioned error message.
> >
> > In order to investigate in more detail, I tried different values for
> > -nt, -ntmpi and -ntomp, also combined with -npme:
> > -       The best performance in terms of ns/day is with -nt 22 or
> > -ntomp 22 alone. But then only 22 threads are involved, which is fine
> > if I run more than one mdrun simultaneously, as I can distribute
> > the other 66 threads. The GPU usage is then around 65%.
> > -       A similarly good performance is reached with mdrun -ntmpi 4 -ntomp
> > 18 -npme 1 -pme gpu -nb gpu. But then 44 threads are involved. The GPU
> > usage is then around 50%.
> >
> > I read the information on
> > http://manual.gromacs.org/documentation/5.1/user-guide/mdrun-performance.html
> > which was very helpful, but some things are still not clear to me:
> > I was wondering if there is any other way to enhance the performance? Or
> > what is the reason that the -nt maximum is at 22 threads? Could this be
> > connected to the sockets (see details below) of our workstation?
> > It is not clear to me how a number of threads (-nt) higher than 22 can
> > lead to the error regarding the asynchronous H2D copy.
> >
> > Please excuse all these questions. I would appreciate it a lot if you
> > had a hint for this problem as well.
> >
> > Best regards,
> > Steffi
> >
> > -----
> >
> > The workstation details are:
> > Running on 1 node with total 44 cores, 88 logical cores, 1 compatible GPU
> > Hardware detected:
> >
> >   CPU info:
> >     Vendor: Intel
> >     Brand:  Intel(R) Xeon(R) Gold 6152 CPU @ 2.10GHz
> >     Family: 6   Model: 85   Stepping: 4
> >     Features: aes apic avx avx2 avx512f avx512cd avx512bw avx512vl clfsh
> > cmov cx8 cx16 f16c fma hle htt intel lahf mmx msr nonstop_tsc pcid
> > pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp rtm sse2 sse3 sse4.1 sse4.2
> > ssse3 tdt x2apic
> >
> >     Number of AVX-512 FMA units: 2
> >   Hardware topology: Basic
> >     Sockets, cores, and logical processors:
> >       Socket  0: [   0  44] [   1  45] [   2  46] [   3  47] [   4  48] [   5  49] [   6  50] [   7  51] [   8  52] [   9  53] [  10  54] [  11  55] [  12  56] [  13  57] [  14  58] [  15  59] [  16  60] [  17  61] [  18  62] [  19  63] [  20  64] [  21  65]
> >       Socket  1: [  22  66] [  23  67] [  24  68] [  25  69] [  26  70] [  27  71] [  28  72] [  29  73] [  30  74] [  31  75] [  32  76] [  33  77] [  34  78] [  35  79] [  36  80] [  37  81] [  38  82] [  39  83] [  40  84] [  41  85] [  42  86] [  43  87]
> >   GPU info:
> >     Number of GPUs detected: 1
> >     #0: NVIDIA Quadro P6000, compute cap.: 6.1, ECC:  no, stat: compatible
> >
> > -----
> >
> >
> >
> > -----Original Message-----
> > From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se [mailto:
> > gromacs.org_gmx-users-bounces at maillist.sys.kth.se] On behalf of
> > Szilárd Páll
> > Sent: Thursday, 31 January 2019 17:15
> > To: Discussion list for GROMACS users
> > Subject: Re: [gmx-users] WG: Issue with CUDA and gromacs
> >
> > On Thu, Jan 31, 2019 at 2:14 PM Szilárd Páll <pall.szilard at gmail.com>
> > wrote:
> > >
> > > On Wed, Jan 30, 2019 at 5:15 PM Tafelmeier, Stefanie
> > > <Stefanie.Tafelmeier at zae-bayern.de> wrote:
> > > >
> > > > Dear all,
> > > >
> > > > We are facing an issue with the CUDA toolkit.
> > > > We tried several combinations of gromacs versions and CUDA Toolkits.
> > > > No Toolkit older than 9.2 could be tried, as there are no NVIDIA
> > > > drivers available for a Quadro P6000.
> > > > Gromacs
> > >
> > > Install the latest 410.xx drivers and it will work; the NVIDIA driver
> > > download website (https://www.nvidia.com/Download/index.aspx)
> > > recommends 410.93.
> > >
> > > Here's a CUDA 10-compatible driver running on a system with
> > > a P6000: https://termbin.com/ofzo
> >
> > Sorry, I misread that as "CUDA >=9.2 was not possible".
> >
> > Note that the driver is backward compatible, so you can use a new
> > driver with older CUDA versions.
> >
> > Also note that the oldest driver NVIDIA claims to have P6000 support
> > is 390.59, which is, as far as I know, one generation older than the 396
> > that the CUDA 9.2 toolkit came with. This is, however, not something I'd
> > recommend pursuing; use a new driver from the official site with any
> > CUDA version that GROMACS supports and it should be fine.
> >
> > >
> > > > Gromacs | CUDA | Error message
> > > > 2019    | 10.0 | gmx mdrun: Assertion failed: Condition: stat == cudaSuccess; Asynchronous H2D copy failed
> > > > 2019    | 9.2  | gmx mdrun: Assertion failed: Condition: stat == cudaSuccess; Asynchronous H2D copy failed
> > > > 2018.5  | 9.2  | gmx mdrun: Fatal error: HtoD cudaMemcpyAsync failed: invalid argument
> > >
> > > Can we get some more details on these, please? Complete log files
> > > would be a good start.
> > >
> > > > 5.1.5   | 9.2  | Installation make: nvcc fatal   : Unsupported gpu architecture 'compute_20'*
> > > > 2016.2  | 9.2  | Installation make: nvcc fatal   : Unsupported gpu architecture 'compute_20'*
> > > >
> > > >
> > > > *We also tried to set the target CUDA architectures as described in
> > > > the installation guide (
> > > > manual.gromacs.org/documentation/2019/install-guide/index.html).
> > > > Unfortunately it didn't work.
> > >
> > > What does it mean that it didn't work? Can you share the command you
> > > used and what exactly did not work?
> > >
> > > For the P6000 which is a "compute capability 6.1" device (for anyone
> > > who needs to look it up, go here:
> > > https://developer.nvidia.com/cuda-gpus), you should set
> > > cmake ../ -DGMX_CUDA_TARGET_SM="61"
> > >
> > > --
> > > Szilárd
> > >
> > > > Performing simulations on CPU only always works, yet of course they
> > > > are slower than they could be when additionally using the GPU.
> > > > The issue #2761 (https://redmine.gromacs.org/issues/2762) seems
> > > > similar to our problem.
> > > > Even though this issue is still open, we wanted to ask if you can
> > > > give us any information about how to solve this problem?
> > > >
> > > > Many thanks in advance.
> > > > Best regards,
> > > > Stefanie Tafelmeier
> > > >
> > > >
> > > > Further details if necessary:
> > > > The workstation:
> > > > 2 x Xeon Gold 6152 @ 3.7 GHz (22 cores, 44 threads, AVX512)
> > > > Nvidia Quadro P6000 with 3840 Cuda-Cores
> > > >
> > > > The simulations system:
> > > > Long chain alkanes (previously used with gromacs 5.1.5 and CUDA 7.5 -
> > > > worked perfectly)
> > > >
> > > >
> > > >
> > > >
> > > > ZAE Bayern
> > > > Stefanie Tafelmeier
> > > > Bereich Energiespeicherung/Division Energy Storage
> > > > Thermische Energiespeicher/Thermal Energy Storage
> > > > Walther-Meißner-Str. 6
> > > > 85748 Garching
> > > >
> > > > Tel.: +49 89 329442-75
> > > > Fax: +49 89 329442-12
> > > > Stefanie.tafelmeier at zae-bayern.de
> > > > http://www.zae-bayern.de
> > > >
> > > >
> > > > ZAE Bayern - Bayerisches Zentrum für Angewandte Energieforschung e. V.
> > > > Vorstand/Board:
> > > > Prof. Dr. Hartmut Spliethoff (Vorsitzender/Chairman),
> > > > Prof. Dr. Vladimir Dyakonov
> > > > Sitz/Registered Office: Würzburg
> > > > Registergericht/Register Court: Amtsgericht Würzburg
> > > > Registernummer/Register Number: VR 1386
> > > >
> > > > Any declarations of intent, such as quotations, orders, applications
> > > > and contracts, are legally binding for ZAE Bayern only if expressed in a
> > > > written and duly signed form. This e-mail is intended solely for use by
> > > > the recipient(s) named above. Any unauthorised disclosure, use or
> > > > dissemination, whether in whole or in part, is prohibited. If you have
> > > > received this e-mail in error, please notify the sender immediately and
> > > > delete this e-mail.
> > > >
> > > >
> > > > --
> > > > Gromacs Users mailing list
> > > >
> > > > * Please search the archive at
> > > > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > > > posting!
> > > >
> > > > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> > > >
> > > > * For (un)subscribe requests visit
> > > > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > > > send a mail to gmx-users-request at gromacs.org.

