[gmx-users] WG: Issue with CUDA and gromacs

Tafelmeier, Stefanie Stefanie.Tafelmeier at zae-bayern.de
Mon Feb 25 13:03:58 CET 2019


Many thanks, Páll, for your reply.

As you suggested, we have now installed the most recent versions: GROMACS 2019.1, CUDA 10 with driver version 415.27, and gcc 7.3.0.

Running gmx mdrun still gives the error:
Assertion failed:
Condition: stat == cudaSuccess
Asynchronous H2D copy failed

To get more detailed information about the source of the error, I ran the regression tests.
Tests 42 and 46 failed.

I am trying to understand what these tests check and why they did not pass, and I would greatly appreciate it if someone could send me an explanation of what might have caused the failures, or, if someone has faced the same problem before, a remedy.
The output of the failing tests is given below.

Many thanks in advance for your help.

Best regards,
Steffi


---------------------------------------------------------------------------------------------
42/46 Test #42: regressiontests/complex .............***Failed   88.99 sec
                      :-) GROMACS - gmx mdrun, 2019.1 (-:

                            GROMACS is written by:
     Emile Apol      Rossen Apostolov      Paul Bauer     Herman J.C. Berendsen
    Par Bjelkmar      Christian Blau   Viacheslav Bolnykh     Kevin Boyd
 Aldert van Buuren   Rudi van Drunen     Anton Feenstra       Alan Gray
  Gerrit Groenhof     Anca Hamuraru    Vincent Hindriksen  M. Eric Irrgang
  Aleksei Iupinov   Christoph Junghans     Joe Jordan     Dimitrios Karkoulis
    Peter Kasson        Jiri Kraus      Carsten Kutzner      Per Larsson
  Justin A. Lemkul    Viveca Lindahl    Magnus Lundborg     Erik Marklund
    Pascal Merz     Pieter Meulenhoff    Teemu Murtola       Szilard Pall
    Sander Pronk      Roland Schulz      Michael Shirts    Alexey Shvetsov
   Alfons Sijbers     Peter Tieleman      Jon Vincent      Teemu Virolainen
 Christian Wennberg    Maarten Wolf
                           and the project leaders:
        Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel

Copyright (c) 1991-2000, University of Groningen, The Netherlands.
Copyright (c) 2001-2018, The GROMACS development team at
Uppsala University, Stockholm University and
the Royal Institute of Technology, Sweden.
check out http://www.gromacs.org for more information.

GROMACS is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation; either version 2.1
of the License, or (at your option) any later version.

GROMACS:      gmx mdrun, version 2019.1
Executable:   /home/pcm-mess/gromacs-2019.1/build/bin/gmx
Data prefix:  /home/pcm-mess/gromacs-2019.1 (source tree)
Working dir:  /home/pcm-mess/gromacs-2019.1/build/tests/regressiontests-2019.1
Command line:
  gmx mdrun -h


Thanx for Using GROMACS - Have a Nice Day

Mdrun cannot use the requested (or automatic) number of ranks, retrying with 8.

Abnormal return value for ' gmx mdrun    -nb cpu   -notunepme >mdrun.out 2>&1' was 1
Retrying mdrun with better settings...

Abnormal return value for ' gmx mdrun -ntmpi 1      -notunepme >mdrun.out 2>&1' was -1
FAILED. Check mdrun.out, md.log file(s) in distance_restraints for distance_restraints
FAILED. Check checkpot.out (24 errors), checkforce.out (1706 errors) file(s) in nbnxn-free-energy for nbnxn-free-energy
FAILED. Check checkpot.out (23 errors), checkforce.out (1913 errors) file(s) in nbnxn-free-energy-vv for nbnxn-free-energy-vv

Abnormal return value for ' gmx mdrun       -notunepme >mdrun.out 2>&1' was -1
FAILED. Check mdrun.out, md.log file(s) in nbnxn-vdw-force-switch for nbnxn-vdw-force-switch

Abnormal return value for ' gmx mdrun       -notunepme >mdrun.out 2>&1' was -1
FAILED. Check mdrun.out, md.log file(s) in nbnxn-vdw-potential-switch for nbnxn-vdw-potential-switch
Re-running nbnxn-vdw-potential-switch using CPU-based PME

Abnormal return value for ' gmx mdrun       -notunepme >mdrun.out 2>&1' was -1
FAILED. Check mdrun.out, md.log file(s) in nbnxn_pme for nbnxn_pme
Re-running nbnxn_pme using CPU-based PME

Abnormal return value for ' gmx mdrun -ntmpi 6      -notunepme >mdrun.out 2>&1' was 1
Retrying mdrun with better settings...

Abnormal return value for ' gmx mdrun       -notunepme >mdrun.out 2>&1' was -1
FAILED. Check mdrun.out, md.log file(s) in octahedron for octahedron
Re-running octahedron using CPU-based PME

Abnormal return value for ' gmx mdrun -ntmpi 1      -notunepme >mdrun.out 2>&1' was -1
FAILED. Check mdrun.out, md.log file(s) in orientation-restraints for orientation-restraints
Re-running orientation-restraints using CPU-based PME

Abnormal return value for ' gmx mdrun -ntmpi 1  -pme cpu    -notunepme >mdrun.out 2>&1' was -1
FAILED. Check mdrun.out, md.log file(s) in orientation-restraints/pme-cpu for orientation-restraints-pme-cpu

Abnormal return value for ' gmx mdrun       -notunepme >mdrun.out 2>&1' was -1
FAILED. Check mdrun.out, md.log file(s) in position-restraints for position-restraints
FAILED. Check checkpot.out (42 errors), checkforce.out (762 errors) file(s) in pull_constraint for pull_constraint
Re-running pull_geometry_angle using CPU-based PME
Re-running pull_geometry_angle-axis using CPU-based PME
Re-running pull_geometry_dihedral using CPU-based PME

Abnormal return value for ' gmx mdrun       -notunepme >mdrun.out 2>&1' was -1
FAILED. Check mdrun.out, md.log file(s) in swap_x for swap_x
Re-running swap_x using CPU-based PME

Abnormal return value for ' gmx mdrun       -notunepme >mdrun.out 2>&1' was -1
FAILED. Check mdrun.out, md.log file(s) in swap_y for swap_y
Re-running swap_y using CPU-based PME

Abnormal return value for ' gmx mdrun       -notunepme >mdrun.out 2>&1' was -1
FAILED. Check mdrun.out, md.log file(s) in swap_z for swap_z
Re-running swap_z using CPU-based PME
FAILED. Check checkpot.out (22 errors), checkforce.out (21 errors) file(s) in tip4p_continue for tip4p_continue
15 out of 61 complex tests FAILED

      Start 43: regressiontests/kernel
43/46 Test #43: regressiontests/kernel ..............   Passed  105.69 sec
      Start 44: regressiontests/freeenergy
44/46 Test #44: regressiontests/freeenergy ..........   Passed   16.89 sec
      Start 45: regressiontests/rotation
45/46 Test #45: regressiontests/rotation ............   Passed    6.86 sec
      Start 46: regressiontests/essentialdynamics
46/46 Test #46: regressiontests/essentialdynamics ...***Failed    6.75 sec

Abnormal return value for ' gmx mdrun      -ei sam.edi -eo flooding2.xvg >mdrun.out 2>&1' was -1
Essential dynamics tests FAILED with 1 errors!


96% tests passed, 2 tests failed out of 46

Label Time Summary:
GTest              = 111.86 sec*proc (40 tests)
IntegrationTest    =  90.22 sec*proc (5 tests)
MpiTest            =   0.06 sec*proc (3 tests)
SlowTest           =   7.19 sec*proc (1 test)
UnitTest           =  14.44 sec*proc (34 tests)

Total Test time (real) = 345.44 sec

The following tests FAILED:
         42 - regressiontests/complex (Failed)
         46 - regressiontests/essentialdynamics (Failed)
Errors while running CTest
CMakeFiles/run-ctest-nophys.dir/build.make:57: recipe for target 'CMakeFiles/run-ctest-nophys' failed
make[3]: *** [CMakeFiles/run-ctest-nophys] Error 8
CMakeFiles/Makefile2:1397: recipe for target 'CMakeFiles/run-ctest-nophys.dir/all' failed
make[2]: *** [CMakeFiles/run-ctest-nophys.dir/all] Error 2
CMakeFiles/Makefile2:1177: recipe for target 'CMakeFiles/check.dir/rule' failed
make[1]: *** [CMakeFiles/check.dir/rule] Error 2
Makefile:626: recipe for target 'check' failed
make: *** [check] Error 2

------------------------------------------------------------------------------------------------------------------------------
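
For anyone working through output like the above, a small sketch (not part of the GROMACS test harness) that pulls the failed case directories out of saved `make check` output may help decide which mdrun.out and md.log files to read first. The function name `list_failed_cases` and the sample text are illustrative assumptions; the pattern is taken only from the "FAILED. Check ... file(s) in ... for ..." lines shown above.

```shell
#!/usr/bin/env bash
# List the test-case directories reported as failed, reading saved
# "make check" output from stdin.
# Matches lines of the form:
#   FAILED. Check <files> file(s) in <dir> for <name>
# (hypothetical helper, not provided by GROMACS)
list_failed_cases() {
    sed -n 's/^FAILED\. Check .* file(s) in \([^ ]*\) for .*$/\1/p'
}

# Three lines copied from the output above; only the first two are failures.
sample='FAILED. Check mdrun.out, md.log file(s) in distance_restraints for distance_restraints
FAILED. Check checkpot.out (24 errors), checkforce.out (1706 errors) file(s) in nbnxn-free-energy for nbnxn-free-energy
Re-running swap_x using CPU-based PME'

printf '%s\n' "$sample" | list_failed_cases
```

With the full output saved to a file (for example via `make check 2>&1 | tee check.log`, where the file name is an assumption), `list_failed_cases < check.log` would print one directory per failed case. The corresponding mdrun.out files should then sit below the working directory named in the log, most likely in its `complex` subdirectory.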






-----Original Message-----
From: gromacs.org_gmx-users-bounces at maillist.sys.kth.se [mailto:gromacs.org_gmx-users-bounces at maillist.sys.kth.se] On Behalf Of Szilárd Páll
Sent: Thursday, 31 January 2019 17:15
To: Discussion list for GROMACS users
Subject: Re: [gmx-users] WG: Issue with CUDA and gromacs

On Thu, Jan 31, 2019 at 2:14 PM Szilárd Páll <pall.szilard at gmail.com> wrote:
>
> On Wed, Jan 30, 2019 at 5:15 PM Tafelmeier, Stefanie
> <Stefanie.Tafelmeier at zae-bayern.de> wrote:
> >
> > Dear all,
> >
> > We are facing an issue with the CUDA toolkit.
> > We tried several combinations of GROMACS versions and CUDA toolkits. We could not try any toolkit older than 9.2, as no matching NVIDIA drivers are available for a Quadro P6000.
> > Gromacs
>
> Install the latest 410.xx drivers and it will work; the NVIDIA driver
> download website (https://www.nvidia.com/Download/index.aspx)
> recommends 410.93.
>
> Here's a CUDA 10-compatible driver running on a system with
> a P6000: https://termbin.com/ofzo

Sorry, I misread that as "CUDA >=9.2 was not possible".

Note that the driver is backward compatible, so you can use a new
driver with older CUDA versions.

Also note that the oldest driver for which NVIDIA claims P6000 support
is 390.59, which is, as far as I know, one generation older than the
396 that the CUDA 9.2 toolkit came with. This is, however, not
something I'd recommend pursuing; use a new driver from the official
site with any CUDA version that GROMACS supports and it should be fine.
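
As a rough illustration of the compatibility point, the sketch below compares an installed driver version against a minimum required by a toolkit. The helper name `version_at_least` is made up for this example, and the minimum of 410.48 for CUDA 10.0 on Linux is an assumption that should be checked against NVIDIA's release notes. It also assumes GNU `sort` with the `-V` option.

```shell
#!/usr/bin/env bash
# Return success (0) if version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort) option.
# (hypothetical helper for illustration)
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Driver version reported at the top of this thread. On the workstation it
# could instead be read with:
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader
driver="415.27"

# ASSUMPTION: minimum Linux driver for the CUDA 10.0 toolkit; confirm in
# the CUDA release notes before relying on it.
min_for_cuda10="410.48"

if version_at_least "$driver" "$min_for_cuda10"; then
    echo "driver $driver: new enough for CUDA 10.0"
else
    echo "driver $driver: too old for CUDA 10.0"
fi
```

Because a newer driver also serves older toolkits, a check like this only needs the minimum for the newest toolkit in use.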

>
> > Gromacs | CUDA | Error message
> > --------+------+----------------------------------------------------------
> > 2019    | 10.0 | gmx mdrun: Assertion failed: Condition: stat == cudaSuccess; Asynchronous H2D copy failed
> > 2019    | 9.2  | gmx mdrun: Assertion failed: Condition: stat == cudaSuccess; Asynchronous H2D copy failed
> > 2018.5  | 9.2  | gmx mdrun: Fatal error: HtoD cudaMemcpyAsync failed: invalid argument
>
> Can we get some more details on these, please? Complete log files
> would be a good start.
>
> > Gromacs | CUDA | Error message
> > --------+------+----------------------------------------------------------
> > 5.1.5   | 9.2  | Installation make: nvcc fatal   : Unsupported gpu architecture 'compute_20'*
> > 2016.2  | 9.2  | Installation make: nvcc fatal   : Unsupported gpu architecture 'compute_20'*
> >
> >
> > *We also tried to set the target CUDA architectures as described in the installation guide (manual.gromacs.org/documentation/2019/install-guide/index.html). Unfortunately it didn't work.
>
> What does it mean that it didn't work? Can you share the command you
> used and what exactly did not work?
>
> For the P6000 which is a "compute capability 6.1" device (for anyone
> who needs to look it up, go here:
> https://developer.nvidia.com/cuda-gpus), you should set
> cmake ../ -DGMX_CUDA_TARGET_SM="61"
>
> --
> Szilárd
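
The CMake setting quoted above can be derived from the compute capability in a mechanical way. The following is only a sketch: the function name `sm_from_compute_capability` is invented for illustration, and `-DGMX_GPU=ON` is an assumption about the rest of the configuration rather than something stated in this thread. The script prints the configure command instead of running it, so it can be reviewed first.

```shell
#!/usr/bin/env bash
# Convert a compute capability such as "6.1" into the dot-free form ("61")
# used as the value of GMX_CUDA_TARGET_SM.
# (hypothetical helper for illustration)
sm_from_compute_capability() {
    printf '%s\n' "$1" | tr -d '.'
}

# 6.1 is the compute capability given above for the Quadro P6000.
sm="$(sm_from_compute_capability "6.1")"

# Print the configure command rather than executing it.
# ASSUMPTION: -DGMX_GPU=ON is how GPU support is enabled in this build.
echo "cmake ../ -DGMX_GPU=ON -DGMX_CUDA_TARGET_SM=\"$sm\""
```

CMake caches `-D` values in the build directory, so after changing the target it is safer to configure in a fresh build directory, or to confirm the stored value with `grep GMX_CUDA_TARGET_SM CMakeCache.txt`.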
>
> > Performing simulations on the CPU only always works, yet they are of course slower than they could be when additionally using the GPU.
> > The issue #2761 (https://redmine.gromacs.org/issues/2762) seems similar to our problem.
> > Even though this issue is still open, we wanted to ask if you can give us any information about how to solve this problem?
> >
> > Many thanks in advance.
> > Best regards,
> > Stefanie Tafelmeier
> >
> >
> > Further details if necessary:
> > The workstation:
> > 2 x Xeon Gold 6152 @ 3.7 GHz (22 cores, 44 threads, AVX512)
> > Nvidia Quadro P6000 with 3840 Cuda-Cores
> >
> > The simulation system:
> > Long chain alkanes (previously used with gromacs 5.1.5 and CUDA 7.5 - worked perfectly)
> >
> >
> >
> >
> > ZAE Bayern
> > Stefanie Tafelmeier
> > Bereich Energiespeicherung/Division Energy Storage
> > Thermische Energiespeicher/Thermal Energy Storage
> > Walther-Meißner-Str. 6
> > 85748 Garching
> >
> > Tel.: +49 89 329442-75
> > Fax: +49 89 329442-12
> > Stefanie.tafelmeier at zae-bayern.de
> > http://www.zae-bayern.de
> >
> >
> > ZAE Bayern - Bayerisches Zentrum für Angewandte Energieforschung e. V.
> > Vorstand/Board:
> > Prof. Dr. Hartmut Spliethoff (Vorsitzender/Chairman),
> > Prof. Dr. Vladimir Dyakonov
> > Sitz/Registered Office: Würzburg
> > Registergericht/Register Court: Amtsgericht Würzburg
> > Registernummer/Register Number: VR 1386
> >
> > Any declarations of intent, such as quotations, orders, applications and contracts, are legally binding for ZAE Bayern only if expressed in a written and duly signed form. This e-mail is intended solely for use by the recipient(s) named above. Any unauthorised disclosure, use or dissemination, whether in whole or in part, is prohibited. If you have received this e-mail in error, please notify the sender immediately and delete this e-mail.
> >
> >
> > --
> > Gromacs Users mailing list
> >
> > * Please search the archive at http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
> >
> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >
> > * For (un)subscribe requests visit
> > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a mail to gmx-users-request at gromacs.org.