[gmx-developers] cudaStreamSynchronize failed in cu_blockwait_nb

Carsten Kutzner ckutzne at gwdg.de
Mon Oct 22 17:23:21 CEST 2012


Hi Szilárd,

thanks a lot for fixing it!

Carsten


On Oct 22, 2012, at 5:20 PM, Szilárd Páll <szilard.pall at cbr.su.se> wrote:

> Hi,
> 
> The CUDA plain cut-off kernel's pointer was incorrectly assigned (stupid copy-paste bug). Just pushed a bugfix: https://gerrit.gromacs.org/#/c/1553/
> 
> Cheers,
> --
> Szilárd
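
A minimal sketch of how a copy-paste slip of the kind described above can occur in a kernel dispatch table, and how a simple consistency check could catch it. This is not the GROMACS code or the actual fix: the enum, the function names, and the table layout are all hypothetical, and plain host functions stand in for the CUDA kernels so that the example can run without a GPU.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical electrostatics flavours; NOT the enum used in GROMACS.
enum class ElecType : std::size_t { Cutoff = 0, ReactionField, Ewald, Count };

constexpr std::size_t idx(ElecType e) { return static_cast<std::size_t>(e); }
constexpr std::size_t kNumTypes = idx(ElecType::Count);

// Host-side stand-ins for GPU kernels: each reports which flavour it implements.
using KernelFn = std::string (*)();
std::string kernel_cutoff() { return "cutoff"; }
std::string kernel_rf()     { return "reaction-field"; }
std::string kernel_ewald()  { return "ewald"; }

using KernelTable = std::array<KernelFn, kNumTypes>;

// Dispatch table with a copy-paste slip: the Cutoff slot points at the
// reaction-field kernel because the line was duplicated and not fully edited.
KernelTable make_buggy_table()
{
    KernelTable t{};
    t[idx(ElecType::Cutoff)]        = kernel_rf;     // BUG: should be kernel_cutoff
    t[idx(ElecType::ReactionField)] = kernel_rf;
    t[idx(ElecType::Ewald)]         = kernel_ewald;
    return t;
}

// The same table with every slot pointing at its matching kernel.
KernelTable make_fixed_table()
{
    KernelTable t{};
    t[idx(ElecType::Cutoff)]        = kernel_cutoff;
    t[idx(ElecType::ReactionField)] = kernel_rf;
    t[idx(ElecType::Ewald)]         = kernel_ewald;
    return t;
}

// Look up the kernel for a given flavour; an empty slot is a programming error.
KernelFn select_kernel(const KernelTable& t, ElecType e)
{
    KernelFn fn = t[idx(e)];
    assert(fn != nullptr);
    return fn;
}

// Returns true only if every slot holds the kernel that reports the expected flavour.
bool table_is_consistent(const KernelTable& t)
{
    static const std::array<std::string, kNumTypes> expected = {
        "cutoff", "reaction-field", "ewald"};
    for (std::size_t i = 0; i < kNumTypes; ++i)
    {
        if (t[i] == nullptr || t[i]() != expected[i])
        {
            return false;
        }
    }
    return true;
}
```

With these hypothetical tables, table_is_consistent(make_buggy_table()) returns false and table_is_consistent(make_fixed_table()) returns true. In a real GPU code base the wrong entry can be harder to spot, because a mismatched kernel may run correctly on one architecture and fail on another.
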
> 
> 
> On Fri, Oct 19, 2012 at 3:20 PM, Szilárd Páll <szilard.pall at cbr.su.se> wrote:
> Hi,
> 
> That sounds like a nasty bug that I have not seen for quite a while. This generally happens when some serious memory corruption puts the GPU in a "bad state". For the future, you could try to reset the GPU by reloading the driver, but if that does not help you will have to reboot.
> 
> I was able to reproduce the bug, and in fact on our development machine the NVIDIA driver seems to get into a messed-up state in which mdrun will hang, no matter whether I launch it on the GTX 580 or 680. Reloading the driver seems to fix this issue.
> 
> Thanks for the report, I'll look into this bug and will give you an update!
> 
> Cheers,
> --
> Szilárd
> 
> 
> 
> On Fri, Oct 19, 2012 at 12:01 PM, Carsten Kutzner <ckutzne at gwdg.de> wrote:
> Hi,
> 
> we updated to the newest driver, but later I found that this crash is caused by
> a .tpr file with Coulomb-type=cutoff instead of PME:
> 
> - I start with a PME .tpr file that runs with the recent 4.6 on both a GTX580 and 680,
>   and even using both
> - I change to cutoff setting (no other changes!); this tpr still runs on the 580,
>   but on the 680 produces the fatal error:
>   "cudaStreamSynchronize failed in cu_blockwait_nb: unspecified launch failure"
>   Moreover, after that any other mdrun using any GPU on that node will read in the
>   (previously working, PME) .tpr file and then hang. After rebooting, I can again
>   run the PME .tpr file.
> 
> Carsten
> 
> 
> 
> On Oct 17, 2012, at 3:10 PM, Szilárd Páll <szilard.pall at cbr.su.se> wrote:
> 
> > Hi,
> >
> > Your driver might simply be too old for a GTX680. You'll need at least a very late 295.xx driver and preferably 304.54 (or later).
> >
> > Cheers,
> > --
> > Szilárd
> >
> >
> > On Wed, Oct 17, 2012 at 2:10 PM, Carsten Kutzner <ckutzne at gwdg.de> wrote:
> > BTW this executable works on a GTX580, but shows the fatal error
> > on a GTX680 - both mounted in the same workstation.
> >
> > Carsten
> >
> >
> > On Oct 17, 2012, at 12:05 PM, Carsten Kutzner <ckutzne at gwdg.de> wrote:
> >
> > > Hi,
> > >
> > > what am I doing wrong if I get this error code:
> > >
> > > -------------------------------------------------------
> > > Program mdrun_threads, VERSION 4.6-dev-20121016-4af4561
> > > Source code file: /home/ckutzne/installations/git-gromacs-4-6-department/src/mdlib/nbnxn_cuda/nbnxn_cuda.cu, line: 558
> > >
> > > Fatal error:
> > > cudaStreamSynchronize failed in cu_blockwait_nb: unspecified launch failure
> > >
> > > For more information and tips for troubleshooting, please check the GROMACS
> > > website at http://www.gromacs.org/Documentation/Errors
> > > -------------------------------------------------------
> > >
> > > Thanks,
> > > Carsten
> > > --
> > > gmx-developers mailing list
> > > gmx-developers at gromacs.org
> > > http://lists.gromacs.org/mailman/listinfo/gmx-developers
> > > Please don't post (un)subscribe requests to the list. Use the www interface or send it to gmx-developers-request at gromacs.org.
> >
> >
> > --
> > Dr. Carsten Kutzner
> > Max Planck Institute for Biophysical Chemistry
> > Theoretical and Computational Biophysics
> > Am Fassberg 11, 37077 Goettingen, Germany
> > Tel. +49-551-2012313, Fax: +49-551-2012302
> > http://www.mpibpc.mpg.de/grubmueller/kutzner
> >
> 
> 


--
Dr. Carsten Kutzner
Max Planck Institute for Biophysical Chemistry
Theoretical and Computational Biophysics
Am Fassberg 11, 37077 Goettingen, Germany
Tel. +49-551-2012313, Fax: +49-551-2012302
http://www.mpibpc.mpg.de/grubmueller/kutzner