[gmx-developers] cudaStreamSynchronize failed in cu_blockwait_nb

Berk Hess hess at kth.se
Mon Oct 22 17:50:46 CEST 2012


On 10/22/2012 05:47 PM, Shirts, Michael (mrs5pt) wrote:
>> This was just supposed to be a fast test system; then I must have forgotten to
>> switch back to PME - triggering the fatal error. We do not use plain cutoff for
>> serious things :)
> But it is good practice that when things fail, even with silly parameter choices,
> they fail gracefully, as that helps find OTHER bugs.
This was supposed to work, not fail and even hang the driver (and the 
fix is submitted).
I was thinking again about changing cut-off electrostatics from a 
warning to a note,
as some people still seem to be using it. But I guess there could be 
valid uses of it.

Cheers,

Berk
>
> Best,
> ~~~~~~~~~~~~
> Michael Shirts
> Assistant Professor
> Department of Chemical Engineering
> University of Virginia
> michael.shirts at virginia.edu
> (434)-243-1821
>
>
>> From: Carsten Kutzner <ckutzne at gwdg.de>
>> Reply-To: Discussion list for GROMACS development <gmx-developers at gromacs.org>
>> Date: Mon, 22 Oct 2012 17:35:03 +0200
>> To: Discussion list for GROMACS development <gmx-developers at gromacs.org>
>> Subject: Re: [gmx-developers] cudaStreamSynchronize failed in cu_blockwait_nb
>>
>> On Oct 22, 2012, at 5:25 PM, Berk Hess <hess at kth.se> wrote:
>>
>>> Just curious, why are you running plain cut-off?
>> This was just supposed to be a fast test system; then I must have forgotten to
>> switch back to PME - triggering the fatal error. We do not use plain cutoff for
>> serious things :)
>>
>> Carsten
>>
>>> (I didn't even make CPU kernels for that; the RF kernels are used instead)
>>>
>>> Cheers,
>>>
>>> Berk
>>>
>>> On 10/22/2012 05:23 PM, Carsten Kutzner wrote:
>>>> Hi Szilárd,
>>>>
>>>> thanks a lot for fixing it!
>>>>
>>>> Carsten
>>>>
>>>>
>>>> On Oct 22, 2012, at 5:20 PM, Szilárd Páll <szilard.pall at cbr.su.se> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> The CUDA plain cut-off kernel's pointer was incorrectly assigned (stupid
>>>>> copy-paste bug). Just pushed a bugfix: https://gerrit.gromacs.org/#/c/1553/
>>>>>
>>>>> Cheers,
>>>>> --
>>>>> Szilárd
>>>>>
>>>>>
>>>>> On Fri, Oct 19, 2012 at 3:20 PM, Szilárd Páll <szilard.pall at cbr.su.se>
>>>>> wrote:
>>>>> Hi,
>>>>>
>>>>> That sounds like a nasty bug that I have not seen for quite a while. This
>>>>> happens generally when some serious memory corruption puts the GPU in a
>>>>> "bad state". For the future, you could try to reset the GPU by reloading
>>>>> the driver, but if that does not help you will have to reboot.
>>>>>
>>>>> I was able to reproduce the bug and in fact on our development machine the
>>>>> NVIDIA driver seems to get into a messed up state in which mdrun will hang,
>>>>> no matter whether I launch it on the GTX 580 or 680. Reloading the driver
>>>>> seems to fix this issue.
>>>>>
>>>>> Thanks for the report, I'll look into this bug and will give you an
>>>>> update!
>>>>>
>>>>> Cheers,
>>>>> --
>>>>> Szilárd
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Oct 19, 2012 at 12:01 PM, Carsten Kutzner <ckutzne at gwdg.de> wrote:
>>>>> Hi,
>>>>>
>>>>> we updated to the newest driver, but later I found that this crash is
>>>>> caused by
>>>>> a .tpr file with Coulomb-type=cutoff instead of PME:
>>>>>
>>>>> - I start with a PME .tpr file that runs with the recent 4.6 on both a
>>>>> GTX580 and 680,
>>>>>    and even using both
>>>>> - I change to cutoff setting (no other changes!); this tpr still runs on
>>>>> the 580,
>>>>>    but on the 680 produces the fatal error:
>>>>>    "cudaStreamSynchronize failed in cu_blockwait_nb: unspecified launch
>>>>> failure"
>>>>>    Moreover, after that any other mdrun using any GPU on that node will read
>>>>> in the
>>>>>    (previously working, PME) .tpr file and then hang. After rebooting, I can
>>>>> again
>>>>>    run the PME .tpr file.
>>>>>
>>>>> Carsten
>>>>>
>>>>>
>>>>>
>>>>> On Oct 17, 2012, at 3:10 PM, Szilárd Páll <szilard.pall at cbr.su.se> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Your driver might be simply too old for a GTX680. You'll need at least a
>>>>>> very late 295.xx driver and preferably the 304.54 (or later).
>>>>>>
>>>>>> Cheers,
>>>>>> --
>>>>>> Szilárd
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 17, 2012 at 2:10 PM, Carsten Kutzner <ckutzne at gwdg.de> wrote:
>>>>>> BTW this executable works on a GTX580, but shows the fatal error
>>>>>> on a GTX680 - both mounted in the same workstation.
>>>>>>
>>>>>> Carsten
>>>>>>
>>>>>>
>>>>>> On Oct 17, 2012, at 12:05 PM, Carsten Kutzner <ckutzne at gwdg.de> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> what am I doing wrong if I get this error code:
>>>>>>>
>>>>>>> -------------------------------------------------------
>>>>>>> Program mdrun_threads, VERSION 4.6-dev-20121016-4af4561
>>>>>>> Source code file:
>>>>>>> /home/ckutzne/installations/git-gromacs-4-6-department/src/mdlib/nbnxn_cuda/nbnxn_cuda.cu, line: 558
>>>>>>>
>>>>>>> Fatal error:
>>>>>>> cudaStreamSynchronize failed in cu_blockwait_nb: unspecified launch
>>>>>>> failure
>>>>>>>
>>>>>>> For more information and tips for troubleshooting, please check the
>>>>>>> GROMACS
>>>>>>> website at http://www.gromacs.org/Documentation/Errors
>>>>>>> -------------------------------------------------------
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Carsten
>>>>>>> --
>>>>>>> gmx-developers mailing list
>>>>>>> gmx-developers at gromacs.org
>>>>>>> http://lists.gromacs.org/mailman/listinfo/gmx-developers
>>>>>>> Please don't post (un)subscribe requests to the list. Use the www
>>>>>>> interface or send it to gmx-developers-request at gromacs.org.
>>>>>> --
>>>>>> Dr. Carsten Kutzner
>>>>>> Max Planck Institute for Biophysical Chemistry
>>>>>> Theoretical and Computational Biophysics
>>>>>> Am Fassberg 11, 37077 Goettingen, Germany
>>>>>> Tel. +49-551-2012313, Fax: +49-551-2012302
>>>>>> http://www.mpibpc.mpg.de/grubmueller/kutzner
>>>>>>
>>
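Szilárd's diagnosis above is that the pointer for the CUDA plain cut-off kernel was assigned incorrectly through a copy-paste slip. The thread does not show the actual change, so the following is only a minimal host-side C++ sketch of that class of bug and of a start-up check that would catch it. All names here (ElecType, kernel_table, select_kernel, kernel_table_is_consistent) are hypothetical and are not taken from the GROMACS source; the "kernels" are plain functions that return their own name so a mis-wired slot is detectable without a GPU.

```cpp
#include <cassert>
#include <string>

// Hypothetical electrostatics types; the real GROMACS enum differs.
enum ElecType { ELEC_CUT = 0, ELEC_RF, ELEC_EWALD, ELEC_NR };

// Stand-in for a kernel function pointer. Each stand-in kernel returns
// its own name, which lets the table be checked slot by slot.
using KernelFn = const char* (*)();

const char* kernel_cut()   { return "cut"; }
const char* kernel_rf()    { return "rf"; }
const char* kernel_ewald() { return "ewald"; }

// One entry per ElecType, in enum order. A copy-paste slip in this
// initializer (for example repeating kernel_rf in the first slot) is
// the kind of mistake the check below is meant to detect.
static const KernelFn kernel_table[ELEC_NR] = {
    kernel_cut,    // ELEC_CUT
    kernel_rf,     // ELEC_RF
    kernel_ewald   // ELEC_EWALD
};

// Look up the kernel for a given electrostatics type.
KernelFn select_kernel(ElecType t) {
    assert(t >= 0 && t < ELEC_NR);
    return kernel_table[t];
}

// Start-up self-check: every slot must be non-null and must hold the
// kernel that belongs to its index.
bool kernel_table_is_consistent() {
    static const char* const expected[ELEC_NR] = { "cut", "rf", "ewald" };
    for (int t = 0; t < ELEC_NR; ++t) {
        if (kernel_table[t] == nullptr) return false;
        if (std::string(kernel_table[t]()) != expected[t]) return false;
    }
    return true;
}
```

Calling kernel_table_is_consistent() once before the first launch turns a wrong table entry into an immediate, readable failure on the host instead of an "unspecified launch failure" reported later at stream synchronization. Real GPU kernels cannot be called for their name like this, so an actual check would need some other tag per entry.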
>>
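Berk mentions above that plain cut-off has no CPU kernel of its own and that the RF kernels are used for it. One way to see why that can work is the textbook reaction-field pair term, V(r) proportional to 1/r + k_rf r^2 - c_rf, with k_rf = (eps_rf - eps_r) / ((2 eps_rf + eps_r) rc^3) and c_rf = 1/rc + k_rf rc^2. If eps_rf equals eps_r, k_rf is zero and what remains is a Coulomb term cut at rc and shifted by the constant 1/rc. The C++ sketch below only illustrates that arithmetic; the names RfParams, rf_params and rf_potential are made up for this example, and whether the 4.6 kernels set their constants exactly this way is an assumption, not something stated in the thread.

```cpp
#include <cassert>
#include <cmath>

// Reaction-field constants for a cut-off rc, with relative dielectric
// eps_r inside the cut-off and eps_rf beyond it. Illustrative names.
struct RfParams {
    double k_rf;
    double c_rf;
};

// Textbook reaction-field constants. The special convention of passing
// eps_rf = 0 to mean "infinite" is deliberately not handled here.
RfParams rf_params(double eps_r, double eps_rf, double rc) {
    const double k = (eps_rf - eps_r) / ((2.0 * eps_rf + eps_r) * rc * rc * rc);
    const double c = 1.0 / rc + k * rc * rc;
    return RfParams{k, c};
}

// Pair potential in units of f*qi*qj/eps_r; zero at and beyond rc.
double rf_potential(double r, double rc, const RfParams& p) {
    if (r >= rc) return 0.0;
    return 1.0 / r + p.k_rf * r * r - p.c_rf;
}
```

With rf_params(1.0, 1.0, rc) the quadratic term vanishes, so a kernel written for reaction-field evaluates a plain cut-off interaction with no extra branch; the constant shift changes the energy but not the force inside the cut-off.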



