[gmx-users] [gmx-developers] Fatal error: cudaStreamSynchronize failed in cu_blockwait_nb

Szilárd Páll pall.szilard at gmail.com
Thu Jan 30 16:47:02 CET 2014


On Thu, Jan 30, 2014 at 4:19 PM, AOWI (Anders Ossowicki)
<AOWI at novozymes.com> wrote:
>> Well, with a 24k system a single iteration can be done in 2-3 ms, so those 3.3 seconds are mostly initialization and some number of steps - could be one, ten, or even a hundred.
> Sure, but it fails even with -nsteps 1.
>
>> That doesn't tell us much; could you add -g to the CXX flags?
> Same thing:

With -g there should be line numbers in the backtrace below, and
perhaps a bit more information on what's causing the error; at least
that's what I was hoping for.

One other thing you could try is to set "coulombtype = reaction-field"
in the mdp file and re-generate the tpr. These runs will use a
different CUDA kernel. Just guessing, it may not make much difference
at all.
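
For reference, a minimal sketch of that change, assuming the run
currently uses PME electrostatics. The file names in the comment
(md.mdp, conf.gro, topol.top, topol.tpr) are placeholders rather than
anything taken from your setup, and grompp may ask for other settings
to be adjusted to match:

```
; md.mdp (fragment): only the electrostatics setting changes
coulombtype = reaction-field   ; presumably PME before this change

; then re-generate the run input, e.g. with placeholder file names:
;   grompp -f md.mdp -c conf.gro -p topol.top -o topol.tpr
```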

> starting mdrun 'RNASE ZF-1A in water'
> 1 steps,      0.0 ps.
> ========= Program hit error 4 on CUDA API call to cudaStreamSynchronize
> =========     Saved host backtrace up to driver entry point at error
> =========     Host Frame:/usr/lib/nvidia-current/libcuda.so [0x26d660]
> =========     Host Frame:/usr/local/cuda-5.5/lib64/libcudart.so.5.5 (cudaStreamSynchronize + 0x15e) [0x36f5e]
> =========     Host Frame:/usr/bin/../lib/libmd.so.8 (nbnxn_cuda_wait_gpu + 0x222) [0xd45ab5]
> =========     Host Frame:/usr/bin/../lib/libmd.so.8 (do_force_cutsVERLET + 0x1d20) [0xc287a5]
> =========     Host Frame:/usr/bin/../lib/libmd.so.8 (do_force + 0x15d) [0xc2a986]
> =========     Host Frame:mdrun (do_md + 0x3cd4) [0x2450e]
> =========     Host Frame:mdrun (mdrunner + 0x1f14) [0x11b50]
> =========     Host Frame:mdrun (cmain + 0x1dee) [0x2a57d]
> =========     Host Frame:mdrun (main + 0x20) [0x31c18]
> =========     Host Frame:/lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main + 0xed) [0x2176d]
> =========     Host Frame:mdrun [0x75e9]
> =========
>
>>>> - Could you try running with the GMX_EMULATE_GPU environment variable set? This runs the GPU acceleration code path but uses CPU kernels (equivalent to the CUDA kernels, but slow).
>>> This seems to run correctly.
>> Does correctly mean that you've checked the results or that it completed without a crash?
> Just the latter.
>
> --
> Anders Ossowicki
>
>

