[gmx-users] [gmx-developers] Fatal error: cudaStreamSynchronize failed in cu_blockwait_nb
pall.szilard at gmail.com
Thu Jan 30 16:47:02 CET 2014
On Thu, Jan 30, 2014 at 4:19 PM, AOWI (Anders Ossowicki)
<AOWI at novozymes.com> wrote:
>> Well, with a 24k system a single iteration can be done in 2-3 ms, so those 3.3 seconds are mostly initialization and some number of steps - could be one, ten, or even a hundred.
> Sure, but it fails even with -nsteps 1.
>> That doesn't tell us much; could you add -g to the CXX flags?
> Same thing:
There should be line numbers below - and perhaps a bit more
information on what's causing the error - at least that's what I'm
One other thing you could try is to set "coulombtype = reaction-field"
in the mdp file and re-generate the tpr. These runs will use a
different CUDA kernel. Just guessing, it may not make much difference.
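
As a rough sketch of the reaction-field test suggested above: the file names (md.mdp, conf.gro, topol.top, the *_rf outputs) and the helper set_reaction_field are made up for illustration, and grompp/mdrun are assumed to be the 4.6-style tools on your PATH. Adjust to your own setup.

```shell
# Hypothetical helper: write a copy of an mdp file with
# coulombtype forced to reaction-field.
set_reaction_field() {
    # $1 = input mdp, $2 = output mdp
    if grep -q '^[[:space:]]*coulombtype[[:space:]]*=' "$1"; then
        # Replace the existing coulombtype line.
        sed 's/^[[:space:]]*coulombtype[[:space:]]*=.*/coulombtype = reaction-field/' "$1" > "$2"
    else
        # No coulombtype line present: append one.
        { cat "$1"; echo 'coulombtype = reaction-field'; } > "$2"
    fi
}

# Only run the GROMACS steps if the tools and the (assumed) inputs exist.
if command -v grompp >/dev/null 2>&1 && [ -f md.mdp ] && [ -f conf.gro ] && [ -f topol.top ]; then
    set_reaction_field md.mdp md_rf.mdp
    # Re-generate the tpr from the modified mdp, then do a one-step run.
    grompp -f md_rf.mdp -c conf.gro -p topol.top -o topol_rf.tpr
    mdrun -s topol_rf.tpr -nsteps 1
fi
```

If the one-step run still fails with the reaction-field tpr, the problem is probably not specific to the Ewald kernel.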
> starting mdrun 'RNASE ZF-1A in water'
> 1 steps, 0.0 ps.
> ========= Program hit error 4 on CUDA API call to cudaStreamSynchronize
> ========= Saved host backtrace up to driver entry point at error
> ========= Host Frame:/usr/lib/nvidia-current/libcuda.so [0x26d660]
> ========= Host Frame:/usr/local/cuda-5.5/lib64/libcudart.so.5.5 (cudaStreamSynchronize + 0x15e) [0x36f5e]
> ========= Host Frame:/usr/bin/../lib/libmd.so.8 (nbnxn_cuda_wait_gpu + 0x222) [0xd45ab5]
> ========= Host Frame:/usr/bin/../lib/libmd.so.8 (do_force_cutsVERLET + 0x1d20) [0xc287a5]
> ========= Host Frame:/usr/bin/../lib/libmd.so.8 (do_force + 0x15d) [0xc2a986]
> ========= Host Frame:mdrun (do_md + 0x3cd4) [0x2450e]
> ========= Host Frame:mdrun (mdrunner + 0x1f14) [0x11b50]
> ========= Host Frame:mdrun (cmain + 0x1dee) [0x2a57d]
> ========= Host Frame:mdrun (main + 0x20) [0x31c18]
> ========= Host Frame:/lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main + 0xed) [0x2176d]
> ========= Host Frame:mdrun [0x75e9]
>>>> - Try running with the GMX_EMULATE_GPU env. var. set? This will run the GPU acceleration code-path, but will use CPU kernels (equivalent to the CUDA kernels, but slow).
>>> This seems to run correctly.
>> Does correctly mean that you've checked the results or that it completed without a crash?
> Just the latter.
> Anders Ossowicki