[gmx-users] [gmx-developers] Fatal error: cudaStreamSynchronize failed in cu_blockwait_nb
AOWI (Anders Ossowicki)
AOWI at novozymes.com
Thu Jan 30 11:49:23 CET 2014
Thanks for your suggestions!
> I would not make any assumptions though, but rather try a few things first:
> - Does the card pass a memtest (sourceforge.net/projects/cudagpumemtest/)?
The memtest ran for about an hour with no errors.
> - Does the installation pass the regressiontests?
No. These four complex tests fail, all with the usual error:
FAILED. Check mdrun.out, md.log files in nbnxn_pme
FAILED. Check mdrun.out, md.log files in nbnxn_rf
FAILED. Check mdrun.out, md.log files in nbnxn_rzero
FAILED. Check mdrun.out, md.log files in nbnxn_vsite
Everything else passes.
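All four failing names share the nbnxn_ prefix, and that pattern can be checked mechanically against a saved test log. The following is only a minimal sketch: it assumes the harness prints failures in exactly the format quoted above, and the helper name failed_tests is hypothetical, not part of the Gromacs regression test suite.

```python
import re

# Pattern for the failure lines quoted in this post. The format is an
# assumption taken from those four lines only, not from the test harness source.
FAILED_RE = re.compile(r"^FAILED\. Check .* files in (\S+)$")


def failed_tests(output):
    """Return the test directory names from FAILED lines, in order of appearance."""
    names = []
    for line in output.splitlines():
        match = FAILED_RE.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


sample = """\
FAILED. Check mdrun.out, md.log files in nbnxn_pme
FAILED. Check mdrun.out, md.log files in nbnxn_rf
FAILED. Check mdrun.out, md.log files in nbnxn_rzero
FAILED. Check mdrun.out, md.log files in nbnxn_vsite
"""

failed = failed_tests(sample)
# True only if every failing test directory starts with "nbnxn_".
all_nbnxn = all(name.startswith("nbnxn_") for name in failed)
print(failed, all_nbnxn)
```

Run against the full log, a result of True for all_nbnxn would confirm that nothing outside the nbnxn_ group failed.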
> - Is the error reproducible with other inputs?
Yes, so far anything that has caused Gromacs to engage the GPU has failed. Our own runs, the samples from the Gromacs website, and the four tests above.
> Also note that with the default invocation of mdrun you are attempting to use all cores/hardware threads in your machine (I assume a 2x12-core IVB-E node with HT on).
Yes, two Xeon E5-2697 v2 processors. This is a test server for gauging the potential performance gains of GPGPU with our own runs. We'll stick to a proper CPU-GPU ratio for the performance measurements. This was just me trying to pare it down to the simplest invocation.
We have had no trouble using other CUDA-enabled tools on this particular test server. NAMD, for example, works fine.