[gmx-users] segfault on Gromacs 4.6.3 (cuda)

Guanglei Cui amber.mail.archive at gmail.com
Wed Sep 11 22:32:10 CEST 2013


I appreciate your comments/suggestions, Mark. But trust me ... everything
may become "rocket science" when you work for a company with ~10,000
employees.

Cheers,


On Wed, Sep 11, 2013 at 3:23 AM, Mark Abraham <mark.j.abraham at gmail.com> wrote:

> Hi,
>
> There's simply no way you can expect compilers more than three years
> old (icc 11, gcc 4.4) to work seamlessly and produce high performance
> with brand-new hardware and code. That's like working on a Formula One
> car with a flint axe. Installing a new compiler is not rocket science!
>
> Mark
>
>
> On Tue, Sep 10, 2013 at 2:03 AM, Guanglei Cui
> <amber.mail.archive at gmail.com> wrote:
> > Hi Szilard,
> >
> > Thanks again for getting back. You may remember the previous thread I
> > started on regression test failures with a binary compiled by icc 11.x.
> > Falling back to SSE2 is my solution, and binaries compiled this way are
> > able to pass all regression tests, including the one with GPU switched
> > on. However, it is not clear to me if the GPU part is specifically tested
> > in the regression tests.
> >
> > As I was trying to explain in the original email, the binary works fine
> > on a node with a proper graphics driver, but crashes on a node where the
> > graphics driver is older than the CUDA SDK used in compilation. I think
> > updating the driver may potentially enable the GPU part. Pure CPU
> > calculation with the same binary does not seem to work. It is not clear
> > to me if this is caused by the compiler. It's not really simple to update
> > gcc to 4.7 or greater since we use CentOS 5.x in the company. Even
> > CentOS 6.x uses gcc 4.4.x as default.
> >
> > I've just tested the code with -nb cpu. It still crashes. The binary
> > compiled without GPU works as expected and passed all regression tests.
> > For now, I can keep separate binaries for GPU and CPU applications until
> > I can get gcc 4.7 or greater installed.
> >
> > Best regards,
> > Guanglei
> >
> >
> > On Mon, Sep 9, 2013 at 4:35 PM, Szilárd Páll <szilard.pall at cbr.su.se> wrote:
> >
> >> Hi,
> >>
> >> First of all, icc 11 is not well tested and there have been reports
> >> about it compiling broken code. This could explain the crash, but
> >> you'd need to do a bit more testing to confirm. Regarding the GPU
> >> detection error: it occurs if you use a driver which is incompatible
> >> with the CUDA runtime (the driver must support at least as high an API
> >> version; see the mdrun log header's last two lines), and at the moment
> >> some such cases are not detected particularly gracefully.
> >>
> >> A few things to try:
> >> - use gcc; 4.7 is as fast as or faster than any icc;
> >> - run with the "-nb cpu" option; does it still crash?
> >> - run with GPU detection completely disabled*
> >> - run the regressiontests; try using CPUs only*
> >>
> >> *You can set the GMX_DISABLE_GPU_DETECTION environment variable to
> >> completely disable the GPU detection.
> >>
> >> Cheers,
> >> --
> >> Szilárd
> >>
> >>
> >> On Mon, Sep 9, 2013 at 9:52 PM, Guanglei Cui
> >> <amber.mail.archive at gmail.com> wrote:
> >> > Dear GMX users,
> >> >
> >> > I recently compiled Gromacs 4.6.3 with CUDA (Intel compiler 11.x, SSE2,
> >> > and CUDA SDK 5.0.35). I was doing a test run with simply 'mdrun -deffnm
> >> > eq2_npt_verlet' (letting mdrun figure out what to use). I received the
> >> > error telling me my graphics driver was older than the CUDA SDK, and
> >> > regular CPU code would be used instead. Then, it crashed with
> >> > Segmentation Fault. The code runs properly on another node where the
> >> > graphics driver is more up to date. I wonder if the crashing is somewhat
> >> > expected, and therefore I should prepare different binaries based on the
> >> > capabilities of different nodes. Thanks.
> >> >
> >> > Best regards,
> >> > --
> >> > Guanglei Cui
> >> > --
> >> > gmx-users mailing list    gmx-users at gromacs.org
> >> > http://lists.gromacs.org/mailman/listinfo/gmx-users
> >> > * Please search the archive at
> >> > http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> >> > * Please don't post (un)subscribe requests to the list. Use the
> >> > www interface or send it to gmx-users-request at gromacs.org.
> >> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >>
> >
> >
> >
> > --
> > Guanglei Cui
>



-- 
Guanglei Cui
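
The CPU-only checks suggested in Szilárd's quoted message (the "-nb cpu"
option and the GMX_DISABLE_GPU_DETECTION environment variable) can be
sketched as a small launcher helper. This is only a sketch under stated
assumptions: the helper name build_mdrun_launch and its parameters are
hypothetical, the value "1" for the variable is an assumption (the thread
only says to set it), and the run name eq2_npt_verlet is taken from the
original post. The helper only builds the command line and environment; it
does not start mdrun.

```python
import os


def build_mdrun_launch(deffnm, use_gpu, base_env=None):
    """Return (argv, env) for an mdrun run without executing anything.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Copy the environment so the caller's mapping is never modified.
    env = dict(os.environ if base_env is None else base_env)
    argv = ["mdrun", "-deffnm", deffnm]
    if not use_gpu:
        # Request CPU nonbonded kernels, as suggested in the thread.
        argv += ["-nb", "cpu"]
        # Assumption: any value works; the thread only says to set it.
        env["GMX_DISABLE_GPU_DETECTION"] = "1"
    return argv, env


# Example: CPU-only settings for a node with an outdated graphics driver.
cpu_argv, cpu_env = build_mdrun_launch("eq2_npt_verlet", use_gpu=False)
```

The resulting pair could then be handed to subprocess.run(cpu_argv,
env=cpu_env) on the node in question. Note that in this thread the
GPU-enabled binary still crashed with "-nb cpu", so this is a way to run the
suggested checks, not a confirmed fix.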


