[gmx-users] jwe1050i + jwe0019i errors = SIGSEGV (Fujitsu)

James jamesresearching at gmail.com
Sat Sep 21 14:45:59 CEST 2013


Dear Mark and the rest of the Gromacs team,

Thanks a lot for your response. I have been in discussion with the support
staff, who suggested that it may be a bug in the Gromacs code, and I have
tried to isolate the problem more precisely.

The calculation is run under MPI with 16 OpenMP threads per MPI node, and
the error seems to occur under the following conditions:

A few thousand atoms: 1 or 2 MPI nodes: OK
Double the number of atoms (~15,000): 1 MPI node: OK; 2 MPI nodes: SIGSEGV
(the error described below)

So it seems that the error occurs only for relatively large systems that
are run on more than one MPI node.
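
The comparison above can be written as a small dry-run script. This is only
a sketch: small.tpr and large.tpr are hypothetical names for the two test
systems, and it assumes one MPI process per node and that mpiexec on this
machine accepts -n for the number of MPI processes (on the Fujitsu system
the process count may instead come from the job script). It prints the
commands rather than running them.

```shell
#!/bin/sh
# Dry run: print one mdrun command per (system, MPI process count) pair.
# small.tpr / large.tpr are hypothetical names for the few-thousand-atom
# and ~15,000-atom systems; nothing is executed here.
print_test_matrix() {
    for tpr in small.tpr large.tpr; do
        for nproc in 1 2; do
            # -ntomp 16 matches the 16 OpenMP threads used per MPI process.
            echo "mpiexec -n $nproc mdrun_mpi_d -ntomp 16 -s $tpr -deffnm ${tpr%.tpr}_np$nproc -v"
        done
    done
}

print_test_matrix
```

Of the four combinations, only the last one (large system, 2 MPI nodes)
gives the SIGSEGV here.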

The crash mentions the "calc_cell_indices" function (see below). Could this
be a problem with insufficient memory at the MPI interface in this
function? I'm not sure how to proceed further. Any help would be greatly
appreciated.

Gromacs version is 4.6.3.

Thank you very much for your time.

James


On 4 September 2013 16:05, Mark Abraham <mark.j.abraham at gmail.com> wrote:

> On Sep 4, 2013 7:59 AM, "James" <jamesresearching at gmail.com> wrote:
> >
> > Dear all,
> >
> > I'm trying to run Gromacs on a Fujitsu supercomputer but the software is
> > crashing.
> >
> > I run grompp:
> >
> > grompp_mpi_d -f parameters.mdp -c system.pdb -p overthe.top
> >
> > and it produces the error:
> >
> > jwe1050i-w The hardware barrier couldn't be used and continues processing
> > using the software barrier.
> > taken to (standard) corrective action, execution continuing.
> > error summary (Fortran)
> > error number error level error count
> > jwe1050i w 1
> > total error count = 1
> >
> > but still outputs topol.tpr so I can continue.
>
> There's no value in compiling grompp with MPI or in double precision.
>
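
For reference, a separate non-MPI, single-precision build for grompp could
be configured roughly as follows. This is a sketch under assumptions, not
the command that was actually used: GMX_MPI and GMX_DOUBLE are the GROMACS
4.6 CMake options for MPI and double precision, while the source and
install paths are hypothetical placeholders. Compiler and toolchain settings
for the target machine are omitted. The script prints the command instead
of running it.

```shell
#!/bin/sh
# Hypothetical paths; adjust to the actual source tree and install location.
SRC_DIR="$HOME/gromacs-4.6.3"
PREFIX="$HOME/Gromacs463_serial"

print_cmake_command() {
    # GMX_MPI=OFF: build without MPI; GMX_DOUBLE=OFF: single precision.
    echo "cmake $SRC_DIR -DGMX_MPI=OFF -DGMX_DOUBLE=OFF -DCMAKE_INSTALL_PREFIX=$PREFIX"
}

print_cmake_command
```
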
> > I then run with
> >
> > export FLIB_FASTOMP=FALSE
> > source /home/username/Gromacs463/bin/GMXRC.bash
> > mpiexec mdrun_mpi_d -ntomp 16 -v
> >
> > but it crashes:
> >
> > starting mdrun 'testrun'
> > 50000 steps, 100.0 ps.
> > jwe0019i-u The program was terminated abnormally with signal number SIGSEGV.
> > signal identifier = SEGV_MAPERR, address not mapped to object
> > error occurs at calc_cell_indices._OMP_1 loc 0000000000233474 offset
> > 00000000000003b4
> > calc_cell_indices._OMP_1 at loc 00000000002330c0 called from loc
> > ffffffff02088fa0 in start_thread
> > start_thread at loc ffffffff02088e4c called from loc ffffffff029d19b4 in
> > __thread_start
> > __thread_start at loc ffffffff029d1988 called from o.s.
> > error summary (Fortran)
> > error number error level error count
> > jwe0019i u 1
> > jwe1050i w 1
> > total error count = 2
> > [ERR.] PLE 0014 plexec The process terminated abnormally.(rank=1)(nid=0x03060006)(exitstatus=240)(CODE=2002,1966080,61440)
> > [ERR.] PLE The program that the user specified may be illegal or
> > inaccessible on the node.(nid=0x03060006)
> >
> > Any ideas what could be wrong? It works on my local intel machine.
>
> Looks like it wasn't compiled correctly for the target machine. What was
> the cmake command, and what does mdrun -version output? Also, if this is
> the K computer, probably we can't help, because the compiler docs are
> officially unavailable to us. National secret, and all ;-)
>
> Mark
>
> >
> > Thanks in advance,
> >
> > James
> > --
> > gmx-users mailing list    gmx-users at gromacs.org
> > http://lists.gromacs.org/mailman/listinfo/gmx-users
> > * Please search the archive at
> > http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> > * Please don't post (un)subscribe requests to the list. Use the
> > www interface or send it to gmx-users-request at gromacs.org.
> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists


