[gmx-users] Re: Segmentation fault, mdrun_mpi

Ladasky blind.watchmaker at yahoo.com
Sun Oct 7 20:15:17 CEST 2012

Justin Lemkul wrote
> Random segmentation faults are really hard to debug.  Can you resume the
> run using a checkpoint file?  That would suggest maybe an MPI problem or
> something else external to Gromacs.  Without a reproducible system and a
> debugging backtrace, it's going to be hard to figure out where the problem
> is coming from.

Thanks for that tip, Justin.  I tried to resume one run which failed at 1.06
million cycles, and it WORKED.  It proceeded all the way to the 2.50 million
cycles that I designated.  I now have two separate .trr files, but I suppose
they can be merged.
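
For anyone following along, here is a rough sketch of the two command lines
involved: continuing from a checkpoint, and concatenating the two trajectory
pieces afterwards.  It only builds and prints the commands, it does not run
them.  The file names (topol.tpr, state.cpt, run_part1.trr, run_part2.trr,
run_merged.trr) and the helper function names are placeholders of mine, not
the files from this run.  The process count of five matches the number of
mdrun_mpi processes mentioned in this post.  The flags are the usual GROMACS
4.x ones (mdrun -s/-cpi, trjcat -f/-o), so check them against your installed
version.

```python
import shlex


def resume_command(tpr, checkpoint, n_procs):
    """Build an mpirun command that continues a run from a checkpoint file."""
    return ["mpirun", "-np", str(n_procs),
            "mdrun_mpi", "-s", tpr, "-cpi", checkpoint]


def merge_command(parts, merged):
    """Build a trjcat command that concatenates trajectory pieces in order."""
    return ["trjcat", "-f", *parts, "-o", merged]


# Placeholder file names; substitute the ones from your own run.
resume = resume_command("topol.tpr", "state.cpt", 5)
merge = merge_command(["run_part1.trr", "run_part2.trr"], "run_merged.trr")

print(shlex.join(resume))
print(shlex.join(merge))
```

Building the commands as argument lists keeps file names with spaces intact
if they are later passed to subprocess.run.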

I don't know yet whether my crashes are random.  I will try re-running that
simulation from time zero, to see whether it segfaults at the same
place.  If it doesn't, then I have a problem which may have nothing to do
with GROMACS.

I checked memory usage several times while mdrun_mpi was executing.
Overall, about 3 GB of my computer's 8 GB of RAM were in use.  As I
expected, GROMACS used very little of this.  The mpirun process used a
constant 708 KB.  I had five mdrun_mpi processes, all of which used slightly
more RAM as they worked, but I didn't notice anything which suggested a
gross memory leak.  The process which used the most RAM was using 14.4 MB
right after it started, rose to 15.9 MB within the first ten minutes or so,
and reached 16.0 MB after four hours.  The process which used the least RAM
started at 10.6 MB and finished at 10.8 MB.  Altogether, GROMACS was using
about 64 MB.
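
To put a number on "no gross memory leak", here is a back-of-the-envelope
calculation using the figures above for the largest process.  It assumes the
growth after the first ten minutes is linear, that all five processes grow at
that same rate, and that roughly 5 GB (8 GB total minus about 3 GB in use)
is available.  Those are simplifying assumptions for illustration, not
measurements.

```python
def growth_rate_mb_per_hour(mb_start, mb_end, hours):
    """Average memory growth in MB per hour between two readings."""
    return (mb_end - mb_start) / hours


# Readings for the largest mdrun_mpi process, taken from the text above.
early_mb = 15.9                  # about ten minutes after start
late_mb = 16.0                   # after four hours
elapsed_h = 4.0 - 10.0 / 60.0    # "ten minutes or so" is approximate

rate = growth_rate_mb_per_hour(early_mb, late_mb, elapsed_h)

# Assumptions: ~5 GB of free RAM, five processes all growing at this rate.
free_mb = 5 * 1024
n_procs = 5
hours_to_exhaust = free_mb / (n_procs * rate)

print(f"growth rate: {rate:.3f} MB/hour per process")
print(f"time to fill free RAM: {hours_to_exhaust / (24 * 365):.1f} years")
```

Under these assumptions the growth is a few hundredths of a MB per hour, so
it would take years to exhaust memory.  Growth this slow is very unlikely to
explain a crash a few hours into a run.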

I have a well-cooled CPU; core temperatures stay under 50 degrees when the
system is running under full load.  My system doesn't lock up or crash on
me.  I think that my hardware is good.

View this message in context: http://gromacs.5086.n6.nabble.com/Segmentation-fault-mdrun-mpi-tp5001601p5001760.html