[gmx-developers] Failure to extend orientation restraints simulations - do I need to report a bug or is this an installation error?

Bailey A. ab604 at soton.ac.uk
Mon Nov 25 13:08:28 CET 2013


Dear gromacs developers,



I posted this in the users forum a couple of days ago but haven't had much luck there, so I hope it's OK to re-post it here (apologies if it has appeared multiple times; that was not my intention).



A week ago I tried to extend a protein simulation that uses RDC-derived orientation restraints with gromacs 4.5.3, and it failed. I then found this bug report about extending a distance restraint simulation, which gives an error similar to mine: http://bugzilla.gromacs.org/issues/1174

Mark says this is resolved in the latest version.



So I requested an upgrade to gromacs 4.6.4 on our cluster. This has been compiled using:

Intel compilers: v13.1.2

MKL Libraries: v11.1

fftw v3.3.3

OpenMPI library version v1.6.4



When I run an initial test with this command, where nprocs = 4, it works fine:



mpirun -np $nprocs mdrun_mpi -pd -deffnm pr_test_md -s pr_test_md -cpi pr_test_md.cpt -nice 0 >& pr_test_md.out



But when I try to extend it, I get this segmentation fault in my log file pr_test_md.out:



Reading file pr_test_md.tpr, VERSION 4.6.4 (single precision)
[green0069:28137] * Process received signal *
[green0069:28137] Signal: Segmentation fault (11)
[green0069:28137] Signal code: Address not mapped (1)
[green0069:28137] Failing at address: 0xc0
[green0069:28137] [ 0] /lib64/libpthread.so.0(+0xf500) [0x7fd91a329500]
[green0069:28137] [ 1] /lib64/libc.so.6(_IO_vfprintf+0x39) [0x7fd918efcd49]
[green0069:28137] [ 2] /lib64/libc.so.6(_IO_fprintf+0x88) [0x7fd918f07a28]
[green0069:28137] [ 3] mdrun_mpi(init_orires+0x7f8) [0x7abdf8]
[green0069:28137] [ 4] mdrun_mpi(mdrunner+0x1e74) [0x433c74]
[green0069:28137] [ 5] mdrun_mpi(cmain+0xdea) [0x446f1a]
[green0069:28137] [ 6] mdrun_mpi(main+0x4b) [0x44da1b]
[green0069:28137] [ 7] /lib64/libc.so.6(__libc_start_main+0xfd) [0x7fd918ed7cdd]
[green0069:28137] [ 8] mdrun_mpi() [0x42d419]
[green0069:28137] * End of error message *



My understanding is that the signal code "Address not mapped" means that the program tried to access a memory location that is not part of the process's address space (e.g. through a null pointer). What follows is a backtrace of the functions that were executing at the time, innermost call first, as found on the stack. Frames 1 and 2 are the libc fprintf routines and frame 3 is init_orires in mdrun_mpi, so I suspect that the problem lies in Gromacs rather than in OpenMPI.



So I'm guessing that this frame is where the problem originates:

mdrun_mpi(init_orires+0x7f8) [0x7abdf8]
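
To make this reading of the backtrace concrete, here is a minimal Python sketch that walks the frames from the innermost call outwards and reports the first one that does not belong to a system library. The names FRAME_RE, first_application_frame and system_prefixes are purely illustrative, and treating anything under /lib or /usr/lib as a system library is an assumption of the sketch, not something Gromacs or OpenMPI provides.

```python
import re

# Matches backtrace lines of the form seen in the log above, e.g.
#   [green0069:28137] [ 3] mdrun_mpi(init_orires+0x7f8) [0x7abdf8]
FRAME_RE = re.compile(
    r"\[\s*(?P<index>\d+)\]\s+"                  # frame number, e.g. "[ 3]"
    r"(?P<module>[^\s(]+)"                       # binary or shared library
    r"\((?P<symbol>[^+)]*)"                      # symbol name (may be empty)
    r"(?:\+(?P<offset>0x[0-9a-fA-F]+))?\)"       # optional +offset
)

# A few frames copied from the log above.
TRACE = """\
[green0069:28137] Failing at address: 0xc0
[green0069:28137] [ 0] /lib64/libpthread.so.0(+0xf500) [0x7fd91a329500]
[green0069:28137] [ 1] /lib64/libc.so.6(_IO_vfprintf+0x39) [0x7fd918efcd49]
[green0069:28137] [ 2] /lib64/libc.so.6(_IO_fprintf+0x88) [0x7fd918f07a28]
[green0069:28137] [ 3] mdrun_mpi(init_orires+0x7f8) [0x7abdf8]
[green0069:28137] [ 4] mdrun_mpi(mdrunner+0x1e74) [0x433c74]
"""


def first_application_frame(log_text, system_prefixes=("/lib", "/usr/lib")):
    """Return (index, module, symbol, offset) of the innermost frame that is
    not in a system library, or None if no such frame is found.

    system_prefixes is a heuristic assumption made for this sketch."""
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue  # not a backtrace frame (e.g. the "Failing at" line)
        if match.group("module").startswith(system_prefixes):
            continue  # skip libc / libpthread frames
        return (
            int(match.group("index")),
            match.group("module"),
            match.group("symbol"),
            match.group("offset"),
        )
    return None


print(first_application_frame(TRACE))
# prints (3, 'mdrun_mpi', 'init_orires', '0x7f8')
```

The frames skipped before that one are the signal handler and the libc fprintf routines, so the crash happened inside an fprintf call made from init_orires. A small failing address such as 0xc0 is what one would expect if a null pointer to a structure were dereferenced at a field offset, but this is only a guess from the trace.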



Would anyone be able to help me with this? The simulation I need to run takes about 180 hours of walltime and I am only allowed 60 hours per job, hence I need to extend my initial run.



Many thanks,



Alistair Bailey