[gmx-users] Segmentation fault error from mdrun

rainy908 rainy908 at yahoo.com
Wed Dec 7 21:36:13 CET 2011


Hi,

I encounter a segmentation fault when running mdrun in parallel with the following command:

# Running Gromacs: read TPR and write output to /gpfs disk
 $MPIRUN  $MDRUN -v -nice 0 -np $NSLOTS \
 -s n12_random_50_protein_all.tpr \
 -o n12_random_50_protein_all.trr \
 -c n12_random_50_protein_all.gro \
 -g n12_random_50_protein_all.log \
 -x n12_random_50_protein_all.xtc \
 -e n12_random_50_protein_all.edr

Error:

[compute-0-7:12377] Failing at address: 0x7159fd0
[compute-0-30:07435] [ 1] mdrun [0x761971]
[compute-0-30:07435] *** End of error message ***
[compute-0-29:15535] [ 0] /lib64/libpthread.so.0 [0x39df60e7c0]
[compute-0-29:15535] [ 1] mdrun [0x761d60]
[compute-0-29:15535] *** End of error message ***
[compute-1-29:19799] [ 0] /lib64/libpthread.so.0 [0x33aac0e7c0]
[compute-1-29:19799] [ 1] mdrun [0x762065]
[compute-1-29:19799] *** End of error message ***
[compute-0-29:15537] [ 0] /lib64/libpthread.so.0 [0x39df60e7c0]
[compute-0-29:15537] [ 1] mdrun [0x762065]
[compute-0-29:15537] *** End of error message ***
[compute-0-29:15536] [ 0] /lib64/libpthread.so.0 [0x39df60e7c0]
[compute-0-29:15536] [ 1] mdrun [0x762065]
[compute-0-29:15536] *** End of error message ***
[compute-1-31:11981] [ 0] /lib64/libpthread.so.0 [0x374f00e7c0]
[compute-1-31:11981] [ 1] mdrun [0x761d60]
[compute-1-31:11981] *** End of error message ***
[compute-1-31:11982] [ 0] /lib64/libpthread.so.0 [0x374f00e7c0]
[compute-1-31:11982] [ 1] mdrun [0x761960]
[compute-1-31:11982] *** End of error message ***
[compute-0-29:15538] [ 0] /lib64/libpthread.so.0 [0x39df60e7c0]
[compute-0-29:15538] [ 1] mdrun [0x761960]
[compute-0-29:15538] *** End of error message ***
[compute-0-7:12377] [ 0] /lib64/libpthread.so.0 [0x387c60e7c0]
[compute-0-7:12377] [ 1] mdrun [0x729641]
[compute-0-7:12377] *** End of error message ***
[compute-1-29:19796] [ 0] /lib64/libpthread.so.0 [0x33aac0e7c0]
[compute-1-29:19796] [ 1] mdrun [0x762065]
[compute-1-29:19796] *** End of error message ***
[compute-1-31.local][[50630,1],32][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
--------------------------------------------------------------------------
mpirun noticed that process rank 35 with PID 32477 on node compute-1-8.local exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

This is a parallel job. According to mpirun, the process on compute-1-8 exited with a segmentation fault, which caused the entire job to fail.
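
To see which nodes reported a crash, the error output can be tallied per host. The following is only a minimal sketch, assuming the job's stderr was saved to a file; the file name mdrun_error.log and the function name count_failures_per_node are placeholders chosen for illustration and are not part of the actual job script.

```shell
#!/bin/sh
# Hypothetical log file holding the captured stderr of the job
# (placeholder name, pass a different path as the first argument).
LOG="${1:-mdrun_error.log}"

# Print the number of "End of error message" lines per host, most
# frequent first. Lines are expected to start with "[host:pid]",
# as in the Open MPI output quoted above.
count_failures_per_node() {
    grep 'End of error message' "$1" \
        | sed -e 's/^\[\([^]:]*\):[0-9]*\].*/\1/' \
        | sort | uniq -c | sort -rn
}

# Only run the tally when the log file actually exists.
if [ -f "$LOG" ]; then
    count_failures_per_node "$LOG"
fi
```

Each output line is a count followed by a host name, so a host with several crashed ranks stands out at the top of the list.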

Any input would be most appreciated.

Lily


