[gmx-users] error in gmx-4.0 with mpi

Vitaly Chaban vvchaban at gmail.com
Sat Dec 13 20:52:57 CET 2008


Hi,

I have a problem running the parallel version of GROMACS 4.0.

When I run GROMACS 4.0 with MPI, the following error appears every time:

NNODES=4, MYRANK=1, HOSTNAME=merlin-3-9
NNODES=4, MYRANK=0, HOSTNAME=merlin-3-9
NNODES=4, MYRANK=2, HOSTNAME=merlin-2-24
NNODES=4, MYRANK=3, HOSTNAME=merlin-2-24
NODEID=0 argc=1
NODEID=1 argc=1
NODEID=2 argc=1
NODEID=3 argc=1
                         :-)  G  R  O  M  A  C  S  (-:

                   Groningen Machine for Chemical Simulation

                           :-)  VERSION 4.0_rc2  (-:


Reading file topol.tpr, VERSION 3.3.3 (single precision)
Note: tpx file_version 40, software version 58

NOTE: The tpr file used for this simulation is in an old format, for less memory usage and possibly more performance create a new tpr file with an up to date version of grompp

Making 1D domain decomposition 1 x 1 x 4

Back Off! I just backed up ener.edr to ./#ener.edr.1#

WARNING: This run will generate roughly 5946 Mb of data

Dec 13 11:34:47 2008 32301 3 6.1 pServe: getMsgBuffer_() failed.
Fatal error (code 0x94213a0f) in MPI_Scatterv():
MPI_Scatterv(324): MPI_Scatterv(sbuf=0x8b8170, scnts=0x82b000, displs=0x82b010, MPI_BYTE, rbuf=0x8f3ce0, rcount=4680, MPI_BYTE, root=0, MPI_COMM_WORLD) failed
MPIC_Send(50): failure
MPIC_Wait(306): failure
MPIDI_CH3_Progress(421): [ch3:sock] failed to connnect to remote process -1:3
MPIDU_Sock_wait(116): connection failure (set=0,sock=4)
ABORT - process 0
Dec 13 11:34:52 2008 32301 4 6.1 PAM: pjl_rwait: Didn't get all TS to report status.
Dec 13 11:34:52 2008 32301 3 6.1 PAM: pWaitRtask(): ls_rwait/pjl_rwait() failed, Communication time out.
Dec 13 11:34:57 2008 32301 4 6.1 PAM: pjl_rwait: Didn't get all TS to report status.
Dec 13 11:34:57 2008 32301 3 6.1 PAM: pWaitRtask(): ls_rwait/pjl_rwait() failed, Communication time out.
Dec 13 11:34:57 2008 32301 3 6.1 pWaitAll(): NIOS is dead

I created topol.tpr with
grompp -np 4
and then started the run with
mpirun.lsf /home/gromacs4.0-mpi/bin/mdrun
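
In full, the job is submitted roughly as follows. This is only a sketch: the input file names are the GROMACS defaults rather than my actual ones, and the bsub options are simply how I request 4 slots on our LSF cluster.

    # build the run input with the 3.3.3 grompp (hence the old-format tpr noted above)
    grompp -np 4 -f grompp.mdp -c conf.gro -p topol.top -o topol.tpr

    # submit 4 MPI processes under LSF; mpirun.lsf takes the process count from the allocation
    bsub -n 4 -o md.%J.out mpirun.lsf /home/gromacs4.0-mpi/bin/mdrun -s topol.tpr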

I have no problems running GROMACS 4.0 on a single processor.
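
(For reference, a serial run started directly, e.g.

    /path/to/gromacs-4.0-serial/bin/mdrun -s topol.tpr

where the path is just a placeholder for my non-MPI build, finishes without any of the errors above.)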

Does anybody have any ideas on how to fix this?

Thank you very much,
Vitaly



