[gmx-users] Gromacs 4.6 crashes in PBS queue system

Mark Abraham mark.j.abraham at gmail.com
Sun Feb 17 10:34:37 CET 2013


On Sat, Feb 16, 2013 at 11:27 PM, Tomek Wlodarski <tomek.wlodarski at gmail.com> wrote:

> Hi!
>
> I have a problem running GROMACS 4.6 in a PBS queue...
> I end up with this error:
>
>
> [n370:03036] [[19430,0],0]-[[19430,1],8] mca_oob_tcp_msg_recv: readv
> failed: Connection reset by peer (104)
> --------------------------------------------------------------------------
> mpirun noticed that process rank 18 with PID 616 on node n344 exited on
> signal 4 (Illegal instruction).
> --------------------------------------------------------------------------
> [n370:03036] 3 more processes have sent help message
> help-opal-shmem-mmap.txt / mmap on nfs
> [n370:03036] Set MCA parameter "orte_base_help_aggregate" to 0 to see all
> help / error messages
> 3 total processes killed (some possibly by mpirun during cleanup)
>
> I run the same PBS job scripts with the older GROMACS 4.5.5 (installed with
> the same OpenMPI, GCC and FFTW) and everything works (see the sketch below).
>
> Also, when I run GROMACS directly on the access node:
>
> mpirun -np 32 /home/users/didymos/gromacs/bin/mdrun_mpi -v -deffnm
> protein-EM-solvated -c protein-EM-solvated.gro
>
> it runs fine.
> Any ideas?
>
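
(For context, a minimal sketch of the kind of PBS job script described above,
assuming a Torque/PBS cluster; the job name, resource request, walltime and
module line are placeholder assumptions, and only the mdrun_mpi command line is
taken from the message itself.)

#!/bin/bash
#PBS -N protein-EM-solvated
#PBS -l nodes=4:ppn=8
#PBS -l walltime=24:00:00
#PBS -j oe

# Run from the directory the job was submitted from
cd $PBS_O_WORKDIR

# Load the same MPI stack GROMACS was built against (module name assumed)
module load openmpi

mpirun -np 32 /home/users/didymos/gromacs/bin/mdrun_mpi -v \
    -deffnm protein-EM-solvated -c protein-EM-solvated.gro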

Nope. Those messages are all from mpirun, emitted after GROMACS had already
written whatever diagnostics it was going to write. We'd need to see what
GROMACS wrote to stderr and stdout, plus the head and tail of the .log file.
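
For example, something like the following (run in the job's working directory;
the job-id suffix and merged PBS output file name are assumptions, and the .log
name follows from the -deffnm used above) would gather that:

# PBS puts the job's stdout/stderr in <jobname>.o<jobid> / <jobname>.e<jobid>
cat protein-EM-solvated.o12345 protein-EM-solvated.e12345

# First and last ~50 lines of the mdrun log written by -deffnm
head -n 50 protein-EM-solvated.log
tail -n 50 protein-EM-solvated.log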

Mark


