[gmx-users] 1replica/1cpu problem

francesco oteri francesco.oteri at gmail.com
Wed Jul 18 16:32:18 CEST 2012

Dear gromacs users,
I am trying to run a replica exchange simulation using the files you find
in http://dl.dropbox.com/u/40545409/gmx_mailinglist/inputs.tgz

The 4 replicas have been generated as follows:
grompp -p rest2.top -c 03md.gro -n index.ndx -o rest2_0  -f rest2_0.mdp
grompp -p rest2.top -c 03md.gro -n index.ndx -o rest2_1  -f rest2_1.mdp
grompp -p rest2.top -c 03md.gro -n index.ndx -o rest2_2  -f rest2_2.mdp
grompp -p rest2.top -c 03md.gro -n index.ndx -o rest2_3  -f rest2_3.mdp
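For reference, the four grompp invocations above can be collapsed into a short shell loop (a sketch assuming rest2_0.mdp through rest2_3.mdp are in the current directory; the echo prints each command for inspection and can be dropped to actually run grompp):

```shell
# Build one .tpr per replica; each replica differs only in its .mdp file.
for i in 0 1 2 3; do
    echo grompp -p rest2.top -c 03md.gro -n index.ndx -o rest2_${i} -f rest2_${i}.mdp
done
```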

The simulation was started with the following command, using GROMACS 4.5.5
with the latest bug fixes applied:

mpirun -np 4  mdrun_mpi -s rest2_.tpr -multi 4 -replex 1000 >& out1

giving the following error:

[etna:10799] *** An error occurred in MPI_comm_size
[etna:10799] *** on communicator MPI_COMM_WORLD
[etna:10799] *** MPI_ERR_COMM: invalid communicator
[etna:10799] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
mpirun has exited due to process rank 0 with PID 10796 on
node etna exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
[etna:10795] 3 more processes have sent help message
help-mpi-errors.txt / mpi_errors_are_fatal
[etna:10795] Set MCA parameter "orte_base_help_aggregate" to 0 to see
all help / error messages

The strange thing is that the same error does not appear if I use
4.5.5 without applying the patches:

mpirun -np 4  mdrun_mpi -s rest2_.tpr -multi 4 -replex 1000 >& out2

or the patched version with multiple processors per replica:

mpirun -np 8  mdrun_mpi -s rest2_.tpr -multi 4 -replex 1000 >& out3

Since I have to use more than 4 replicas, I need to run 1 CPU per replica.

Does anyone have any idea about this problem?
