[gmx-developers] MPICH2 and parallel Gromacs errors
Casey,Richard
Richard.Casey at ColoState.EDU
Fri Jun 20 18:56:15 CEST 2008
Hello,
Many people appear to have run into this issue. We've searched the discussion archives and tried every recommended solution we could find, without success.
We have MPICH2 v1.0.7 installed on an Apple G5 cluster (64 CPUs), and we installed Gromacs v3.3.3 with the --enable-mpi option.
Single CPU jobs run OK; parallel jobs always fail. For parallel jobs we use:
grompp -v -np 2 -p topol.top (with larger values of -np for more CPUs)
We launch MPD with:
mpdboot -n 2 -f /common/mpich2/mpd.hosts
We run jobs with:
/common/mpich2/bin/mpiexec -l -n 2 \
/common/gromacs/bin/mdrun_mpi -v -np 2 \
-s /Users/richardcasey/topol.tpr \
-g /Users/richardcasey/md.log \
-e /Users/richardcasey/ener.edr \
-o /Users/richardcasey/traj.trr \
-x /Users/richardcasey/traj.xtc \
-c /Users/richardcasey/confout.gro
The output always says:
-------------------------------------------------------
1: Program mdrun_mpi, VERSION 3.3.3
1: Source code file: init.c, line: 69
1:
1: Fatal error:
1: run input file /Users/richardcasey/topol.tpr was made for 2 nodes,
1: p0_29762: p4_error: : -1
1: while mdrun_mpi expected it to be for 1 nodes.
1: -------------------------------------------------------
We've tried many variations on the above, along with the recommendations from the discussion list, but for some reason mdrun_mpi insists that the run input file (topol.tpr) be a single-CPU version. We've checked the environment variables and they appear to point to the right directories. /common is NFS mounted on all nodes.
Completely stumped - no idea what is wrong here. Any suggestions?
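For anyone trying to reproduce this, the three steps above can be collected into a minimal dry-run sketch that only prints the commands, so the process count given to grompp, mpiexec and mdrun_mpi comes from a single variable. The names NP, NHOSTS, RUN_DIR and print_commands are illustrative and not part of our actual setup; the paths and flags are the ones quoted above.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.
# NP, NHOSTS, RUN_DIR and print_commands are illustrative names (assumptions).
NP=2                      # process count shared by grompp, mpiexec and mdrun_mpi
NHOSTS=2                  # number of mpd daemons started by mpdboot
RUN_DIR=/Users/richardcasey

print_commands() {
    # 1. Preprocess the topology for NP processes.
    echo "grompp -v -np $NP -p topol.top"
    # 2. Start the MPD ring.
    echo "mpdboot -n $NHOSTS -f /common/mpich2/mpd.hosts"
    # 3. Launch mdrun_mpi with the same process count on both sides.
    echo "/common/mpich2/bin/mpiexec -l -n $NP /common/gromacs/bin/mdrun_mpi -v -np $NP -s $RUN_DIR/topol.tpr -g $RUN_DIR/md.log -e $RUN_DIR/ener.edr -o $RUN_DIR/traj.trr -x $RUN_DIR/traj.xtc -c $RUN_DIR/confout.gro"
}

print_commands
```

Changing NP in one place keeps the three -np/-n values in step, which rules out a simple mismatch between the grompp and mdrun_mpi invocations.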
--------------------------------------------
Richard Casey