[gmx-users] question about Gromacs 4.0.7: parallel run

Yi Peng muhuohuohuo at gmail.com
Tue May 25 01:53:30 CEST 2010


Hi, everyone,

Recently our school upgraded the clusters and installed Gromacs-4.0.7 for us.
Before this I always used Gromacs-4.0.3, and the script I use for parallel
runs worked well.

My script is as follows:

#PBS -l nodes=4:ppn=2
#PBS -N pr-impd1-wt
#PBS -j oe
module load gromacs
module load openmpi-intel
cd $PBS_O_WORKDIR
NPROCS=`wc -l < $PBS_NODEFILE`
/usr/local/bin/pbsdcp -s pr.tpr $TMPDIR
cd $TMPDIR
mpiexec mdrun -multi $NPROCS -maxh 100 -s pr.tpr -e pr.edr -o pr.trr \
    -g pr.log -c pr.gro
/usr/local/bin/pbsdcp -g '*' $PBS_O_WORKDIR
cd $PBS_O_WORKDIR

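With nodes=4:ppn=2, $PBS_NODEFILE has 8 entries, so $NPROCS is 8 and the
mdrun line effectively runs as follows (my own reconstruction of the
expanded command, not something printed by the scheduler):

    mpiexec mdrun -multi 8 -maxh 100 -s pr.tpr -e pr.edr -o pr.trr \
        -g pr.log -c pr.gro
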
But today, when I tried the same script with Gromacs-4.0.7, it always fails
with the error messages below:
-------------------------------------------------------
Program mdrun, VERSION 4.0.7
Source code file: gmxfio.c, line: 737

Can not open file:
pr7.tpr
-------------------------------------------------------

"Your Bones Got a Little Machine" (Pixies)

Error on node 7, will try to stop all the nodes
Halting parallel program mdrun on CPU 7 out of 8

gcq#212: "Your Bones Got a Little Machine" (Pixies)


-------------------------------------------------------
Program mdrun, VERSION 4.0.7
Source code file: gmxfio.c, line: 737

Can not open file:
pr5.tpr
-------------------------------------------------------

"Your Bones Got a Little Machine" (Pixies)


gcq#212: "Your Bones Got a Little Machine" (Pixies)

Error on node 5, will try to stop all the nodes
Halting parallel program mdrun on CPU 5 out of 8

-------------------------------------------------------
Program mdrun, VERSION 4.0.7
Source code file: gmxfio.c, line: 737

Can not open file:
pr4.tpr
-------------------------------------------------------

"Your Bones Got a Little Machine" (Pixies)


gcq#212: "Your Bones Got a Little Machine" (Pixies)

Error on node 4, will try to stop all the nodes
Halting parallel program mdrun on CPU 4 out of 8
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 6 in communicator MPI_COMM_WORLD
with errorcode -1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------

How can I solve this problem? And how would I add a number to each input
file, since the errors ask for files like pr4.tpr and pr7.tpr instead of
pr.tpr?
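
If -multi really wants one numbered .tpr file per process, would something
like the loop below be the way to prepare them? (This is only my guess from
the pr4.tpr/pr5.tpr/pr7.tpr names in the errors; I have not tried it.)

    # guess: one numbered copy of the run input per simulation,
    # pr0.tpr ... pr7.tpr, which the "Can not open file" errors
    # seem to be asking for
    for i in $(seq 0 7); do
        cp pr.tpr pr${i}.tpr
    done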

Thanks!

Yi