[gmx-users] question about Gromacs4.0.7: parallel run
Justin A. Lemkul
jalemkul at vt.edu
Tue May 25 01:59:17 CEST 2010
Yi Peng wrote:
> Hi, everyone,
>
> Recently our school upgraded the clusters for us, and they installed
> Gromacs-4.0.7. Before, I always used Gromacs-4.0.3, and the script I
> used for parallel runs worked well.
>
> My script is as follows:
>
> #PBS -l nodes=4:ppn=2
> #PBS -N pr-impd1-wt
> #PBS -j oe
> # load the Gromacs and Open MPI environment modules
> module load gromacs
> module load openmpi-intel
> cd $PBS_O_WORKDIR
> # count the processors allocated by PBS
> NPROCS=`wc -l < $PBS_NODEFILE`
> # stage the input file to node-local scratch and run there
> /usr/local/bin/pbsdcp -s pr.tpr $TMPDIR
> cd $TMPDIR
> mpiexec mdrun -multi $NPROCS -maxh 100 -s pr.tpr -e pr.edr -o pr.trr \
>     -g pr.log -c pr.gro
> # copy all output back to the submission directory
> /usr/local/bin/pbsdcp -g '*' $PBS_O_WORKDIR
> cd $PBS_O_WORKDIR
>
> But today when I tried to use Gromacs-4.0.7 for this, it always failed
> with the following error message:
> -------------------------------------------------------
> Program mdrun, VERSION 4.0.7
> Source code file: gmxfio.c, line: 737
>
> Can not open file:
> pr7.tpr
> -------------------------------------------------------
>
> "Your Bones Got a Little Machine" (Pixies)
>
> Error on node 7, will try to stop all the nodes
> Halting parallel program mdrun on CPU 7 out of 8
>
> gcq#212: "Your Bones Got a Little Machine" (Pixies)
>
>
> -------------------------------------------------------
> Program mdrun, VERSION 4.0.7
> Source code file: gmxfio.c, line: 737
>
> Can not open file:
> pr5.tpr
> -------------------------------------------------------
>
> "Your Bones Got a Little Machine" (Pixies)
>
>
> gcq#212: "Your Bones Got a Little Machine" (Pixies)
>
> Error on node 5, will try to stop all the nodes
> Halting parallel program mdrun on CPU 5 out of 8
>
> -------------------------------------------------------
> Program mdrun, VERSION 4.0.7
> Source code file: gmxfio.c, line: 737
>
> Can not open file:
> pr4.tpr
> -------------------------------------------------------
>
> "Your Bones Got a Little Machine" (Pixies)
>
>
> gcq#212: "Your Bones Got a Little Machine" (Pixies)
>
> Error on node 4, will try to stop all the nodes
> Halting parallel program mdrun on CPU 4 out of 8
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 6 in communicator MPI_COMM_WORLD
> with errorcode -1.
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
>
> How can I solve this problem? Where do I add a number to each input
> file for pr.tpr?
>
The use of -multi implies that you have a series of .tpr files numbered from
zero, i.e. pr0.tpr, pr1.tpr, etc., so the input files have to be named as mdrun
expects them to be. The name given to the -s flag is only a prefix; mdrun
appends the simulation index itself. See, for instance:
http://www.gromacs.org/Documentation/How-tos/REMD#Execution_Steps
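
For example, to prepare eight numbered inputs you could run grompp once per
simulation. This is just a minimal sketch; pr.mdp, conf${i}.gro, and topol.top
are placeholders for whatever input files you actually have:

for i in 0 1 2 3 4 5 6 7; do
    # one numbered .tpr per simulation
    grompp -f pr.mdp -c conf${i}.gro -p topol.top -o pr${i}.tpr
done
mpiexec mdrun -multi 8 -maxh 100 -s pr.tpr -e pr.edr -o pr.trr \
    -g pr.log -c pr.gro

mdrun then opens pr0.tpr through pr7.tpr, one per simulation, and numbers the
output files (pr0.log, pr0.trr, etc.) the same way.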
-Justin
> Thanks!
>
> Yi
>
--
========================================
Justin A. Lemkul
Ph.D. Candidate
ICTAS Doctoral Scholar
MILES-IGERT Trainee
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin
========================================