[gmx-users] parallel run hangs (not crashed)
Mark Abraham
Mark.Abraham at anu.edu.au
Wed May 17 05:00:26 CEST 2006
chris.neale at utoronto.ca wrote:
> I am running a system of 185K atoms. The structure is energy minimized and the
> dynamics run appears to be going smoothly until it just hangs. The job still
> exists on the first node, but none of the 4 nodes are doing any work and I don't
> get any error messages.
> GROMPP:
> ${ED}/grompp -np 4 -f grompp_md.mdp -n ${MOL}.ndx -c ${MOL}_m.gro -p ${MOL}.top
> -o ${MOL}_mm.tpr > output.mm_grompp
>
> MDRUN_MPI:
> ${ED}/mdrun_mpi -np 4 -nice 4 -s ${MOL}_mm.tpr -o ${MOL}_mm.trr -c ${MOL}_mm.gro
> -g output.mm_mdrun -v -deffnm run1g 2> output.mm_mdrun_e
>
> LAM SCRIPT:
> #!/bin/sh
> PATH=.:/work/lam/bin:$PATH
> LAMRSH="ssh -x"
> export LAMRSH PATH
> cd ${MYDIR}
> lamboot -v lamhosts
> mpirun N ${MYDIR}/run.sh
> lamhalt
mpirun may not play nicely with a single-processor script that
subsequently starts an MPI child process. Check that this isn't your
problem, and also that your mpirun isn't a local wrapper around the
normal one, as can happen at a supercomputer facility.
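
One rough way to check the wrapper question is to look at what the shell
actually resolves mpirun to, and whether that file is itself a script. The
sketch below is only an illustration: describe_launcher is a made-up helper
name, and "starts with #!" is a heuristic for spotting a script wrapper, not
a LAM or GROMACS feature.

```shell
#!/bin/sh
# Sketch: report whether a launcher found on PATH is a script or a binary.
# describe_launcher is a hypothetical helper, not part of LAM or GROMACS.
describe_launcher() {
    name=$1
    # command -v prints the path the shell would execute for this name.
    path=$(command -v "$name" 2>/dev/null)
    if [ -z "$path" ] || [ ! -f "$path" ]; then
        echo "$name: not found as a file on PATH"
        return 1
    fi
    # A file beginning with "#!" is an interpreted script, so possibly a wrapper.
    if [ "$(head -c 2 "$path" 2>/dev/null)" = "#!" ]; then
        echo "$name: $path (script wrapper)"
    else
        echo "$name: $path (not a script)"
    fi
}

# Check the launcher used in the LAM script; keep going even if it is absent.
describe_launcher mpirun || true
```

If this reports a script wrapper, reading that script should show which real
mpirun it ends up calling and with what arguments. The same check can be run
on each node, since PATH may differ between them.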
Mark