[gmx-users] parallel run hangs (not crashed)

Mark Abraham Mark.Abraham at anu.edu.au
Wed May 17 05:00:26 CEST 2006


chris.neale at utoronto.ca wrote:
> I am running a system of 185K atoms. The structure is energy minimized and the
> dynamics run appears to be going smoothly until it just hangs. The job still
> exists on the first node, but none of the 4 nodes are doing any work and I don't
> get any error messages.

> GROMPP:
> ${ED}/grompp -np 4 -f grompp_md.mdp -n ${MOL}.ndx -c ${MOL}_m.gro -p ${MOL}.top
> -o ${MOL}_mm.tpr > output.mm_grompp
> 
> MDRUN_MPI:
> ${ED}/mdrun_mpi -np 4 -nice 4 -s ${MOL}_mm.tpr -o ${MOL}_mm.trr -c ${MOL}_mm.gro
> -g output.mm_mdrun -v -deffnm run1g 2> output.mm_mdrun_e
> 
> LAM SCRIPT:
> #!/bin/sh
> PATH=.:/work/lam/bin:$PATH
> LAMRSH="ssh -x"
> export LAMRSH PATH
> cd ${MYDIR}
> lamboot -v lamhosts
> mpirun N ${MYDIR}/run.sh
> lamhalt

mpirun may not play nicely with a single-processor script that 
subsequently runs an MPI child process, which is what "mpirun N run.sh" 
amounts to if run.sh in turn calls mdrun_mpi. Check that this isn't your 
problem, and also that mpirun isn't a local wrapper around the normal 
one, as might happen at a supercomputer facility.
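
A rough way to check the wrapper question is sketched below. It is only an illustration: classify_launcher is a made-up helper name, and the test (does the file start with "#!") is an assumption about what a site wrapper looks like, not a definitive check. The commented-out line at the end reuses the flags from the quoted commands to show mpirun starting mdrun_mpi directly instead of going through run.sh.

```shell
#!/bin/sh
# Hypothetical helper (not part of LAM or GROMACS): report whether a
# launcher is a "script" (starts with "#!"), a "binary", or "missing".
classify_launcher() {
    case "$1" in
        */*) path=$1 ;;                                   # explicit path given
        *)   path=$(command -v "$1" 2>/dev/null) || path= ;;  # look up in PATH
    esac
    # Shell functions and aliases are not regular files and count as missing.
    if [ -z "$path" ] || [ ! -f "$path" ]; then
        echo "missing"
        return 0
    fi
    # A shell wrapper normally begins with the two bytes "#!";
    # anything else is reported as a binary.
    if [ "$(head -c 2 "$path" 2>/dev/null)" = "#!" ]; then
        echo "script"
    else
        echo "binary"
    fi
}

echo "mpirun is: $(classify_launcher mpirun)"

# Assumed alternative to "mpirun N run.sh", with mpirun launching the MPI
# binary itself (paths and file names as in the quoted commands):
#   mpirun -np 4 ${ED}/mdrun_mpi -np 4 -s ${MOL}_mm.tpr -deffnm run1g
```

If this reports "script", reading that file should show which real mpirun it ends up calling and with what arguments.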

Mark


