[gmx-users] parallel problems...

David van der Spoel spoel at xray.bmc.uu.se
Tue Feb 1 09:07:29 CET 2005


On Tue, 2005-02-01 at 15:25 +0100, Arvid Soderhall wrote:
> Hi all
> I have a problem running jobs on more than 4 CPUs on our brand new
> cluster. I use a PBS script which launches the command
> ----------------------------------------
> /usr/local/encap/mpich-126-gcc-64/bin/mpirun -machinefile machines -np 4 
> /swc/gromacs/bin/mdrun_mpi -np 4 -v -s RW12_s20_a.tpr -o RW12_s20_a.trr 
> -x RW12_s20_a.xtc -c RW12_s20_a.gro -e RW12_s20_a.edr -g RW12_s20_a.log
> ---------------------------------------------
> This runs nicely, but if I use 6 cpus (or 8) the calculation crashes and 
> an error is reported in the standard out file:
> -----------------------------------------------------------------------------
> rm_10214:  p4_error: semget failed for setnum: 0
> p0_7618: (0.113281) net_recv failed for fd = 7
> p0_7618:  p4_error: net_recv read, errno = : 104
> p0_7618: (4.117188) net_send: could not write to fd=4, errno = 32
> ---------------------------------------------------------------------------------
> Moreover, there are some scattered mdrun_mpi processes that keep on
> running (but never on the master node).
> 
> I have tried to use mpiexec instead of mpirun
> ------------------------------------
> /usr/local/bin/mpiexec -verbose -comm lam -kill -mpich-p4-no-shmem 
> $APPLICATION $RUNFLAGS
> --------------------------------------------
> Then I get a different error message in the error output file:
> --------------------------------------------------------------------------------------
> mpiexec: Warning: parse_args: argument "-mpich-p4-[no-]shmem" ignored since
>   communication library not MPICH/P4.

It seems that you are mixing LAM and MPICH: your mpirun comes from an
MPICH installation, but you tell mpiexec to use "-comm lam". You should
compile with one of them, and then make sure that nothing from the other
implementation is in your path. If you compile with LAM, you should use
the mpirun that belongs to LAM. Actually, the mpiexec that comes with
Rocks Linux determines by itself which library you have used, so you can
omit all the arguments to mpiexec.
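
One rough way to check for such a mix is sketched below. It only guesses
the implementation from the install path of each launcher found in the
path, so it is a heuristic and not a definitive test. The helper name
classify_mpi and the MDRUN variable are made up for this illustration,
and the ldd step assumes that mdrun_mpi is dynamically linked (a static
binary will list no MPI libraries).

```shell
#!/bin/sh
# Sketch: guess which MPI implementation each launcher on PATH belongs to,
# judged only by the naming of its install path (a heuristic, not proof).

# classify_mpi PATH -> prints "mpich", "lam", or "unknown"
# (hypothetical helper, not part of any MPI distribution)
classify_mpi() {
    case "$1" in
        *mpich*|*MPICH*) echo mpich ;;
        *lam*|*LAM*)     echo lam ;;
        *)               echo unknown ;;
    esac
}

# Report where each common MPI tool resolves to, if it is found at all.
for tool in mpirun mpiexec mpicc; do
    path=$(command -v "$tool" 2>/dev/null) || path=""
    echo "$tool: ${path:-not found} -> $(classify_mpi "$path")"
done

# Assumption: mdrun_mpi is dynamically linked; override MDRUN if needed.
MDRUN=${MDRUN:-/swc/gromacs/bin/mdrun_mpi}
if [ -x "$MDRUN" ] && command -v ldd >/dev/null 2>&1; then
    ldd "$MDRUN" | grep -i -E 'mpi|lam' \
        || echo "no MPI shared libraries listed for $MDRUN"
fi
```

If the launchers and the libraries that mdrun_mpi links against do not
all point to the same implementation, that is the first thing to clean
up.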


-- 
David.
________________________________________________________________________
David van der Spoel, PhD, Assoc. Prof., Molecular Biophysics group,
Dept. of Cell and Molecular Biology, Uppsala University.
Husargatan 3, Box 596,          75124 Uppsala, Sweden
phone:  46 18 471 4205          fax: 46 18 511 755
spoel at xray.bmc.uu.se    spoel at gromacs.org   http://xray.bmc.uu.se/~spoel
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++




