[gmx-users] GROMACS parallel on multiple nodes - ERROR

David spoel at xray.bmc.uu.se
Thu Feb 26 21:40:01 CET 2004


On Thu, 2004-02-26 at 20:02, Peter Spijker wrote:
> Hi all,
> 
> At this moment I am trying to install GROMACS on a Linux Beowulf Cluster. I
> compiled the code with this command:
> 
> ---
> 
> $ ./configure --enable-float --prefix=/ul/pspijker/exec/gromacs_mpi --enable-mpi --without-motif-libraries
> 
> ---
> 
> Everything compiled fine, and running the code on one node gave no
> problems. However, I suspect that it uses only one processor, since the
> runtime for the same simulation was no different on 2 processors of one
> node than on 1 processor of that node.
> 
> I am using a PBS script to start my job on the Linux Beowulf Cluster. The
> commands to start GROMACS are (where $MDP, $TPR, $GRO, $TOP and $NDX are the
> files and $NOD is the number of nodes):
> 
> ---
> 
> #!/bin/csh
> 
> #PBS -l nodes=4:ppn=1
> #PBS -N GROMACS
> #PBS -q work
> #PBS -o std.out
> #PBS -e std.err
> #PBS -m e
> 
> ### Set variables
> set NOD=4
> 
> ### Script Commands
> cd $PBS_O_WORKDIR
> 
> ### Set Environments
> setenv CONV_RSH ssh
> setenv LD_LIBRARY_PATH "/usr/lib"
> 
> ### Write info about nodes used
> set n=`wc -l < $PBS_NODEFILE`
> echo 'PBS_NODEFILE ' $PBS_NODEFILE ' has ' $n ' lines'
> cat $PBS_NODEFILE
> echo
> 
> ### Run simulation
> lamboot $PBS_NODEFILE
> /ul/pspijker/exec/gromacs_mpi/i686-pc-linux-gnu/bin/grompp -f $MDP -c $GRO -p $TOP -o $TPR -np $NOD -deshuf $NDX -shuffle -sort
> /ul/pspijker/exec/gromacs_mpi/i686-pc-linux-gnu/bin/mdrun -s $TPR -np $NOD
> 
> ### Exit
> echo
> exit 0
> 
> ---
> 
> I cannot see anything really wrong with the script. When I run it, the
> following information is written to the error file std.err:

You need to start the run using mpirun:

mpirun -c 4 mdrun
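
As a rough sketch of how the "Run simulation" part of the job could look with mdrun launched through mpirun: this is written in POSIX sh rather than the csh of the original script, and it assumes that LAM/MPI's lamboot, mpirun and lamhalt are on the PATH. The run helper, the DRYRUN switch, and the fallback names nodes.txt and topol.tpr are placeholders made up for illustration; the mdrun path and the -s/-np arguments are taken from the script quoted above.

```sh
#!/bin/sh
# Sketch only: launch mdrun through LAM's mpirun instead of calling it directly.
# Hypothetical defaults; in a real PBS job these come from the environment.
NOD=${NOD:-4}                          # number of MPI processes
TPR=${TPR:-topol.tpr}                  # placeholder run input file name
NODEFILE=${PBS_NODEFILE:-nodes.txt}    # placeholder when not run under PBS
MDRUN=${MDRUN:-/ul/pspijker/exec/gromacs_mpi/i686-pc-linux-gnu/bin/mdrun}
DRYRUN=${DRYRUN:-1}                    # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRYRUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

run lamboot "$NODEFILE"                             # start the LAM daemons
run mpirun -c "$NOD" "$MDRUN" -s "$TPR" -np "$NOD"  # NOD copies of mdrun
run lamhalt                                         # shut LAM down afterwards
```

With DRYRUN=1 the script only echoes the three commands, which makes it easy to check the mpirun line before submitting; setting DRYRUN=0 executes them. grompp is left out here because it is a serial preprocessing step and does not need mpirun.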


> ---
> 
> -----------------------------------------------------------------------------
> LAM attempted to execute a process on the remote node
> "node1-14.wag.caltech.edu",
> but received some output on the standard error.
> 
> LAM tried to use the remote agent command "/usr/bin/ssh"
> to invoke "echo $SHELL" on the remote node.
> 
> Try invoking the following command at the unix command line:
> 
>         /usr/bin/ssh node1-14.wag.caltech.edu -n echo $SHELL
> 
> When you can get this command to execute successfully by hand, LAM
> will probably be able to function properly.
> -----------------------------------------------------------------------------
> 
> -----------------------------------------------------------------------------
> lamboot encountered some error (see above) during the boot process,
> and will now attempt to kill all nodes that it was previously able to
> boot (if any).
> 
> Please wait for LAM to finish; if you interrupt this process, you may
> have LAM daemons still running on remote nodes.
> -----------------------------------------------------------------------------
> 
> -----------------------------------------------------------------------------
> It seems that there is no lamd running on this host, which indicates
> that the LAM/MPI runtime environment is not operating.  The LAM/MPI
> runtime environment is necessary for MPI programs to run (the MPI
> program tired to invoke the "MPI_Init" function).
> 
> Please run the "lamboot" command the start the LAM/MPI runtime
> environment.  See the LAM/MPI documentation for how to invoke
> "lamboot" across multiple machines.
> -----------------------------------------------------------------------------
> 
> ---
> 
> I tried running the specified command by hand and it replied correctly with:
> /bin/tcsh
> Does this mean I have to change something in the environment? I cannot
> understand why it works with one node, but not with multiple.
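
The LAM message above complains about output on standard error, not about the reply itself, so it may be worth checking whether the ssh command prints anything to stderr when it is run from the node where the job starts. A small sketch in POSIX sh; check_stderr is a hypothetical helper name, not part of LAM or GROMACS:

```sh
#!/bin/sh
# Sketch: report whether a command writes anything to standard error.
check_stderr() {
    # Send stderr into the capture, then discard stdout.
    err=$("$@" 2>&1 >/dev/null)
    if [ -n "$err" ]; then
        echo "stderr output: $err"
        return 1
    fi
    echo "no stderr output"
}

# Local demonstration with a command that writes only to stdout.
check_stderr sh -c 'echo /bin/tcsh'    # prints: no stderr output
```

For the case in this thread the call would be something like

    check_stderr /usr/bin/ssh node1-14.wag.caltech.edu -n 'echo $SHELL'

run from inside a PBS job rather than from the login node, since the two environments may differ.
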
> 
> If someone can help me, I would really appreciate it.
> 
> Kind regards,
> 
> Peter Spijker
> 
> ---
> 
> Fulbright Fellow - The Netherland-America Foundation
> 
> California Institute of Technology
> Biochemistry & Molecular Biophysics
> Materials Process and Simulation Center
> MC 139-74 Caltech
> Pasadena, CA-91125
> The United States of America
> 
> Phone: (626)-395-2844
> E-mail: pspijker at wag.caltech.edu
> 
> 
-- 
David.
________________________________________________________________________
David van der Spoel, PhD, Assist. Prof., Molecular Biophysics group,
Dept. of Cell and Molecular Biology, Uppsala University.
Husargatan 3, Box 596,  	75124 Uppsala, Sweden
phone:	46 18 471 4205		fax: 46 18 511 755
spoel at xray.bmc.uu.se	spoel at gromacs.org   http://xray.bmc.uu.se/~spoel
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++



