[gmx-users] problem with LAM

Carsten Kutzner ckutzne at gwdg.de
Tue Jun 27 10:11:30 CEST 2006


Sridhar Acharya wrote:
> Hi All,
> 
> I am facing a problem with parallel mdrun.
> I tried to run on 2 nodes with the following command. The program reports that lamd is not running, as follows.
> ####################################################################################################
> mpirun -np 2  /users/soft/GromacsSingle/bin/mdrun_mpi -s b4em_1CYP_WT.tpr -o em_1CYP_WT.trr -np 2
> -----------------------------------------------------------------------------
> It seems that there is no lamd running on this host, which indicates
> that the LAM/MPI runtime environment is not operating.  The LAM/MPI
> runtime environment is necessary for MPI programs to run (the MPI
> program tired to invoke the "MPI_Init" function).
> 
> Please run the "lamboot" command the start the LAM/MPI runtime
> environment.  See the LAM/MPI documentation for how to invoke
> "lamboot" across multiple machines.
> -----------------------------------------------------------------------------
> -----------------------------------------------------------------------------
> It seems that [at least] one of the processes that was started with
> mpirun did not invoke MPI_INIT before quitting (it is possible that
> more than one process did not invoke MPI_INIT -- mpirun was only
> notified of the first one, which was on node n0).
> 
> mpirun can *only* be used with MPI programs (i.e., programs that
> invoke MPI_INIT and MPI_FINALIZE).  You can use the "lamexec" program
> to run non-MPI programs over the lambooted nodes.
> -----------------------------------------------------------------------------
> ##################################################################################################
> 
> But lamd is definitely running, because I can get the status of the LAM nodes with the "lamnodes" command.
> ########################################################################################
> [msridhar at cdfd-grid-node17 WT_SINGLE_PARALLEL]$ lamnodes
> n0      cdfd-grid-node2:1:
> n1      cdfd-grid-node4:1:
> n2      cdfd-grid-node12:1:
> n3      cdfd-grid-node13:1:
> n4      cdfd-grid-node14:1:
> n5      cdfd-grid-node16:1:
> n6      cdfd-grid-node17:1:origin,this_node
> ###########################################################################################
> Do I have to define any paths so that it could recognise this?
> 
> Waiting for your suggestions.
> 
> sridhar

Hi Sridhar,

Some things that I would try are:
1. Stop LAM by typing lamhalt, check that no other lamd of yours is still running on those nodes,
    then boot LAM again and retry.
2. Set the paths in your shell startup file (assuming bash-style):
    export PATH=/path-to-your-lam-installation/bin:$PATH
    export LAMHOME=/path-to-your-lam-installation/
3. Can you run something like 'mpirun -np 2 ls'?
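
To go with point 2, here is a rough sketch of a check that shows which directory each LAM tool is picked up from on your PATH. It only assumes a POSIX shell; the helper names tool_dir and report_lam_tools are made up for this example and are not part of LAM or GROMACS.

```shell
#!/bin/sh
# Sketch only: list where the LAM tools resolve on the current PATH.
# tool_dir and report_lam_tools are hypothetical helper names.

# Print the directory that contains a command, or "missing" if it is not on PATH.
tool_dir() {
    path=$(command -v "$1" 2>/dev/null) || { echo "missing"; return 0; }
    dirname "$path"
}

# Print one line per LAM tool: the tool name and the directory it resolves to.
report_lam_tools() {
    for tool in lamboot lamhalt lamnodes mpirun; do
        printf '%-10s %s\n' "$tool" "$(tool_dir "$tool")"
    done
}

report_lam_tools
```

If mpirun shows a different directory than lamboot and lamnodes, the mpirun you are calling may belong to another MPI installation than the lamd that is running, which would fit the error message you see.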


Carsten


-- 
Dr. Carsten Kutzner
Max Planck Institute for Biophysical Chemistry
Theoretical and Computational Biophysics Department
Am Fassberg 11
37077 Goettingen, Germany
Tel. +49-551-2012313, Fax: +49-551-2012302
http://www.mpibpc.mpg.de/research/dep/grubmueller/
http://www.gwdg.de/~ckutzne



