[gmx-users] Gromacs runs with SGE and LAM-MPI
Stéphane Teletchéa
steletch at jouy.inra.fr
Wed Jan 11 11:07:04 CET 2006
I'm encountering difficulties launching jobs on the cluster when submitting them through SGE.
I'm using the benchmark molecules as references for the jobs, to be sure
the input parameters are not the problem.
My submission script is as follows:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#!/bin/bash
#$ -S /bin/bash
#$ -V
#$ -M steletch at jouy.inra.fr
#$ -m eas
#$ -cwd
#$ -o ~/bench/gromacs3.3/d.villin/s64LAM_8_noht.q-8/d.villin-s64LAM.out
#$ -e ~/bench/gromacs3.3/d.villin/s64LAM_8_noht.q-8/d.villin-s64LAM.err
~/Programmes/gromacs-3.3_s64LAM/bin/grompp \
-f ~/Benchmark_Gromacs/d.villin/grompp.mdp \
-p ~/Benchmark_Gromacs/d.villin/topol.top \
-c ~/Benchmark_Gromacs/d.villin/conf.gro \
-o ~/d.villin/s64LAM_8_noht.q-8/d.villin_s64LAM_8_noht.q.tpr \
-po ~/d.villin/s64LAM_8_noht.q-8/d.villin_s64LAM_8_noht.q.mdp \
-np 8 \
-nov
/usr/local/public/lam/bin/mpirun -np 8 ~/gromacs-3.3_s64LAM/bin/mdrun \
-s ~/d.villin/s64LAM_8_noht.q-8/d.villin_s64LAM_8_noht.q.tpr \
-o ~/d.villin/s64LAM_8_noht.q-8/d.villin_s64LAM_8_noht.q.trr \
-c ~/d.villin/s64LAM_8_noht.q-8/d.villin_s64LAM_8_noht.q.gro \
-g ~/d.villin/s64LAM_8_noht.q-8/d.villin_s64LAM_8_noht.q.log \
-e ~/d.villin/s64LAM_8_noht.q-8/d.villin_s64LAM_8_noht.q.edr
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If I run the commands interactively, the run starts and executes
flawlessly (GROMACS and LAM/MPI interact as expected). I only run into
problems when launching the jobs through SGE (which is mandatory since
we share the cluster amongst users).
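For reference, a sketch of how such a script would typically be submitted
under SGE; the parallel environment name and script filename below are
placeholders, not necessarily what our site uses:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Hypothetical submission command: "lam" stands for whatever parallel
# environment has been configured for LAM/MPI on the cluster, and the
# script name is illustrative only.
qsub -pe lam 8 d.villin-s64LAM.sh
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~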
In the error logs, I get:
-----------------------------------------------------------------------------
The selected RPI failed to initialize during MPI_INIT. This is a
fatal error; I must abort.
This occurred on host n57 (n2).
The PID of failed process was 2958 (MPI_COMM_WORLD rank: 2)
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.
PID 2570 failed on node n0 (192.168.1.58) with exit status 1.
-----------------------------------------------------------------------------
We're working hard on it, but I thought some help from the list could
point us in the right direction.
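Could the problem simply be that the LAM universe is never booted on the
nodes SGE allocates to the job? A minimal sketch of booting it from inside
the job script, assuming the parallel environment sets $PE_HOSTFILE and
that non-interactive rsh/ssh between the nodes works (the hostfile
conversion below is simplistic):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Sketch only: boot LAM on the hosts SGE allocates, run mdrun, halt LAM.
# Assumes $PE_HOSTFILE lines are "hostname slots queue flag", so we keep
# the first two columns in LAM's "host cpu=N" boot-schema format.
awk '{print $1 " cpu=" $2}' $PE_HOSTFILE > $TMPDIR/lamhosts
/usr/local/public/lam/bin/lamboot -v $TMPDIR/lamhosts

/usr/local/public/lam/bin/mpirun -np 8 ~/gromacs-3.3_s64LAM/bin/mdrun \
    -s ~/d.villin/s64LAM_8_noht.q-8/d.villin_s64LAM_8_noht.q.tpr
# (plus the same remaining mdrun flags as in the script above)

/usr/local/public/lam/bin/lamhalt
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~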
Thanks a lot in advance for your answers,
S. Téletchéa
More information:
System: Mandriva Linux LE2005, 64-bit edition
Gromacs version 3.3 (64-bit, single precision)
SGE version 6.0u6
LAM/MPI (laminfo output):
LAM/MPI: 7.1.1
Prefix: /usr/local/public/lam-7.1.1
Architecture: x86_64-unknown-linux-gnu
Configured by: root
Configured on: Mon Jan 9 15:33:44 CET 2006
Configure host: adm3
Memory manager: ptmalloc2
C bindings: yes
C++ bindings: yes
Fortran bindings: yes
C compiler: gcc
C++ compiler: g++
Fortran compiler: g77
Fortran symbols: double_underscore
C profiling: yes
C++ profiling: yes
Fortran profiling: yes
C++ exceptions: no
Thread support: yes
ROMIO support: yes
IMPI support: no
Debug support: no
Purify clean: no
SSI boot: globus (API v1.1, Module v0.6)
SSI boot: rsh (API v1.1, Module v1.1)
SSI boot: slurm (API v1.1, Module v1.0)
SSI coll: lam_basic (API v1.1, Module v7.1)
SSI coll: shmem (API v1.1, Module v1.0)
SSI coll: smp (API v1.1, Module v1.2)
SSI rpi: crtcp (API v1.1, Module v1.1)
SSI rpi: lamd (API v1.0, Module v7.1)
SSI rpi: sysv (API v1.0, Module v7.1)
SSI rpi: tcp (API v1.0, Module v7.1)
SSI rpi: usysv (API v1.0, Module v7.1)
SSI cr: self (API v1.0, Module v1.0)
--
Stéphane Téletchéa, PhD. http://www.steletch.org
Unité Mathématique Informatique et Génome http://migale.jouy.inra.fr/mig
INRA, Domaine de Vilvert Tél : (33) 134 652 121 / 3086
78352 Jouy-en-Josas cedex, France Fax : (33) 134 652 901