[gmx-users] mdrun_mpi in HP_MPI LSF/SLURM setup

Larcombe, Lee l.larcombe at cranfield.ac.uk
Fri Apr 15 16:13:52 CEST 2011


Hi gmx-users

We have an HPC setup running HP-MPI and LSF/SLURM. Gromacs 4.5.3 has been compiled with MPI support.
The compute nodes on the system contain 2 x dual-core Xeons, which the system sees as 4 processors.

The LSF script, gromacs_run.lsf, is shown below:

#BSUB -N
#BSUB -J "gromacsTest5"
#BSUB -u l.larcombe at cranfield.ac.uk
#BSUB -n 4
#BSUB -q short
#BSUB -o %J.log
mpirun -srun mdrun_mpi -v -s xxx.tpr -o xxx.trr

Queued with:

bsub < gromacs_run.lsf

This is intended to run one mdrun on a single node, using all four cores of the two Xeons. What actually happens is that, although the job is only submitted to one compute node, 4 mdruns are launched on each of the 4 cores, i.e. 16 processes in total. They all behave as if mdrun had not been compiled with MPI support.
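As a diagnostic (just a sketch of what I had in mind - hostname here stands in for mdrun_mpi, and it would need to run from inside the same bsub allocation), I was going to count how many tasks the launcher actually starts:

# count the tasks that "mpirun -srun" launches for this allocation
mpirun -srun hostname | sort | uniq -c

If that prints comp195 sixteen times, the over-subscription is presumably coming from the launcher rather than from mdrun_mpi itself.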

If I tell srun to start just one task with "mpirun -srun -n1 mdrun_mpi -v -s xxx.tpr -o xxx.trr", it starts one job on each core (4 in total) instead of 4 per core:

NNODES=1, MYRANK=0, HOSTNAME=comp195
NNODES=1, MYRANK=0, HOSTNAME=comp195
NNODES=1, MYRANK=0, HOSTNAME=comp195
NNODES=1, MYRANK=0, HOSTNAME=comp195

Logs show 4 mdrun_mpi starts and 4 file read-ins, and I get 4 copies of all the run files in the CWD. I am sure that mdrun_mpi is indeed compiled with MPI support - although our sysadmin did that, not me. For example, if I try to execute "mdrun_mpi -h" I get a message from HP-MPI, and have to execute "mpirun mdrun_mpi -h" to see the help text.
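The check I had in mind to confirm the MPI linkage (assuming mdrun_mpi is on the PATH and is dynamically linked) is just:

# confirm the binary is linked against the HP-MPI shared libraries
ldd $(which mdrun_mpi) | grep -i mpi

If the HP-MPI libraries show up there, the binary itself should be fine and the duplication presumably comes from how the tasks are being launched.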

Does anyone have any experience of running with this setup - any ideas?

Thanks
Lee


