[gmx-users] Open MPI execution using sbatch & mpirun: every command executed several times
Vedat Durmaz
durmaz at zib.de
Tue Dec 12 21:33:48 CET 2017
Hi everybody,
I'm working on an Ubuntu 16.04 system with gromacs-openmpi (5.1.2) installed from the Ubuntu repos. Everything works fine when I submit job.slurm using sbatch, where job.slurm roughly looks like this:
-------------------------
#!/bin/bash -l
#SBATCH -N 1
#SBATCH -n 24
#SBATCH -t 02:00:00
cd $4
cp file1 file2
grompp ...
mpirun -np 24 mdrun_mpi ... ### WORKS FINE
-------------------------
The mdrun_mpi output then contains the following statements:
-------------------------
Using 24 MPI processes
Using 1 OpenMP thread per MPI process
mdrun_mpi -v -deffnm em
Number of logical cores detected (24) does not match the number reported by OpenMP (12).
Consider setting the launch configuration manually!
Running on 1 node with total 24 cores, 24 logical cores
-------------------------
Now I want to put the last three commands of job.slurm into an extra script (run_gmx.sh)
-------------------------
#!/bin/bash
cp file1 file2
grompp ...
mdrun_mpi ...
-------------------------
which I start with mpirun:
-------------------------
#!/bin/bash -l
#SBATCH -N 1
#SBATCH -n 24
#SBATCH -t 02:00:00
cd $4
mpirun -np 24 run_gmx.sh
-------------------------
But now I get strange errors, because every non-MPI program (cp, grompp) is executed 24 times, which causes a disaster in the file system.
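One workaround I've been considering (just a sketch on my part, assuming Open MPI, which exports OMPI_COMM_WORLD_RANK to every process it launches) is to guard the serial commands in run_gmx.sh so that only rank 0 executes them:
-------------------------
#!/bin/bash
# sketch: run the serial preparation steps on rank 0 only
# (OMPI_COMM_WORLD_RANK is set by Open MPI's mpirun for each process)
if [ "${OMPI_COMM_WORLD_RANK:-0}" -eq 0 ]; then
    cp file1 file2
    grompp ...
fi
mdrun_mpi ...
-------------------------
But this feels fragile to me: without a barrier, the other ranks may reach mdrun_mpi before rank 0 has finished grompp.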
If I change job.slurm to
-------------------------
#!/bin/bash -l
#SBATCH -N 1
#SBATCH --ntasks-per-node=1
#SBATCH -t 02:00:00
cd $4
mpirun run_gmx.sh
-------------------------
I get the following error in the output:
Number of logical cores detected (24) does not match the number reported by OpenMP (1).
Consider setting the launch configuration manually!
Running on 1 node with total 24 cores, 24 logical cores
Using 1 MPI process
Using 24 OpenMP threads
Fatal error:
Your choice of 1 MPI rank and the use of 24 total threads leads to the use of 24 OpenMP threads, whereas we expect the optimum to be with more MPI ranks with 1 to 6 OpenMP threads. If you want to run with this many OpenMP threads, specify the -ntomp option. But we suggest to increase the number of MPI ranks.
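If I read that message correctly, mdrun wants many ranks with one OpenMP thread each, roughly like this (a sketch; setting OMP_NUM_THREADS=1 explicitly is my own assumption):
-------------------------
export OMP_NUM_THREADS=1     # one OpenMP thread per MPI rank
mpirun -np 24 mdrun_mpi ...  # 24 ranks, as the error message suggests
-------------------------
which is essentially my first, working setup again, only without the wrapper script that I need.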
And if I use -ntomp:
-------------------------
#!/bin/bash
cp file1 file2
grompp ...
mdrun_mpi -ntomp 24 ...
-------------------------
Things seem to work, but mdrun_mpi is extremely slow, as if it were running on one core only. This is the output:
mdrun_mpi -ntomp 24 -v -deffnm em
Number of logical cores detected (24) does not match the number reported by OpenMP (1).
Consider setting the launch configuration manually!
Running on 1 node with total 24 cores, 24 logical cores
The number of OpenMP threads was set by environment variable OMP_NUM_THREADS to 24 (and the command-line setting agreed with that)
Using 1 MPI process
Using 24 OpenMP threads
NOTE: You requested 24 OpenMP threads, whereas we expect the optimum to be with more MPI ranks with 1 to 6 OpenMP threads.
Non-default thread affinity set probably by the OpenMP library,
disabling internal thread affinity
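My suspicion (an assumption on my part) is that mpirun binds the single rank to one core, so all 24 OpenMP threads end up sharing that core. If so, disabling the binding with Open MPI's --bind-to none might help, along these lines (sketch):
-------------------------
#!/bin/bash -l
#SBATCH -N 1
#SBATCH --ntasks-per-node=1
#SBATCH -t 02:00:00
cd $4
# sketch: don't pin the single rank, so its 24 OpenMP
# threads can spread over all cores of the node
mpirun --bind-to none run_gmx.sh
-------------------------
But even if that cured the speed, it would still leave me with only one MPI rank.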
What am I doing wrong? What is the proper setting for my goal? I need to use an extra script executed with mpirun, and I somehow need to reduce the 24 executions of the serial commands to just one!
Any useful hint is appreciated.
Take care,
Vedat