[gmx-users] CPU running doesn't match command line
Albert
mailmd2011 at gmail.com
Mon Aug 22 17:36:33 CEST 2016
Hello Mark:
I've recompiled GROMACS without MPI and submitted the jobs with the
command lines you suggested:
gmx mdrun -ntomp 10 -v -g test.log -pin on -pinoffset 0 -gpu_id 0 -s test.tpr >& test.info
gmx mdrun -ntomp 10 -v -g test.log -pin on -pinoffset 10 -gpu_id 1 -s test.tpr >& test.info
I specified 20 CPU cores in total, but I noticed that only 15 cores were
actually being used. I am pretty confused by that.
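To see where the threads actually land, it is worth looking at the per-thread
processor assignment and the affinity masks while both jobs run, e.g.
(a sketch; the "gmx" pattern and <PID> are placeholders):

ps -eLo pid,tid,psr,pcpu,comm | grep gmx    # psr = logical CPU each thread last ran on
taskset -cp <PID>                           # affinity list of one mdrun process

One possibility worth ruling out: with -pin on but no explicit -pinstride,
mdrun chooses the stride automatically, and the automatic choice may not
match the offsets 0 and 10, leaving the two runs on overlapping hardware
threads. Passing -pinstride 1 pins the runs to disjoint halves of the 20
hardware threads and removes the guesswork (the flag is a real mdrun option;
that it cures this particular symptom is an assumption):

gmx mdrun -ntomp 10 -v -pin on -pinoffset 0 -pinstride 1 -gpu_id 0 -s test.tpr
gmx mdrun -ntomp 10 -v -pin on -pinoffset 10 -pinstride 1 -gpu_id 1 -s test.tpr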
Here are the log files from the two runs:
GROMACS: gmx mdrun, VERSION 5.1.3
Executable: /soft/gromacs/5.1.3_intel-thread/bin/gmx
Data prefix: /soft/gromacs/5.1.3_intel-thread
Command line:
gmx mdrun -ntomp 10 -v -g test.log -pin on -pinoffset 0 -gpu_id 0 -s test.tpr
Hardware detected:
CPU info:
Vendor: GenuineIntel
Brand: Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
SIMD instructions most likely to fit this hardware: AVX_256
SIMD instructions selected at GROMACS compile time: AVX_256
GPU info:
Number of GPUs detected: 2
#0: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC: no, stat: compatible
#1: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC: no, stat: compatible
Reading file test.tpr, VERSION 5.1.3 (single precision)
Using 1 MPI thread
Using 10 OpenMP threads
1 GPU user-selected for this run.
Mapping of GPU ID to the 1 PP rank in this node: 0
starting mdrun 'Title'
5000000 steps, 10000.0 ps.
step 80: timed with pme grid 60 60 96, coulomb cutoff 1.000: 1634.1 M-cycles
step 160: timed with pme grid 56 56 84, coulomb cutoff 1.047: 1175.4 M-cycles
GROMACS: gmx mdrun, VERSION 5.1.3
Executable: /soft/gromacs/5.1.3_intel-thread/bin/gmx
Data prefix: /soft/gromacs/5.1.3_intel-thread
Command line:
gmx mdrun -ntomp 10 -v -g test.log -pin on -pinoffset 10 -gpu_id 1 -s test.tpr
Running on 1 node with total 10 cores, 20 logical cores, 2 compatible GPUs
Hardware detected:
CPU info:
Vendor: GenuineIntel
Brand: Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
SIMD instructions most likely to fit this hardware: AVX_256
SIMD instructions selected at GROMACS compile time: AVX_256
GPU info:
Number of GPUs detected: 2
#0: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC: no, stat: compatible
#1: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC: no, stat: compatible
Reading file test.tpr, VERSION 5.1.3 (single precision)
Using 1 MPI thread
Using 10 OpenMP threads
1 GPU user-selected for this run.
Mapping of GPU ID to the 1 PP rank in this node: 1
Applying core pinning offset 10
starting mdrun 'Title'
5000000 steps, 10000.0 ps.
step 80: timed with pme grid 60 60 84, coulomb cutoff 1.000: 657.2 M-cycles
step 160: timed with pme grid 52 52 80, coulomb cutoff 1.096: 622.8 M-cycles
step 240: timed with pme grid 48 48 72, coulomb cutoff 1.187: 593.9 M-cycles
On 08/18/2016 02:13 PM, Mark Abraham wrote:
> Hi,
>
> It's a bit curious to want to run two 8-thread jobs on a machine with 10
> physical cores, because you'll get a lot of performance imbalance when
> some threads must share the same physical core, but I guess it's a free
> world. As I suggested the other day,
> http://manual.gromacs.org/documentation/2016/user-guide/mdrun-performance.html#examples-for-mdrun-on-one-node
> has some examples. The fact you've compiled and linked with an MPI library
> means it may be involving itself in the thread-affinity management, but
> whether it is doing that is something between you, it, the docs and the
> cluster admins. If you're just wanting to run on a single node, do yourself
> a favour and build the thread-MPI flavour.
>
> If so, you probably want more like
> gmx mdrun -ntomp 10 -pin on -pinoffset 0 -gpu_id 0 -s run1
> gmx mdrun -ntomp 10 -pin on -pinoffset 10 -gpu_id 1 -s run2
>
> If you want to use the MPI build, then I suggest you read up on how its
> mpirun will let you manage keeping the threads of processes where you want
> them (i.e. apart).
>
> Mark
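For reference on the thread-MPI build Mark mentions: when MPI is disabled,
GROMACS builds the thread-MPI flavour by default, so a configuration along
these lines is enough (a sketch; the GPU setting is inferred from the runs
above, and the install prefix is the one from the logs, otherwise a
placeholder):

cmake .. -DGMX_MPI=off -DGMX_GPU=on -DCMAKE_INSTALL_PREFIX=/soft/gromacs/5.1.3_intel-thread
make -j 10 && make install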
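And if the MPI build is used after all, the separation has to come from the
launcher side. With Open MPI, for example, it looks roughly like this
(a sketch; --bind-to none is the Open MPI spelling and other MPI
implementations have their own binding flags, taskset restricts each job to
one half of the logical cores, -pin off leaves affinity management to the
launcher, and gmx_mpi is the conventional binary name for an MPI build):

mpirun -np 1 --bind-to none taskset -c 0-9 gmx_mpi mdrun -ntomp 10 -pin off -gpu_id 0 -s run1
mpirun -np 1 --bind-to none taskset -c 10-19 gmx_mpi mdrun -ntomp 10 -pin off -gpu_id 1 -s run2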