[gmx-users] CPU running doesn't match command line
Albert
mailmd2011 at gmail.com
Wed Aug 17 09:07:18 CEST 2016
Hello:
Here is the information that you asked for.
gmx_mpi mdrun -s 7.tpr -v -g 7.log -c 7.gro -x 7.xtc -ntomp 8 -gpu_id 0 -pin on
------------------------------------------------------------------------
GROMACS: gmx mdrun, VERSION 5.1.3
Executable: /soft/gromacs/5.1.3_intel/bin/gmx_mpi
Data prefix: /soft/gromacs/5.1.3_intel
Command line:
gmx_mpi mdrun -s 7.tpr -v -g 7.log -c 7.gro -x 7.xtc -ntomp 8 -gpu_id 0 -pin on
GROMACS version: VERSION 5.1.3
Precision: single
Memory model: 64 bit
MPI library: MPI
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 32)
GPU support: enabled
OpenCL support: disabled
invsqrt routine: gmx_software_invsqrt(x)
SIMD instructions: AVX_256
FFT library: fftw-3.3.4-sse2
RDTSCP usage: enabled
C++11 compilation: disabled
TNG support: enabled
Tracing support: disabled
Built on: Thu Aug 11 16:15:26 CEST 2016
Built by: albert at cudaB [CMAKE]
Build OS/arch: Linux 3.16.7-35-desktop x86_64
Build CPU vendor: GenuineIntel
Build CPU brand: Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
Build CPU family: 6 Model: 62 Stepping: 4
Build CPU features: aes apic avx clfsh cmov cx8 cx16 f16c htt lahf_lm mmx
msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp sse2
sse3 sse4.1 sse4.2 ssse3 tdt x2apic
C compiler: /soft/intel/impi/5.1.3.223/bin64/mpicc GNU 4.8.3
C compiler flags: -mavx -Wextra -Wno-missing-field-initializers
-Wno-sign-compare -Wpointer-arith -Wall -Wno-unused -Wunused-value
-Wunused-parameter -O3 -DNDEBUG -funroll-all-loops
-fexcess-precision=fast -Wno-array-bounds
C++ compiler: /soft/intel/impi/5.1.3.223/bin64/mpicxx GNU 4.8.3
C++ compiler flags: -mavx -Wextra -Wno-missing-field-initializers
-Wpointer-arith -Wall -Wno-unused-function -O3 -DNDEBUG
-funroll-all-loops -fexcess-precision=fast -Wno-array-bounds
Boost version: 1.54.0 (external)
CUDA compiler: /usr/local/cuda/bin/nvcc nvcc: NVIDIA (R) Cuda compiler
driver; Copyright (c) 2005-2016 NVIDIA Corporation; Built on
Wed_May__4_21:01:56_CDT_2016; Cuda compilation tools, release 8.0, V8.0.26
CUDA compiler flags:
-gencode;arch=compute_20,code=sm_20;-gencode;arch=compute_30,code=sm_30;
-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;
-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;
-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;
-gencode;arch=compute_60,code=compute_60;-gencode;arch=compute_61,code=compute_61;
-use_fast_math;-mavx;-Wextra;-Wno-missing-field-initializers;
-Wpointer-arith;-Wall;-Wno-unused-function;-O3;-DNDEBUG;
-funroll-all-loops;-fexcess-precision=fast;-Wno-array-bounds;
CUDA driver: 8.0
CUDA runtime: 8.0
Running on 1 node with total 10 cores, 20 logical cores, 2 compatible GPUs
Hardware detected on host cudaB (the node of MPI rank 0):
CPU info:
Vendor: GenuineIntel
Brand: Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
Family: 6 model: 62 stepping: 4
CPU features: aes apic avx clfsh cmov cx8 cx16 f16c htt lahf_lm mmx msr
nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp sse2 sse3
sse4.1 sse4.2 ssse3 tdt x2apic
SIMD instructions most likely to fit this hardware: AVX_256
SIMD instructions selected at GROMACS compile time: AVX_256
GPU info:
Number of GPUs detected: 2
#0: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC: no, stat: compatible
#1: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC: no, stat: compatible
------------------------------------------------------------------------
gmx_mpi mdrun -s 7.tpr -v -g 7.log -c 7.gro -x 7.xtc -ntomp 8 -gpu_id 1 -pin on -cpi -append -pinoffset 8
------------------------------------------------------------------------
GROMACS: gmx mdrun, VERSION 5.1.3
Executable: /soft/gromacs/5.1.3_intel/bin/gmx_mpi
Data prefix: /soft/gromacs/5.1.3_intel
Command line:
gmx_mpi mdrun -s 7.tpr -v -g 7.log -c 7.gro -x 7.xtc -ntomp 8 -gpu_id 1 -pin on -cpi -append -pinoffset 8
Running on 1 node with total 10 cores, 20 logical cores, 2 compatible GPUs
Hardware detected on host cudaB (the node of MPI rank 0):
CPU info:
Vendor: GenuineIntel
Brand: Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
SIMD instructions most likely to fit this hardware: AVX_256
SIMD instructions selected at GROMACS compile time: AVX_256
GPU info:
Number of GPUs detected: 2
#0: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC: no, stat: compatible
#1: NVIDIA GeForce GTX 780 Ti, compute cap.: 3.5, ECC: no, stat: compatible
Reading file 7.tpr, VERSION 5.1.3 (single precision)
Reading checkpoint file state.cpt generated: Wed Aug 17 09:01:46 2016
Using 1 MPI process
Using 8 OpenMP threads
1 GPU user-selected for this run.
Mapping of GPU ID to the 1 PP rank in this node: 1
Applying core pinning offset 8
starting mdrun 'Title'
50000000 steps, 100000.0 ps (continuing from step 5746000, 11492.0 ps).
step 5746080: timed with pme grid 60 60 84, coulomb cutoff 1.000: 2451.9 M-cycles
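
For reference, a layout that keeps two 8-thread jobs on disjoint hardware
threads might look like the following (a sketch only, with the output-file
options elided; -pinoffset and -pinstride count hardware threads, so on this
10-core/20-thread node an explicit stride of 1 keeps the two ranges apart):

gmx_mpi mdrun ... -ntomp 8 -gpu_id 0 -pin on -pinoffset 0 -pinstride 1
gmx_mpi mdrun ... -ntomp 8 -gpu_id 1 -pin on -pinoffset 8 -pinstride 1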
On 08/16/2016 05:27 PM, Szilárd Páll wrote:
> Most of that copy-pasted info is not what I asked for and overall not
> very useful. You have still not shown any log files (or details on the
> hardware). Share the *relevant* stuff, please!
> --
> Szilárd
>
>
> On Tue, Aug 16, 2016 at 5:07 PM, Albert <mailmd2011 at gmail.com> wrote:
>> Hello:
>>
>> Here is my MDP file:
>>
>> define = -DREST_ON -DSTEP6_4
>> integrator = md
>> dt = 0.002
>> nsteps = 1000000
>> nstlog = 1000
>> nstxout = 0
>> nstvout = 0
>> nstfout = 0
>> nstcalcenergy = 100
>> nstenergy = 1000
>> nstxout-compressed = 10000
>> ;
>> cutoff-scheme = Verlet
>> nstlist = 20
>> rlist = 1.0
>> coulombtype = pme
>> rcoulomb = 1.0
>> vdwtype = Cut-off
>> vdw-modifier = Force-switch
>> rvdw_switch = 0.9
>> rvdw = 1.0
>> ;
>> tcoupl = berendsen
>> tc_grps = PROT MEMB SOL_ION
>> tau_t = 1.0 1.0 1.0
>> ref_t = 310 310 310
>> ;
>> pcoupl = berendsen
>> pcoupltype = semiisotropic
>> tau_p = 5.0
>> compressibility = 4.5e-5 4.5e-5
>> ref_p = 1.0 1.0
>> ;
>> constraints = h-bonds
>> constraint_algorithm = LINCS
>> continuation = yes
>> ;
>> nstcomm = 100
>> comm_mode = linear
>> comm_grps = PROT MEMB SOL_ION
>> ;
>> refcoord_scaling = com
>>
>>
>> I compiled Gromacs with the following settings, using Intel MPI:
>>
>> env CC=mpicc CXX=mpicxx F77=mpif90 FC=mpif90 LDF90=mpif90
>> CMAKE_PREFIX_PATH=/soft/gromacs/fftw-3.3.4:/soft/intel/impi/5.1.3.223 cmake
>> .. -DBUILD_SHARED_LIBS=OFF -DBUILD_TESTING=OFF
>> -DCMAKE_INSTALL_PREFIX=/soft/gromacs/5.1.3_intel -DGMX_MPI=ON -DGMX_GPU=ON
>> -DGMX_PREFER_STATIC_LIBS=ON -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda
>>
>>
>> I tried it again with one of the jobs, using the options:
>>
>> -ntomp 8 -pin on -pinoffset 8
>>
>>
>> The two submitted jobs still use only 8 CPU cores in total, and the speed
>> is extremely slow (10 ns/day). When I remove the "-pin on" option from one
>> of the jobs, it speeds up a lot (32 ns/day) and 16 CPU cores are used. If I
>> submit only one job with "-pin on", I get 52 ns/day.
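>>
>> To check which hardware threads each running job actually got, the
>> affinity of every mdrun thread can be queried (a sketch, assuming the
>> standard pgrep and taskset utilities; -a lists all threads of a process):
>>
>> for p in $(pgrep gmx_mpi); do taskset -acp $p; done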
>>
>>
>> thx a lot
>>
>>
>> On 08/16/2016 04:59 PM, Szilárd Páll wrote:
>>> Hi,
>>>
>>> Without logs and hardware configs, it's hard to tell what's happening.
>>>
>>> By turning off pinning, the OS is free to move threads around and it
>>> will try to ensure the cores are utilized. However, by leaving threads
>>> unpinned you risk taking a significant performance hit, so I'd
>>> recommend running with correct pinning settings.
>>>
>>> If you start with "-ntomp 8 -pin on -pinoffset 8" (and you indeed have
>>> 16 cores, no HT), you should be able to see in htop that the first eight
>>> cores are empty while the last eight are occupied.
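>>>
>>> Per-core load can also be sampled from a plain terminal if htop is not
>>> handy (a sketch, assuming the sysstat package's mpstat is installed):
>>>
>>> mpstat -P ALL 1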
>>>
>>> Cheers,
>>> --
>>> Szilárd
>>