[gmx-users] 2019.2 not using all available cores

Dallas Warren dallas.warren at monash.edu
Thu Aug 22 05:23:18 CEST 2019


I've found an option that gets 2019.2 to use all of the cores
correctly.

Use "-pin on" and it works as expected, using all 12 cores, CPU load being
show as appropriate (gets up to 68% total CPU utilisation)

Use "-pin auto", which is the default, or "-pin off" and it will only use a
single core (maximum is 8% total CPU utilisation).
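
For reference, the command that behaves correctly is of this form (a
minimal sketch; the -deffnm name is just a placeholder for my run files):

  gmx mdrun -deffnm md -pin on

Swapping "-pin on" for "-pin auto" or "-pin off" in the otherwise
identical command drops it back to a single core.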

Catch ya,

Dr. Dallas Warren
Drug Delivery, Disposition and Dynamics
Monash Institute of Pharmaceutical Sciences, Monash University
381 Royal Parade, Parkville VIC 3052
dallas.warren at monash.edu
---------------------------------
When the only tool you own is a hammer, every problem begins to resemble a
nail.


On Thu, 9 May 2019 at 07:54, Dallas Warren <dallas.warren at monash.edu> wrote:

> gmx 2019.2 compiled using threads only uses a single core; mdrun_mpi
> compiled using MPI only uses a single core; gmx 2016.3 using threads
> uses all 12 cores.
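>
> For reference, the runs being compared were launched along these lines
> (a sketch only; the -deffnm name is a placeholder for this 12-core machine):
>
>   gmx mdrun -deffnm md -nt 12           # thread builds (2019.2 and 2016.3)
>   mpirun -np 12 mdrun_mpi -deffnm md    # MPI build of 2019.2
>
> Only the 2016.3 thread build loads all 12 cores.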
>
> For compiling the thread version of 2019.2, I used:
> cmake .. -DGMX_GPU=ON
> -DCMAKE_INSTALL_PREFIX=/usr/local/gromacs/gromacs-2019.2
>
> For compiling the MPI version of 2019.2, I used:
> cmake .. -DGMX_MPI=ON -DBUILD_SHARED_LIBS=OFF -DGMX_GPU=ON
> -DCMAKE_CXX_COMPILER=/usr/lib64/mpi/gcc/openmpi/bin/mpiCC
> -DCMAKE_C_COMPILER=/usr/lib64/mpi/gcc/openmpi/bin/mpicc
> -DGMX_BUILD_MDRUN_ONLY=ON
> -DCMAKE_INSTALL_PREFIX=/usr/local/gromacs/gromacs-2019.2
>
> Between building both of those, I deleted the build directory.
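>
> The sequence for each build was of this general form (a sketch, assuming
> an out-of-source build directory named "build" and the usual
> make / make install steps, with -j matching the core count):
>
>   mkdir build && cd build
>   cmake .. [options as above]
>   make -j 12 && make install
>   cd .. && rm -rf build    # cleared out before configuring the other build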
>
> ####################################################
> GROMACS:      gmx, version 2019.2
> Executable:   /usr/local/gromacs/gromacs-2019.2/bin/gmx
> Data prefix:  /usr/local/gromacs/gromacs-2019.2
> Working dir:  /home/dallas/experiments/current/19-064/P6DLO
> Command line:
>   gmx -version
>
> GROMACS version:    2019.2
> Precision:          single
> Memory model:       64 bit
> MPI library:        thread_mpi
> OpenMP support:     enabled (GMX_OPENMP_MAX_THREADS = 64)
> GPU support:        CUDA
> SIMD instructions:  AVX_256
> FFT library:        fftw-3.3.8-sse2
> RDTSCP usage:       enabled
> TNG support:        enabled
> Hwloc support:      disabled
> Tracing support:    disabled
> C compiler:         /usr/bin/cc GNU 7.4.0
> C compiler flags:    -mavx     -O3 -DNDEBUG -funroll-all-loops
> -fexcess-precision=fast
> C++ compiler:       /usr/bin/c++ GNU 7.4.0
> C++ compiler flags:  -mavx    -std=c++11   -O3 -DNDEBUG
> -funroll-all-loops -fexcess-precision=fast
> CUDA compiler:      /usr/local/cuda/bin/nvcc nvcc: NVIDIA (R) Cuda
> compiler driver;Copyright (c) 2005-2019 NVIDIA Corporation;Built on
> Fri_Feb__8_19:08:17_PST_2019;Cuda compilation tools, release 10.1,
> V10.1.105
> CUDA compiler
> flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=compute_75;-use_fast_math;-D_FORCE_INLINES;;
> ;-mavx;-std=c++11;-O3;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;
> CUDA driver:        10.10
> CUDA runtime:       10.10
>
> ####################################################
> GROMACS:      mdrun_mpi, version 2019.2
> Executable:   /usr/local/gromacs/gromacs-2019.2/bin/mdrun_mpi
> Data prefix:  /usr/local/gromacs/gromacs-2019.2
> Working dir:  /home/dallas/experiments/current/19-064/P6DLO
> Command line:
>   mdrun_mpi -version
>
> GROMACS version:    2019.2
> Precision:          single
> Memory model:       64 bit
> MPI library:        MPI
> OpenMP support:     enabled (GMX_OPENMP_MAX_THREADS = 64)
> GPU support:        CUDA
> SIMD instructions:  AVX_256
> FFT library:        fftw-3.3.8-sse2
> RDTSCP usage:       enabled
> TNG support:        enabled
> Hwloc support:      disabled
> Tracing support:    disabled
> C compiler:         /usr/lib64/mpi/gcc/openmpi/bin/mpicc GNU 7.4.0
> C compiler flags:    -mavx     -O3 -DNDEBUG -funroll-all-loops
> -fexcess-precision=fast
> C++ compiler:       /usr/lib64/mpi/gcc/openmpi/bin/mpiCC GNU 7.4.0
> C++ compiler flags:  -mavx    -std=c++11   -O3 -DNDEBUG
> -funroll-all-loops -fexcess-precision=fast
> CUDA compiler:      /usr/local/cuda/bin/nvcc nvcc: NVIDIA (R) Cuda
> compiler driver;Copyright (c) 2005-2019 NVIDIA Corporation;Built on
> Fri_Feb__8_19:08:17_PST_2019;Cuda compilation tools, release 10.1,
> V10.1.105
> CUDA compiler
> flags:-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=compute_75;-use_fast_math;-D_FORCE_INLINES;;
> ;-mavx;-std=c++11;-O3;-DNDEBUG;-funroll-all-loops;-fexcess-precision=fast;
> CUDA driver:        10.10
> CUDA runtime:       10.10
>
> ####################################################
> /usr/local/gromacs/gromacs-2016.3/bin/gmx -version
>
> GROMACS:      gmx, version 2016.3
> Executable:   /usr/local/gromacs/gromacs-2016.3/bin/gmx
> Data prefix:  /usr/local/gromacs/gromacs-2016.3
> Working dir:  /home/dallas/experiments/current/19-064/P6DLO
> Command line:
>   gmx -version
>
> GROMACS version:    2016.3
> Precision:          single
> Memory model:       64 bit
> MPI library:        thread_mpi
> OpenMP support:     enabled (GMX_OPENMP_MAX_THREADS = 32)
> GPU support:        CUDA
> SIMD instructions:  AVX_256
> FFT library:        fftw-3.3.8-sse2
> RDTSCP usage:       enabled
> TNG support:        enabled
> Hwloc support:      disabled
> Tracing support:    disabled
> Built on:           Tue Mar 21 13:21:15 AEDT 2017
> Built by:           dallas at morph [CMAKE]
> Build OS/arch:      Linux 4.4.49-16-default x86_64
> Build CPU vendor:   Intel
> Build CPU brand:    Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz
> Build CPU family:   6   Model: 45   Stepping: 7
> Build CPU features: aes apic avx clfsh cmov cx8 cx16 htt lahf mmx msr
> nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdtscp sse2 sse3
> sse4.1 sse4.2 ssse3 tdt x2apic
> C compiler:         /usr/bin/cc GNU 4.8.5
> C compiler flags:    -mavx     -O3 -DNDEBUG -funroll-all-loops
> -fexcess-precision=fast
> C++ compiler:       /usr/bin/c++ GNU 4.8.5
> C++ compiler flags:  -mavx    -std=c++0x   -O3 -DNDEBUG
> -funroll-all-loops -fexcess-precision=fast
> CUDA compiler:      /usr/local/cuda-8.0/bin/nvcc nvcc: NVIDIA (R) Cuda
> compiler driver;Copyright (c) 2005-2016 NVIDIA Corporation;Built on
> Tue_Jan_10_13:22:03_CST_2017;Cuda compilation tools, release 8.0,
> V8.0.61
> CUDA compiler
> flags:-gencode;arch=compute_20,code=sm_20;-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_60,code=compute_60;-gencode;arch=compute_61,code=compute_61;-use_fast_math;;;-Xcompiler;,-mavx,,,,,,;-Xcompiler;-O3,-DNDEBUG,-funroll-all-loops,-fexcess-precision=fast,,;
> CUDA driver:        10.10
> CUDA runtime:       8.0
>
> Catch ya,
>
> Dr. Dallas Warren
> Drug Delivery, Disposition and Dynamics
> Monash Institute of Pharmaceutical Sciences, Monash University
> 381 Royal Parade, Parkville VIC 3052
> dallas.warren at monash.edu
> ---------------------------------
> When the only tool you own is a hammer, every problem begins to resemble a
> nail.
>

