[gmx-users] Domain decomposition

Alexander Alexander alexanderwien2k at gmail.com
Tue Jul 26 11:46:33 CEST 2016


Dear GROMACS users,

I have been struggling for more than a week with a fatal error caused by
domain decomposition, and I have not yet managed to fix it. It is all the
more painful because I have to test different numbers of CPUs to see which
one works, on a cluster with long queuing times: a job waits two or three
days in the queue, only to hit the same fatal error within two minutes.

The dimensions of the cell are "3.53633,   4.17674,   4.99285". Below is
the log file of my test, submitted on 2 nodes with 128 cores in total. I
also reduced the count to 32 CPUs and even switched from "gmx_mpi mdrun" to
"gmx mdrun", but the problem persists.

Please do not refer me to this link (
http://www.gromacs.org/Documentation/Errors#There_is_no_domain_decomposition_for_n_nodes_that_is_compatible_with_the_given_box_and_a_minimum_cell_size_of_x_nm
),
as I know what the problem is but cannot solve it:


Thanks,

Regards,
Alex



Log file opened on Fri Jul 22 00:55:56 2016
Host: node074  pid: 12281  rank ID: 0  number of ranks:  64

GROMACS:      gmx mdrun, VERSION 5.1.2
Executable:
/home/fb_chem/chemsoft/lx24-amd64/gromacs-5.1.2-mpi/bin/gmx_mpi
Data prefix:  /home/fb_chem/chemsoft/lx24-amd64/gromacs-5.1.2-mpi
Command line:
  gmx_mpi mdrun -ntomp 1 -deffnm min1.6 -s min1.6

GROMACS version:    VERSION 5.1.2
Precision:          single
Memory model:       64 bit
MPI library:        MPI
OpenMP support:     enabled (GMX_OPENMP_MAX_THREADS = 32)
GPU support:        disabled
OpenCL support:     disabled
invsqrt routine:    gmx_software_invsqrt(x)
SIMD instructions:  AVX_128_FMA
FFT library:        fftw-3.2.1
RDTSCP usage:       enabled
C++11 compilation:  disabled
TNG support:        enabled
Tracing support:    disabled
Built on:           Thu Jun 23 14:17:43 CEST 2016
Built by:           reuter at marc2-h2 [CMAKE]
Build OS/arch:      Linux 2.6.32-642.el6.x86_64 x86_64
Build CPU vendor:   AuthenticAMD
Build CPU brand:    AMD Opteron(TM) Processor 6276
Build CPU family:   21   Model: 1   Stepping: 2
Build CPU features: aes apic avx clfsh cmov cx8 cx16 fma4 htt lahf_lm
misalignsse mmx msr nonstop_tsc pclmuldq pdpe1gb popcnt pse rdtscp sse2
sse3 sse4a sse4.1 sse4.2 ssse3 xop
C compiler:         /usr/lib64/ccache/cc GNU 4.4.7
C compiler flags:    -mavx -mfma4 -mxop    -Wundef -Wextra
-Wno-missing-field-initializers -Wno-sign-compare -Wpointer-arith -Wall
-Wno-unused -Wunused-value -Wunused-parameter  -O3 -DNDEBUG
-funroll-all-loops  -Wno-array-bounds

C++ compiler:       /usr/lib64/ccache/c++ GNU 4.4.7
C++ compiler flags:  -mavx -mfma4 -mxop    -Wundef -Wextra
-Wno-missing-field-initializers -Wpointer-arith -Wall -Wno-unused-function
-O3 -DNDEBUG -funroll-all-loops  -Wno-array-bounds
Boost version:      1.55.0 (internal)


Running on 2 nodes with total 128 cores, 128 logical cores
  Cores per node:           64
  Logical cores per node:   64
Hardware detected on host node074 (the node of MPI rank 0):
  CPU info:
    Vendor: AuthenticAMD
    Brand:  AMD Opteron(TM) Processor 6276
    Family: 21  model:  1  stepping:  2
    CPU features: aes apic avx clfsh cmov cx8 cx16 fma4 htt lahf_lm
misalignsse mmx msr nonstop_tsc pclmuldq pdpe1gb popcnt pse rdtscp sse2
sse3 sse4a sse4.1 sse4.2 ssse3 xop
    SIMD instructions most likely to fit this hardware: AVX_128_FMA
    SIMD instructions selected at GROMACS compile time: AVX_128_FMA
Initializing Domain Decomposition on 64 ranks
Dynamic load balancing: off
Will sort the charge groups at every domain (re)decomposition
Initial maximum inter charge-group distances:
    two-body bonded interactions: 3.196 nm, LJC Pairs NB, atoms 24 28
  multi-body bonded interactions: 0.397 nm, Ryckaert-Bell., atoms 5 13
Minimum cell size due to bonded interactions: 3.516 nm
Maximum distance for 5 constraints, at 120 deg. angles, all-trans: 0.218 nm
Estimated maximum distance required for P-LINCS: 0.218 nm
Guess for relative PME load: 0.19
Will use 48 particle-particle and 16 PME only ranks
This is a guess, check the performance at the end of the log file
Using 16 separate PME ranks, as guessed by mdrun
Optimizing the DD grid for 48 cells with a minimum initial size of 3.516 nm
The maximum allowed number of cells is: X 1 Y 1 Z 1

-------------------------------------------------------
Program gmx mdrun, VERSION 5.1.2
Source code file: /home/alex/gromacs-5.1.2/src/gromacs/domdec/domdec.cpp,
line: 6987

Fatal error:
There is no domain decomposition for 48 ranks that is compatible with the
given box and a minimum cell size of 3.51565 nm
Change the number of ranks or mdrun option -rdd
Look in the log file for details on the domain decomposition
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
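
The arithmetic behind the "X 1 Y 1 Z 1" line in the log above can be checked
with a short sketch. It is a simplification, not the actual mdrun logic: it
assumes a rectangular box, no dynamic load balancing (the log reports it as
off), and that the number of cells along each dimension is the box length
divided by the minimum cell size, rounded down. The function names are
hypothetical. The 1.1 margin is inferred from the numbers in the log
(1.1 x 3.196 nm is about 3.516 nm), not taken from the source code.

```python
import math


def min_cell_size(max_bonded_distance_nm, margin=1.1):
    """Minimum cell size from the largest bonded distance.

    The 10% margin is an assumption inferred from the log values.
    """
    return margin * max_bonded_distance_nm


def max_cells_per_dimension(box_nm, min_cell_nm):
    """Simplified estimate: how many cells of at least min_cell_nm fit
    along each box edge (never fewer than one)."""
    return tuple(max(1, math.floor(edge / min_cell_nm)) for edge in box_nm)


# Values copied from the post and the log file.
box = (3.53633, 4.17674, 4.99285)   # cell dimensions given in the post
two_body_distance = 3.196           # nm, "LJC Pairs NB, atoms 24 28"
reported_min_cell = 3.51565         # nm, from the fatal error message

estimated_min_cell = min_cell_size(two_body_distance)
cells = max_cells_per_dimension(box, reported_min_cell)

print(f"estimated minimum cell size: {estimated_min_cell:.3f} nm")
print(f"maximum cells per dimension (X, Y, Z): {cells}")
print(f"maximum particle-particle ranks: {math.prod(cells)}")
```

Under these assumptions every box edge is shorter than twice the minimum cell
size, so only one cell fits per dimension, and a grid for 48
particle-particle ranks cannot be built. The limiting quantity is the
3.196 nm two-body bonded distance between atoms 24 and 28, not the number of
cores requested.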

