[gmx-users] Domain decomposition
Alexander Alexander
alexanderwien2k at gmail.com
Tue Jul 26 11:46:33 CEST 2016
Dear GROMACS users,
For more than a week now I have been struggling with a fatal error due to
domain decomposition, and I have not succeeded yet. It is all the more
painful because I have to test different numbers of CPUs to see which one
works on a cluster with a long queue: I wait two or three days in the queue
only to see the same fatal error again after two minutes.
The dimensions of the cell are "3.53633, 4.17674, 4.99285" (nm),
and below is the log file of my test submitted on 2 nodes with 128 cores
in total. I even reduced the count to 32 CPUs and changed from "gmx_mpi
mdrun" to "gmx mdrun", but the problem persists.
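The arithmetic behind the error is easy to check: domain decomposition requires each cell to be at least the minimum cell size in every box dimension, so the number of particle-particle ranks cannot exceed the product of floor(box_dim / min_cell_size) over the three dimensions. A minimal sketch, using the box and minimum cell size reported in the log below:

```python
import math

# Box dimensions (nm) and minimum DD cell size (nm), from the log below.
box = (3.53633, 4.17674, 4.99285)
min_cell = 3.51565

# Each dimension can hold at most floor(dim / min_cell) DD cells.
max_cells = [math.floor(d / min_cell) for d in box]

# The product bounds the number of particle-particle ranks.
max_pp_ranks = max_cells[0] * max_cells[1] * max_cells[2]

print(max_cells)     # -> [1, 1, 1], matching "X 1 Y 1 Z 1" in the log
print(max_pp_ranks)  # -> 1, so 48 PP ranks cannot fit this box
```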
Please do not refer me to this link (
http://www.gromacs.org/Documentation/Errors#There_is_no_domain_decomposition_for_n_nodes_that_is_compatible_with_the_given_box_and_a_minimum_cell_size_of_x_nm
)
as I know what the problem is but I cannot solve it:
Thanks,
Regards,
Alex
Log file opened on Fri Jul 22 00:55:56 2016
Host: node074 pid: 12281 rank ID: 0 number of ranks: 64
GROMACS: gmx mdrun, VERSION 5.1.2
Executable:
/home/fb_chem/chemsoft/lx24-amd64/gromacs-5.1.2-mpi/bin/gmx_mpi
Data prefix: /home/fb_chem/chemsoft/lx24-amd64/gromacs-5.1.2-mpi
Command line:
gmx_mpi mdrun -ntomp 1 -deffnm min1.6 -s min1.6
GROMACS version: VERSION 5.1.2
Precision: single
Memory model: 64 bit
MPI library: MPI
OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 32)
GPU support: disabled
OpenCL support: disabled
invsqrt routine: gmx_software_invsqrt(x)
SIMD instructions: AVX_128_FMA
FFT library: fftw-3.2.1
RDTSCP usage: enabled
C++11 compilation: disabled
TNG support: enabled
Tracing support: disabled
Built on: Thu Jun 23 14:17:43 CEST 2016
Built by: reuter at marc2-h2 [CMAKE]
Build OS/arch: Linux 2.6.32-642.el6.x86_64 x86_64
Build CPU vendor: AuthenticAMD
Build CPU brand: AMD Opteron(TM) Processor 6276
Build CPU family: 21 Model: 1 Stepping: 2
Build CPU features: aes apic avx clfsh cmov cx8 cx16 fma4 htt lahf_lm
misalignsse mmx msr nonstop_tsc pclmuldq pdpe1gb popcnt pse rdtscp sse2
sse3 sse4a sse4.1 sse4.2 ssse3 xop
C compiler: /usr/lib64/ccache/cc GNU 4.4.7
C compiler flags: -mavx -mfma4 -mxop -Wundef -Wextra
-Wno-missing-field-initializers -Wno-sign-compare -Wpointer-arith -Wall
-Wno-unused -Wunused-value -Wunused-parameter -O3 -DNDEBUG
-funroll-all-loops -Wno-array-bounds
C++ compiler: /usr/lib64/ccache/c++ GNU 4.4.7
C++ compiler flags: -mavx -mfma4 -mxop -Wundef -Wextra
-Wno-missing-field-initializers -Wpointer-arith -Wall -Wno-unused-function
-O3 -DNDEBUG -funroll-all-loops -Wno-array-bounds
Boost version: 1.55.0 (internal)
Running on 2 nodes with total 128 cores, 128 logical cores
Cores per node: 64
Logical cores per node: 64
Hardware detected on host node074 (the node of MPI rank 0):
CPU info:
Vendor: AuthenticAMD
Brand: AMD Opteron(TM) Processor 6276
Family: 21 model: 1 stepping: 2
CPU features: aes apic avx clfsh cmov cx8 cx16 fma4 htt lahf_lm
misalignsse mmx msr nonstop_tsc pclmuldq pdpe1gb popcnt pse rdtscp sse2
sse3 sse4a sse4.1 sse4.2 ssse3 xop
SIMD instructions most likely to fit this hardware: AVX_128_FMA
SIMD instructions selected at GROMACS compile time: AVX_128_FMA
Initializing Domain Decomposition on 64 ranks
Dynamic load balancing: off
Will sort the charge groups at every domain (re)decomposition
Initial maximum inter charge-group distances:
two-body bonded interactions: 3.196 nm, LJC Pairs NB, atoms 24 28
multi-body bonded interactions: 0.397 nm, Ryckaert-Bell., atoms 5 13
Minimum cell size due to bonded interactions: 3.516 nm
Maximum distance for 5 constraints, at 120 deg. angles, all-trans: 0.218 nm
Estimated maximum distance required for P-LINCS: 0.218 nm
Guess for relative PME load: 0.19
Will use 48 particle-particle and 16 PME only ranks
This is a guess, check the performance at the end of the log file
Using 16 separate PME ranks, as guessed by mdrun
Optimizing the DD grid for 48 cells with a minimum initial size of 3.516 nm
The maximum allowed number of cells is: X 1 Y 1 Z 1
-------------------------------------------------------
Program gmx mdrun, VERSION 5.1.2
Source code file: /home/alex/gromacs-5.1.2/src/gromacs/domdec/domdec.cpp,
line: 6987
Fatal error:
There is no domain decomposition for 48 ranks that is compatible with the
given box and a minimum cell size of 3.51565 nm
Change the number of ranks or mdrun option -rdd
Look in the log file for details on the domain decomposition
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------
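For what it is worth, the numbers in the error are consistent with the 3.196 nm two-body bonded distance times a roughly 1.1 safety margin (3.196 × 1.1 ≈ 3.516 nm; the exact factor is an inference from this log, not a documented constant). A small sketch of how the minimum cell size caps the usable rank count:

```python
import math

# Box dimensions (nm) from the log above.
box = (3.53633, 4.17674, 4.99285)

def max_pp_ranks(min_cell):
    """Upper bound on particle-particle ranks: product over the three
    dimensions of floor(box_dim / min_cell)."""
    return math.prod(math.floor(d / min_cell) for d in box)

# Minimum cell size driven by the 3.196 nm two-body interaction,
# times an apparent ~1.1 margin (inferred: 3.196 * 1.1 ~= 3.516 nm).
print(max_pp_ranks(3.196 * 1.1))  # -> 1, so 48 PP ranks cannot work

# If that long pair did not constrain DD, the P-LINCS estimate from
# the log (0.218 nm) would dominate and many ranks would fit.
print(max_pp_ranks(0.218))
```

In other words, no decomposition beyond a single PP cell is possible for this box as long as the 3.196 nm pair interaction sets the minimum cell size; only removing that interaction, enlarging the box, or overriding -rdd (at one's own risk) changes the bound.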