[gmx-users] domain decomposition error >60 ns into simulation on a specific machine
Mala L Radhakrishnan
mradhakr at wellesley.edu
Thu Feb 14 22:03:13 CET 2019
Hi Mark,
To my knowledge, she's not using CHARMM-related FF's at all -- I think she
is using Amber03 (Alyssa, correct me if I'm wrong). Visually and RMSD-wise
the trajectory looks totally normal, but is there something specific I
should be looking for in the trajectory, either visually or quantitatively?
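
One quantitative check that could be tried, sketched here under stated
assumptions: since the error complains that an atom moved too far, scan the
saved frames for the largest per-atom displacement between consecutive frames
and see which atom it belongs to. The helper names max_displacements and
flag_jumps are hypothetical, the box is assumed to be rectangular, and the
coordinates are assumed to be already loaded as lists of (x, y, z) tuples in
nm by whatever trajectory reader is at hand. Saved frames are spaced far more
widely than domain decomposition steps, so this is only a coarse screen and
not a reproduction of the check mdrun performs.

```python
def max_displacements(frames, box):
    """For each consecutive pair of frames, return a tuple
    (frame_index, atom_index, distance) for the atom that moved farthest.

    frames: list of frames, each a list of (x, y, z) tuples.
    box:    (lx, ly, lz) edge lengths of an assumed rectangular box.
    """
    results = []
    for k in range(1, len(frames)):
        prev, curr = frames[k - 1], frames[k]
        best_atom, best_d2 = -1, -1.0
        for i, (p, c) in enumerate(zip(prev, curr)):
            d2 = 0.0
            for a in range(3):
                d = c[a] - p[a]
                # Minimum-image wrap so a periodic-boundary crossing
                # is not mistaken for a large jump.
                d -= box[a] * round(d / box[a])
                d2 += d * d
            if d2 > best_d2:
                best_atom, best_d2 = i, d2
        results.append((k, best_atom, best_d2 ** 0.5))
    return results


def flag_jumps(frames, box, threshold):
    """Keep only the frame pairs whose largest displacement exceeds threshold."""
    return [r for r in max_displacements(frames, box) if r[2] > threshold]
```

If the same atom or residue keeps appearing in the flagged frames shortly
before the crash, that would be the place to look in the trajectory; if
nothing stands out, that would be consistent with the trajectory looking
normal up to the failure.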
Thanks,
Mala
On Thu, Feb 14, 2019 at 3:35 PM Mark Abraham <mark.j.abraham at gmail.com>
wrote:
> Hi,
>
> What does the trajectory look like before it crashes?
>
> We did recently fix a bug relevant to simulations using CHARMM switching
> functions on GPUs, if that could be an explanation. We will probably put
> out a new 2018 version with that fix next week (or so).
>
> Mark
>
> On Thu., 14 Feb. 2019, 20:26 Mala L Radhakrishnan, <mradhakr at wellesley.edu>
> wrote:
>
> > Hi all,
> >
> > My student is trying to do a fairly straightforward MD simulation -- a
> > protein complex in water with ions with *no* pull coordinate. It's on an
> > NVidia GPU-based machine and we're running gromacs 2018.3.
> >
> > About 65 ns into the simulation, it dies with:
> >
> > "an atom moved too far between two domain decomposition steps. This usually
> > means that your system is not well equilibrated"
> >
> > If we restart at, say, 2 ns before it died, it then runs fine, PAST where
> > it died before, for another ~63 ns or so, and then dies with the same
> > error. We have had far larger and arguably more complex gromacs jobs run
> > fine on this same machine.
> >
> > Even stranger, when we run the same, problematic job on a different NVidia
> > GPU-based machine with slightly older CPUs that's running Gromacs 2016.4,
> > it runs fine (it's currently at 200 ns).
> >
> > Below are the Gromacs hardware and compilation specs of the machine on
> > which it died, in case that helps anyone. There is a note at the end of
> > this logfile output that might be useful. Thanks in advance for any
> > ideas.
> > -----------------------------------------
> >
> > GROMACS version: 2018.3
> > Precision: single
> > Memory model: 64 bit
> > MPI library: thread_mpi
> > OpenMP support: enabled (GMX_OPENMP_MAX_THREADS = 64)
> > GPU support: CUDA
> > SIMD instructions: AVX2_256
> > FFT library: fftw-3.3.8-sse2-avx-avx2-avx2_128
> > RDTSCP usage: enabled
> > TNG support: enabled
> > Hwloc support: disabled
> > Tracing support: disabled
> > Built on: 2018-10-31 22:05:13
> > Build OS/arch: Linux 3.10.0-693.21.1.el7.x86_64 x86_64
> > Build CPU vendor: Intel
> > Build CPU brand: Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz
> > Build CPU family: 6 Model: 85 Stepping: 4
> > Build CPU features: aes apic avx avx2 avx512f avx512cd avx512bw avx512vl
> > clfsh cmov cx8 cx16 f16c fma hle htt intel lahf mmx msr nonstop_tsc pcid
> > pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp rtm sse2 sse3 sse4.1 sse4.2
> > ssse3 tdt x2apic
> > C compiler: /usr/bin/cc GNU 4.8.5
> > C compiler flags: -march=core-avx2 -O3 -DNDEBUG -funroll-all-loops
> > -fexcess-precision=fast
> > C++ compiler: /usr/bin/c++ GNU 4.8.5
> > C++ compiler flags: -march=core-avx2 -std=c++11 -O3 -DNDEBUG
> > -funroll-all-loops -fexcess-precision=fast
> > CUDA compiler: /usr/local/cuda/bin/nvcc nvcc: NVIDIA (R) Cuda compiler
> > driver;Copyright (c) 2005-2018 NVIDIA Corporation;Built on
> > Sat_Aug_25_21:08:01_CDT_2018;Cuda compilation tools, release 10.0, V10.0.130
> > CUDA compiler flags:-gencode;arch=compute_30,code=sm_30;
> > -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_37,code=sm_37;
> > -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_52,code=sm_52;
> > -gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;
> > -gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_70,code=compute_70;
> > -use_fast_math;;;;-march=core-avx2;-std=c++11;-O3;-DNDEBUG;
> > -funroll-all-loops;-fexcess-precision=fast;
> > CUDA driver: 10.0
> > CUDA runtime: 10.0
> > Running on 1 node with total 20 cores, 40 logical cores, 4 compatible GPUs
> > Hardware detected:
> > CPU info:
> > Vendor: Intel
> > Brand: Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz
> > Family: 6 Model: 85 Stepping: 4
> > Features: aes apic avx avx2 avx512f avx512cd avx512bw avx512vl clfsh
> > cmov cx8 cx16 f16c fma hle htt intel lahf mmx msr
> > nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdrnd rdtscp rtm sse2
> > sse3 sse4.1 sse4.2 ssse3 tdt x2apic
> > Number of AVX-512 FMA units: Cannot run AVX-512 detection - assuming 2
> > Hardware topology: Basic
> > Sockets, cores, and logical processors:
> > Socket 0: [ 0 20] [ 1 21] [ 2 22] [ 3 23] [ 4 24] [ 5 25] [ 6 26] [ 7 27] [ 8 28] [ 9 29]
> > Socket 1: [ 10 30] [ 11 31] [ 12 32] [ 13 33] [ 14 34] [ 15 35] [ 16 36] [ 17 37] [ 18 38] [ 19 39]
> > GPU info:
> > Number of GPUs detected: 4
> > #0: NVIDIA GeForce GTX 1080 Ti, compute cap.: 6.1, ECC: no, stat: compatible
> > #1: NVIDIA GeForce GTX 1080 Ti, compute cap.: 6.1, ECC: no, stat: compatible
> > #2: NVIDIA GeForce GTX 1080 Ti, compute cap.: 6.1, ECC: no, stat: compatible
> > #3: NVIDIA GeForce GTX 1080 Ti, compute cap.: 6.1, ECC: no, stat: compatible
> >
> > Highest SIMD level requested by all nodes in run: AVX_512
> > SIMD instructions selected at compile time: AVX2_256
> > This program was compiled for different hardware than you are running on,
> > which could influence performance. This build might have been configured on a
> > login node with only a single AVX-512 FMA unit (in which case AVX2 is faster),
> > while the node you are running on has dual AVX-512 FMA units.
> >
> >
> >
> > --
> > Mala L. Radhakrishnan
> > Whitehead Associate Professor of Critical Thought
> > Associate Professor of Chemistry
> > Wellesley College
> > 106 Central Street
> > Wellesley, MA 02481
> > (781)283-2981
> > --
> > Gromacs Users mailing list
> >
> > * Please search the archive at
> > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> > posting!
> >
> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >
> > * For (un)subscribe requests visit
> > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> > send a mail to gmx-users-request at gromacs.org.
> >
--
Mala L. Radhakrishnan
Whitehead Associate Professor of Critical Thought
Associate Professor of Chemistry
Wellesley College
106 Central Street
Wellesley, MA 02481
(781)283-2981