[gmx-users] help with poor performance on gromacs on Cray linux

Mark Abraham mark.j.abraham at gmail.com
Fri Jul 4 10:29:12 CEST 2014


On Sat, Jun 28, 2014 at 11:44 PM, Tom <dnaafm at gmail.com> wrote:

> Dear Mark,
>
> Thanks a lot for your kind help!
> I noticed this at the head of the *log file. It says:
>
> Binary not matching hardware - you might be losing performance.
> Acceleration most likely to fit this hardware:    AVX_128_FMA
> Acceleration selected at GROMACS compile time:    SSE2
>
> I compiled the gmx *tpr on the local machine and sent it to the cluster.
>

That's about a factor of three you're losing. Compile for your target, like
the message says.
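
Reconfiguring on the Cray itself, with the compute nodes as the target,
should be enough. Something along these lines is a rough, untested sketch
for 4.6.x (the module swap, MPI/FFTW options and install prefix mirror what
you posted earlier; adjust for your site), explicitly requesting the
AVX_128_FMA acceleration the log suggests instead of letting cmake detect
it from the older login-node CPU:

  module swap PrgEnv-pgi PrgEnv-gnu
  # CC/CXX are the Cray compiler wrappers
  CC=cc CXX=CC cmake .. -DGMX_CPU_ACCELERATION=AVX_128_FMA \
      -DGMX_MPI=ON -DGMX_BUILD_OWN_FFTW=ON \
      -DCMAKE_INSTALL_PREFIX=$HOME/App/GROMACS
  make && make install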


> Please take a look at the following *log file:
>
> -------------------------------------------------------------------------------------
>
> Log file opened on Tue Jun 17 16:13:17 2014
> Host: nid02116  pid: 22133  nodeid: 0  nnodes:  80
> Gromacs version:    VERSION 4.6.5
> Precision:          single
> Memory model:       64 bit
> MPI library:        MPI
> OpenMP support:     enabled
> GPU support:        disabled
> invsqrt routine:    gmx_software_invsqrt(x)
> CPU acceleration:   SSE2
> FFT library:        fftw-3.3.2-sse2
> Large file support: enabled
> RDTSCP usage:       enabled
> Built on:           Tue Jun 17 13:20:08 EDT 2014
> Built by:           ****@***-ext5 [CMAKE]
> Build OS/arch:      Linux 2.6.32.59-0.7.2-default x86_64
> Build CPU vendor:   AuthenticAMD
> Build CPU brand:    AMD Opteron(tm) Processor 6140
> Build CPU family:   16   Model: 9   Stepping: 1
> Build CPU features: apic clfsh cmov cx8 cx16 htt lahf_lm misalignsse mmx
> msr nonstop_tsc pdpe1gb popcnt pse rdtscp sse2 sse3 sse4a
> C compiler:         /opt/cray/xt-asyncpe/5.26/bin/cc GNU
> /opt/cray/xt-asyncpe/5.26/bin/cc: INFO: Compiling with
> CRAYPE_COMPILE_TARGET=native.
>

As discussed, the Cray compilers do not do as good a job as the
Cray-provided gcc or icc compilers. Use those.

Mark

> C compiler flags:   -msse2    -Wextra -Wno-missing-field-initializers
> -Wno-sign-compare -Wall -Wno-unused -Wunused-value -Wno-unused-parameter
> -Wno-array-bounds -Wno-maybe-uninitialized -Wno-strict-overflow
> -fomit-frame-pointer -funroll-all-loops -fexcess-precision=fast  -O3
> -DNDEBUG
> ........
>
> Initializing Domain Decomposition on 80 nodes
> Dynamic load balancing: auto
> Will sort the charge groups at every domain (re)decomposition
>
> NOTE: Periodic molecules are present in this system. Because of this, the
> domain decomposition algorithm cannot easily determine the minimum cell
> size that it requires for treating bonded interactions. Instead, domain
> decomposition will assume that half the non-bonded cut-off will be a
> suitable lower bound.
>
> Minimum cell size due to bonded interactions: 0.600 nm
> Using 16 separate PME nodes, per user request
> Scaling the initial minimum size with 1/0.8 (option -dds) = 1.25
> Optimizing the DD grid for 64 cells with a minimum initial size of 0.750 nm
> The maximum allowed number of cells is: X 11 Y 11 Z 20
> Domain decomposition grid 4 x 4 x 4, separate PME nodes 16
> PME domain decomposition: 4 x 4 x 1
> Interleaving PP and PME nodes
> This is a particle-particle only node
>
> Domain decomposition nodeid 0, coordinates 0 0 0
>
> Using two step summing over 5 groups of on average 12.8 processes
>
> Using 80 MPI processes
>
> Detecting CPU-specific acceleration.
> Present hardware specification:
> Vendor: AuthenticAMD
> Brand:  AMD Opteron(TM) Processor 6274
> Family: 21  Model:  1  Stepping:  2
> Features: aes apic avx clfsh cmov cx8 cx16 fma4 htt lahf_lm misalignsse mmx
> msr nonstop_tsc pclmuldq pdpe1gb popcnt pse rdtscp sse2 sse3 sse4a sse4.1
> sse4.2 ssse3 xop
> Acceleration most likely to fit this hardware: AVX_128_FMA
> Acceleration selected at GROMACS compile time: SSE2
>
>
>
>
> Binary not matching hardware - you might be losing performance.
> Acceleration most likely to fit this hardware:    AVX_128_FMA
> Acceleration selected at GROMACS compile time:    SSE2
>
> Table routines are used for coulomb: FALSE
> Table routines are used for vdw:     FALSE
> Will do PME sum in reciprocal space.
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
> U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee and L. G.
> Pedersen
> A smooth particle mesh Ewald method
> J. Chem. Phys. 103 (1995) pp. 8577-8592
> -------- -------- --- Thank You --- -------- --------
>
> Will do ordinary reciprocal space Ewald sum.
> Using a Gaussian width (1/beta) of 0.384195 nm for Ewald
> Cut-off's:   NS: 1.2   Coulomb: 1.2   LJ: 1.2
> Long Range LJ corr.: <C6> 6.2437e-04
> System total charge: 0.000
> Generated table with 1100 data points for Ewald.
> Tabscale = 500 points/nm
> Generated table with 1100 data points for LJ6.
> Tabscale = 500 points/nm
> Generated table with 1100 data points for LJ12.
> Tabscale = 500 points/nm
> Generated table with 1100 data points for 1-4 COUL.
> Tabscale = 500 points/nm
> Generated table with 1100 data points for 1-4 LJ6.
> Tabscale = 500 points/nm
> Generated table with 1100 data points for 1-4 LJ12.
> Tabscale = 500 points/nm
> Potential shift: LJ r^-12: 0.000 r^-6 0.000, Ewald 0.000e+00
> Initialized non-bonded Ewald correction tables, spacing: 7.23e-04 size:
> 3046
>
>
> Non-default thread affinity set probably by the OpenMP library,
> disabling internal thread affinity
>
> ++++ PLEASE READ AND CITE THE FOLLOWING REFERENCE ++++
>
> ----------------------
>
> Best regards,
>
> Thom
>
>
>
>
>
> Message: 1
> > Date: Fri, 27 Jun 2014 23:49:53 +0200
> > From: Mark Abraham <mark.j.abraham at gmail.com>
> > To: Discussion list for GROMACS users <gmx-users at gromacs.org>
> > Subject: Re: [gmx-users] help with poor performance on gromacs on Cray
> >         linux
> > Message-ID:
> >         <CAMNuMAQG=n7FNSxKAg9hXJGXQQ+MZX5+L+F73r=UOA_1i4S8=
> > Q at mail.gmail.com>
> > Content-Type: text/plain; charset=UTF-8
> >
> > That thread referred to the Cray compilers (these machines ship several),
> > but whether that is relevant we don't know. Showing the top and bottom
> > .log file chunks is absolutely critical if you want performance feedback.
> >
> > Mark
> > On Jun 26, 2014 10:55 PM, "Tom" <dnaafm at gmail.com> wrote:
> >
> > > Justin,
> > >
> > > I compared the performance (the time spent by mdrun) using the md.log
> > > files for the same simulation run on Cray Linux and on another Linux
> > > system.
> > >
> > > I agree that different hardware can give different performance.
> > > But these tests were run on supercomputer clusters with very good
> > > performance reputations. The one on Cray is very slow.
> > >
> > > This is my first time running gmx on Cray Linux, and I suspect
> > > something is wrong with my installation.
> > >
> > > From the previous discussion, gmx seems to have a performance problem
> > > on Cray Linux:
> > >
> > >
> > > https://mailman-1.sys.kth.se/pipermail/gromacs.org_gmx-users/2013-May/081473.html
> > >
> > > I am also wondering if the newest version solved this issue.
> > >
> > > Thanks!
> > >
> > > Thom
> > >
> > >
> > > >
> > > > Message: 2
> > > > Date: Mon, 23 Jun 2014 08:17:55 -0400
> > > > From: Justin Lemkul <jalemkul at vt.edu>
> > > > To: gmx-users at gromacs.org
> > > > Subject: Re: [gmx-users] help with poor performance on gromacs on Cray
> > > >         linux
> > > > Message-ID: <53A81AF3.7090401 at vt.edu>
> > > > Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> > > >
> > > >
> > > >
> > > > On 6/23/14, 1:12 AM, Tom wrote:
> > > > > Dear Gromacs Developers and Experts:
> > > > >
> > > > > I noticed that the performance of gromacs on Cray Linux clusters is
> > > > > only 36.7% of normal.
> > > > >
> > > >
> > > > Normal what?  Another run on the same system?  You can't directly
> > > > compare different clusters with different hardware.
> > > >
> > > > >
> > > > > The following is the detail about the installation
> > > > > --------------------------
> > > > > CC=gcc FC=ifort F77=ifort CXX=icpc
> > > > > CMAKE_PREFIX_PATH=/opt/cray/modulefiles/cray-mpich/6.3.0
> > > > > cmake .. -DGMX_BUILD_OWN_FFTW=ON -DGMX_GPU=OFF -DGMX_MPI=ON
> > > > > -DBUILD_SHARED_LIBS=off -DCMAKE_SKIP_RPATH=ON
> > > > > -DCMAKE_INSTALL_PREFIX=~/App/GROMACS
> > > > > make F77=gfortran
> > > > > make install
> > > > > ----------------------
> > > > >
> > > > > This is the bash_profile:
> > > > > ---------------------
> > > > > module swap PrgEnv-pgi PrgEnv-gnu
> > > > > module load cmake
> > > > > export PATH=/home/test/App/GROMACS/bin:$PATH
> > > > > -------------------------
> > > > >
> > > > > Is there any suggestion for improving the efficiency of my
> > > > > installation?
> > > > >
> > > >
> > > > More important is the output of the .log file from the simulation.
> > > > It will tell you where mdrun spent all its time.
> > > >
> > > > -Justin
> > > >
> > > > --
> > > > ==================================================
> > > >
> > > > Justin A. Lemkul, Ph.D.
> > > > Ruth L. Kirschstein NRSA Postdoctoral Fellow
> > > >
> > > > Department of Pharmaceutical Sciences
> > > > School of Pharmacy
> > > > Health Sciences Facility II, Room 601
> > > > University of Maryland, Baltimore
> > > > 20 Penn St.
> > > > Baltimore, MD 21201
> > > >
> > > > jalemkul at outerbanks.umaryland.edu | (410) 706-7441
> > > > http://mackerell.umaryland.edu/~jalemkul
> > > >
> > > > ==================================================
> > > >
> > > >
> > > >
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> send a mail to gmx-users-request at gromacs.org.
>

