[gmx-users] configuration/installation of gromacs-4.6.1 on heterogeneous cluster
Mark Abraham
mark.j.abraham at gmail.com
Mon May 6 00:45:05 CEST 2013
On Sun, May 5, 2013 at 8:12 PM, Martin Siegert <siegert at sfu.ca> wrote:
> Hi,
>
> On Sat, May 04, 2013 at 09:50:37AM +0200, Mark Abraham wrote:
> > On Sat, May 4, 2013 at 1:41 AM, Martin Siegert <siegert at sfu.ca> wrote:
> >
> > > Hi,
> > >
> > > I am struggling with the configuration and compilation/installation
> > > of gromacs-4.6.1. Our cluster has 2 different processors: the older
> > > generation supports sse4.1, the newer sse4.2. Configuration and
> > > compilation must be done on the headnode of the cluster, which
> > > supports sse4.2. I am using the following command to configure
> > > gromacs-4.6.1:
> > >
> > > CFLAGS='-fpic -O3 -axSSE4.2,SSE4.1 -xSSSE3 -ip -opt-prefetch' \
> > > CXXFLAGS='-fpic -O3 -axSSE4.2,SSE4.1 -xSSSE3 -ip -opt-prefetch' \
> > > FFLAGS='-fpic -O3 -axSSE4.2,SSE4.1 -xSSSE3 -ip -opt-prefetch' \
> > >
> >
> > If the above is successful at generating instructions for SSE>4.1...
>
> -xSSSE3 ensures that the resulting code will run on processors that
> support ssse3 or later and is basically equivalent to -mssse3.
>
> -axSSE4.2,SSE4.1 tells the compiler to generate multiple code paths
> for sse4.1 and sse4.2 while still supporting the default architecture
> ssse3 (i.e., when run on a processor that supports sse4.2, the
> code will use the special instructions for that architecture).
>
OK - but still not something from which GROMACS can derive any significant
advantage, even if it works reliably :-)

> We compile all our codes with -axSSE4.2,SSE4.1 -xSSSE3 and they run on
> ssse3 and better.
>
> > > CC=mpicc \
> > > CXX=mpicxx \
> > > FC=mpif90 \
> > > LDFLAGS="-lfftw3f -lgoto2 -Wl,-rpath,/usr/local/gromacs-4.6.1/lib" \
> > > cmake -DCMAKE_VERBOSE_MAKEFILE=ON -DGMX_MPI=ON \
> > > -DGMX_CPU_ACCELERATION=SSE4.1 -DGMX_OPENMP=OFF -DGMX_GPU=OFF \
> > > -DCMAKE_INSTALL_PREFIX=/usr/local/gromacs-4.6.1 \
> > > -DCMAKE_SKIP_RPATH=YES \
> > > ../gromacs-4.6.1
> > >
> > > However, after compilation/installation mdrun_mpi fails on nodes that
> > > only support sse4.1 with "Illegal instruction".
> > >
> >
> > ... then this is not a surprise. Your compiler has been allowed to generate
> > SSE 4.2 instructions, I suspect.
>
> Not really ... -xssse3 specifies the default architecture that the code
> supports.
>
> > > The CMakeCache.txt file contains the line:
> > >
> > > BUILD_CPU_FEATURES:INTERNAL=aes apic clfsh cmov cx8 cx16 htt lahf_lm mmx
> > > msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdtscp sse2 sse3
> > > sse4.1 sse4.2 ssse3
> > >
> > > Since this line contains "sse4.2" it appears that the flag
> > > -DGMX_CPU_ACCELERATION=SSE4.1
> > > is ignored.
> > >
> >
> > That list is what features are available on the CPU (mostly for helping
> > detect what acceleration to use, and to solve problems). The target
> > acceleration scheme (i.e. code with heavy use of compiler intrinsics) is
> > SSE4.1, which is what GROMACS will compile for if you leave it alone :-)
> > SSE4.2 is roughly useless for mdrun.
> >
> > > What is the correct way of specifying the cpu architecture within the
> > > cmake build system?
>
To clarify: setting GMX_CPU_ACCELERATION=SSE4.1 is all that you require.
IIRC the build-cpu detection still runs in this case, but the results have
no real bearing on the outcome. You can see the value of that variable in
the CMakeCache.txt. The -msse4.1 compiler flag is used if and only if that
is the value of that variable.
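
As a rough illustration of that check, the sketch below pulls the recorded
value out of the cache file in the build directory. The function name
show_accel is made up for this example, and the pattern assumes the usual
NAME:TYPE=VALUE layout of CMakeCache.txt entries.

```shell
# Print the value of GMX_CPU_ACCELERATION recorded in a CMake cache file.
# Usage: show_accel [path/to/CMakeCache.txt]   (defaults to ./CMakeCache.txt)
# show_accel is a hypothetical helper, not something shipped with GROMACS.
show_accel() {
    cache="${1:-CMakeCache.txt}"
    # Cache entries look like NAME:TYPE=VALUE; print only the VALUE part.
    sed -n 's/^GMX_CPU_ACCELERATION:[A-Z]*=//p' "$cache"
}
```

Run from the build directory, it should print SSE4.1 if the -D option on the
cmake command line took effect.
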
> > > (I never had problems with this with the pre 4.6
> > > versions).
> > >
> >
> > Back then, GROMACS was nearly insensitive to compilation settings, because
> > of the use of assembly kernels. Now GROMACS is sensitive to compiler
> > version (in that compilation of SIMD intrinsics needs to work well, and
> > OpenMP needs to be supported, etc.) but one still doesn't generally want to
> > mess with the compiler flags. We have some internal disagreement about
> > whether we should be permitting/encouraging/facilitating setting/checking
> > compiler flags. So far nobody has demonstrated a use case that suggests we
> > need to support more than "shut up and let GROMACS do its thing!"
>
> Out of curiosity I tried this, i.e., I did not set CFLAGS, CXXFLAGS and FFLAGS
> and used the GROMACS-supplied values, which turn out to be
>
> -O3 -ip -fPIC -msse4.1
>
> The result is absolutely the same: on the nodes that only support sse4.1 the
> code fails with "Illegal instruction".
>
That is extremely surprising. As above, the appearance of -msse4.1
indicates GROMACS is doing what is expected for your case.

> Any suggestions what else I should try?
>
Compiling on the sse4.1 node would be an interesting experiment, but you
seem to suggest this is impossible. Otherwise, this could be a compiler
bug. GROMACS has a long history of pushing compilers too far ;-) For
example, we know Intel 11.1 has some bug with SSE4.1 code generation
(http://redmine.gromacs.org/issues/1126), but whatever the cause is here,
it is a different issue.
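
One way to narrow this down, sketched here under stated assumptions rather
than as an established GROMACS procedure, is to disassemble the installed
binary and count instructions that an SSE4.1-only Intel CPU cannot execute.
The helper name count_sse42 is hypothetical. The mnemonic list covers the
SSE4.2 additions plus popcnt, which has its own CPUID flag and shows up in
the build node's feature list. The binary path in the comment is an
assumption based on the install prefix quoted earlier.

```shell
# Read a disassembly on stdin and print how many lines contain an SSE4.2
# mnemonic or popcnt. count_sse42 is a hypothetical helper name.
count_sse42() {
    # -w matches whole words only, so hex bytes and VEX forms such as
    # vpcmpgtq are not counted; optional suffixes cover AT&T size variants.
    grep -c -w -E 'pcmpestri|pcmpestrm|pcmpistri|pcmpistrm|pcmpgtq|crc32[bwlq]?|popcnt[wlq]?'
}

# Example (assumes binutils objdump and that the binary was installed as
# bin/mdrun_mpi under the prefix used above):
#   objdump -d /usr/local/gromacs-4.6.1/bin/mdrun_mpi | count_sse42
```

A non-zero count does not by itself prove those instructions are reached at
run time, since a compiler that emits multiple code paths may guard them
behind a CPU check. A zero count would suggest looking elsewhere, for
example at the libraries the binary links against.
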
What is the full compiler version?

> I am mostly concerned about the following setting in CMakeCache.txt:
>
> //Build CPU brand
> BUILD_CPU_BRAND:INTERNAL=Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
> //Build CPU family
> BUILD_CPU_FAMILY:INTERNAL=6
> //Build CPU features
> BUILD_CPU_FEATURES:INTERNAL=aes apic clfsh cmov cx8 cx16 htt lahf_lm mmx
> msr nonstop_tsc pcid pclmuldq pdcm pdpe1gb popcnt pse rdtscp sse2 sse3
> sse4.1 sse4.2 ssse3
> //Build CPU model
> BUILD_CPU_MODEL:INTERNAL=44
> //Build CPU stepping
> BUILD_CPU_STEPPING:INTERNAL=2
>
> As the names of the variables say, these are for the BUILD cpu, not
> necessarily for the cpu the code needs to run on. Thus, if any of these
> are used to generate processor specific code, then this would explain the
> illegal instruction.
>
Indeed, but the routines that led to the detection whose results are
reported above are used only when GMX_CPU_ACCELERATION is set to (the
default of) "None". That behaviour is overridden by your setting of
GMX_CPU_ACCELERATION. (And even if it were not, on systems capable of only
SSE4.2 or SSSE3, GROMACS would use only SSE4.1, because that's all that we
think is useful.)

> Consequently I would need to change these settings, but to what?
>
These are the results of the detection on the build CPU. They are currently
used only to detect whether rdtscp is available. Our detection machinery is
more elaborate, but we use only a tiny fraction of it.
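
To compare that build-node list with what a compute node actually offers, a
small sketch along the following lines could be run on each node type. It
assumes Linux, where /proc/cpuinfo spells the flags sse4_1 and sse4_2
(rather than sse4.1 and sse4.2 as in CMakeCache.txt) and lists SSE3 as pni.
The function name simd_flags is made up for this example.

```shell
# Print the SSE-family flags from a cpuinfo-style file, one per line.
# Usage: simd_flags [file]   (defaults to /proc/cpuinfo on Linux)
# simd_flags is a hypothetical helper name.
simd_flags() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # Take the first "flags" line, split it into one token per line,
    # and keep only tokens that begin with sse or ssse.
    grep -m1 '^flags' "$cpuinfo" | tr ' \t' '\n\n' \
        | grep -E '^s?sse[0-9a-z_]*$' | LC_ALL=C sort -u
}
```

On the older nodes described in this thread one would expect sse4_1 in the
output but not sse4_2.
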
Mark
> Cheers,
> Martin