[gmx-users] Gromacs using MKL with Intel 11.1 compilers
Mark Abraham
Mark.Abraham at anu.edu.au
Fri Sep 18 00:35:42 CEST 2009
Steve Cousins wrote:
>
> I just thought I'd mention some troubles and fixes I had with trying to
> get Gromacs 4.0.5 to configure with --with-fft=mkl on a SGI Altix 3700
> BX2 system. I did:
>
> export CC=icc
> export F77=ifort
> export CFLAGS="-O3 -ip -ftz"
> export FFLAGS="-O3 -ip -ftz"
> export LDFLAGS="-O3 -ip -ftz -L/opt/intel/Compiler/11.1/046/mkl/lib/64
> -L/usr/lib"
> export CPPFLAGS="-I/opt/intel/Compiler/11.1/046/mkl/include/fftw
> -I/usr/include"
>
> and then:
>
> ./configure --prefix=/usr/local/gromacs-4-mkl-noopts --without-x
> --enable-fortran --with-fft=mkl
>
> However, this gave messages saying that it couldn't find the mkl
> libraries. This is because MKL doesn't include a library called
> libmkl.so anymore. To get this to work I had to edit the configure
> script and change:
>
> -lmkl
>
> to
>
> -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lguide -lpthread
... or put LIBS="-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lguide
-lpthread" on the configure command line.
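A minimal sketch of that second approach, assuming the same options Steve used above and that it is run from the top of the GROMACS 4.0.5 source tree. The variable name MKL_LIBS is only illustrative, and the library list is copied from this thread rather than checked against other MKL versions:

```shell
# MKL 11.1 ships no monolithic libmkl.so, so name the layered libraries
# explicitly (list taken from this thread; other MKL versions may differ).
MKL_LIBS="-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lguide -lpthread"

if [ -x ./configure ]; then
    # Passing LIBS=... as an argument avoids editing the configure script.
    ./configure --prefix=/usr/local/gromacs-4-mkl-noopts --without-x \
        --enable-fortran --with-fft=mkl LIBS="$MKL_LIBS"
else
    echo "no ./configure here; run this from the GROMACS source directory"
fi
```

The CC, CFLAGS, LDFLAGS and CPPFLAGS exports shown earlier would still need to be in the environment for the MKL headers and libraries to be found.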
> In the end, I found that using MKL is about 20% slower than using fftw3
> when running:
>
> time gmxtest.pl all
>
> This may or may not be useful as they are just serial results with the
> small test programs:
>
> With fftw3:
>
> real 1m17.889s
> user 1m4.660s
> sys 0m8.672s
>
> With MKL:
>
> real 1m34.731s
> user 3m33.024s
> sys 0m15.280s
>
> Maybe the overhead of starting up the threads for such small jobs is
> what is causing the slow-down in MKL.
Plausible, but even if FFT threads were not available, only a fraction
of these tests use FFT, and some of those are probably I/O-dominated,
so the relative speed is still unknown. These tests are designed to
check correct implementation and compilation, not to serve as
benchmarks or real-world examples. There was a benchmark set years ago
for the last major GROMACS release, which will probably still serve your
purpose: http://oldwww.gromacs.org/content/view/24/37/. Lys/PME is the
relevant test. Bear in mind that grompp no longer accepts the -np, -sort
or -shuffle arguments, so delete these wherever they appear. Your speed
should probably beat any of those reported there.
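If the old benchmark scripts call grompp with those arguments, one way to strip them is a small sed filter. This is only a sketch: the function name and the sample command line are made up for illustration, and the real benchmark files may be laid out differently.

```shell
# Remove the obsolete grompp options: "-np N", "-sort" and "-shuffle".
strip_old_grompp_args() {
    sed -E 's/[[:space:]]+-(np[[:space:]]+[0-9]+|sort|shuffle)//g'
}

# Hypothetical old-style command line, as it might appear in a script.
OLD_CMD="grompp -np 4 -shuffle -sort -f grompp.mdp -c conf.gro -p topol.top"
NEW_CMD=$(echo "$OLD_CMD" | strip_old_grompp_args)

echo "$NEW_CMD"
# prints: grompp -f grompp.mdp -c conf.gro -p topol.top
```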
> Anybody have any real-world comparisons of using MKL vs. FFTW3?
No, but in the meantime build both, and if your users are actually using
PME (the main algorithm that uses FFT), they can run a simple speed test
once they have a full-sized system set up.
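Such a test could look roughly like the following, assuming two install prefixes (the fftw3 path here is invented; the MKL one is the prefix from Steve's configure line) and a run input file called topol.tpr produced by grompp for a PME system:

```shell
# Hypothetical install prefixes for the two builds; adjust to your site.
BUILDS="/usr/local/gromacs-4-fftw3 /usr/local/gromacs-4-mkl-noopts"
TPR=topol.tpr   # assumed name of the grompp output for a PME system

for prefix in $BUILDS; do
    mdrun="$prefix/bin/mdrun"
    if [ ! -x "$mdrun" ] || [ ! -f "$TPR" ]; then
        echo "skipping $prefix (mdrun or $TPR not found)"
        continue
    fi
    echo "== $prefix =="
    # -deffnm gives each build its own set of output files.
    time "$mdrun" -s "$TPR" -deffnm "bench_$(basename "$prefix")"
done
```

Wall-clock time from a run of realistic length is the number to compare; very short runs will mostly measure start-up cost.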
Mark