[gmx-users] Gromacs using MKL with Intel 11.1 compilers

Mark Abraham Mark.Abraham at anu.edu.au
Fri Sep 18 00:35:42 CEST 2009


Steve Cousins wrote:
> 
> I just thought I'd mention some troubles and fixes I had with trying to 
> get Gromacs 4.0.5 to configure with --with-fft=mkl on an SGI Altix 3700 
> BX2 system.  I did:
> 
> export CC=icc
> export F77=ifort
> export CFLAGS="-O3 -ip -ftz"
> export FFLAGS="-O3 -ip -ftz"
> export LDFLAGS="-O3 -ip -ftz -L/opt/intel/Compiler/11.1/046/mkl/lib/64 -L/usr/lib"
> export CPPFLAGS="-I/opt/intel/Compiler/11.1/046/mkl/include/fftw -I/usr/include"
> 
> and then:
> 
> ./configure --prefix=/usr/local/gromacs-4-mkl-noopts --without-x --enable-fortran --with-fft=mkl
> 
> However, this gave messages saying that it couldn't find the mkl 
> libraries. This is because MKL doesn't include a library called 
> libmkl.so anymore. To get this to work I had to edit the configure 
> script and change:
> 
>     -lmkl
> 
> to
> 
>     -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lguide -lpthread

... or put

    LIBS="-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lguide -lpthread"

on the configure command line.
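
As a rough sketch of that alternative, assuming the same prefix and
configure options as in the quoted message, and assuming the script is
run from the top of the GROMACS source tree (it only prints a message
anywhere else):

```shell
#!/bin/sh
# MKL link line from the message above; adjust to match your MKL install.
MKL_LIBS="-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lguide -lpthread"

if [ -x ./configure ]; then
    # LIBS is passed as a VAR=value argument, so configure itself stays unedited.
    ./configure --prefix=/usr/local/gromacs-4-mkl-noopts --without-x \
        --enable-fortran --with-fft=mkl LIBS="$MKL_LIBS"
else
    echo "no ./configure in this directory; would pass LIBS=\"$MKL_LIBS\""
fi
```

The CC, F77, CFLAGS, LDFLAGS and CPPFLAGS exports from the original
message would still need to be in the environment first.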

> In the end, I found that using MKL is about 20% slower than using fftw3 
> when running:
> 
>     time gmxtest.pl all
> 
> This may or may not be useful as they are just serial results with the 
> small test programs:
> 
> With fftw3:
> 
>     real    1m17.889s
>     user    1m4.660s
>     sys     0m8.672s
> 
> With MKL:
> 
>     real    1m34.731s
>     user    3m33.024s
>     sys     0m15.280s
> 
> Maybe the overhead of starting up the threads for such small jobs is 
> what is causing the slow-down in MKL.

Plausible, but even if FFT threads were not available, only a fraction 
of these tests use FFT, and some of those are probably I/O dominated. 
So the relative speed is still unknown. These tests are designed to 
check correct implementation and compilation, not to serve as 
benchmarks or real-world examples. A benchmark set was published years 
ago for the previous major GROMACS release, and it will probably still 
serve your purpose: http://oldwww.gromacs.org/content/view/24/37/. 
Lys/PME is the relevant test. Bear in mind that grompp no longer accepts 
the -np, -sort or -shuffle arguments, so delete them wherever the 
benchmark scripts use them. Your speed should probably beat any of 
those reported there.
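
If the old benchmark scripts contain many grompp calls, something along 
these lines could strip the obsolete options. This is only a sketch: the 
helper name and the example command line are made up for illustration, 
and it assumes -np is always followed by a number.

```shell
#!/bin/sh
# Hypothetical helper: drop the grompp options that GROMACS 4 no longer accepts.
# Reads command lines on stdin and writes the cleaned lines to stdout.
strip_old_grompp_flags() {
    sed -e 's/ -np  *[0-9][0-9]*//g' \
        -e 's/ -shuffle//g' \
        -e 's/ -sort//g'
}

# Illustrative input only; not taken from the actual benchmark files.
echo "grompp -np 4 -shuffle -sort -f grompp.mdp -c conf.gro -p topol.top" \
    | strip_old_grompp_flags
```

For the example line this prints 
"grompp -f grompp.mdp -c conf.gro -p topol.top". Check the result by eye 
before running it over real scripts.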

> Anybody have any real-world comparisons of using MKL vs. FFTW3?

No, but in the meantime build both, and if your users are actually using 
PME (the main algorithm that uses FFT), they can run a simple speed test 
once they have a full-sized system set up.
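
A minimal sketch of such a speed test follows. The FFTW install prefix 
and the topol.tpr run input are placeholders, not paths from this thread, 
and the script assumes mdrun writes a "Performance:" summary line to 
md.log. Each build runs in its own directory so the outputs do not 
overwrite each other.

```shell
#!/bin/sh
# Placeholders (assumptions): override via the environment as needed.
FFTW_PREFIX=${FFTW_PREFIX:-/usr/local/gromacs-4-fftw}
MKL_PREFIX=${MKL_PREFIX:-/usr/local/gromacs-4-mkl-noopts}
TPR=${TPR:-$PWD/topol.tpr}   # run input from grompp, given as an absolute path

bench() {
    label=$1
    prefix=$2
    if [ ! -x "$prefix/bin/mdrun" ] || [ ! -f "$TPR" ]; then
        echo "$label: skipped (mdrun or run input not found)"
        return 0
    fi
    mkdir -p "bench_$label"
    (
        cd "bench_$label" || exit 1
        # Same input for both builds; only the FFT library differs.
        "$prefix/bin/mdrun" -s "$TPR" >mdrun.out 2>&1
        printf '%s: ' "$label"
        grep 'Performance:' md.log || echo "no Performance line in md.log"
    )
}

bench fftw "$FFTW_PREFIX"
bench mkl "$MKL_PREFIX"
```

Repeating each run a few times on an otherwise idle machine gives a 
fairer comparison than a single run.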

Mark


