[gmx-users] Cross compiling GROMACS 4.6.3 for native Xeon Phi, thread-mpi problem

Szilárd Páll szilard.pall at cbr.su.se
Mon Sep 16 22:53:54 CEST 2013


On Mon, Sep 16, 2013 at 7:04 PM, PaulC <paul.caheny at uk.fujitsu.com> wrote:
> Hi,
>
>
> I'm attempting to build GROMACS 4.6.3 to run entirely within a single Xeon
> Phi (i.e. native) with either/both Intel MPI/OpenMP for parallelisation
> within the single Xeon Phi.
>
> I followed these instructions from Intel for cross compiling for Xeon Phi
> with cmake:
>
> http://software.intel.com/en-us/articles/cross-compilation-for-intel-xeon-phi-coprocessor-with-cmake
>
> which includes setting:
>
> export CC=icc
> export CXX=icpc
> export FC=ifort
> export CFLAGS="-mmic"
> export CXXFLAGS=$CFLAGS
> export FFLAGS=$CFLAGS
> export MPI_C=mpiicc
> export MPI_CXX=mpiicpc
>
> I then run cmake with:
>
> cmake .. -DREGRESSIONTEST_DOWNLOAD=ON -DGMX_MPI=ON -DGMX_THREAD_MPI=OFF
> -DGMX_FFT_LIBRARY=mkl -DGMX_CPU_ACCELERATION=None
> -DCMAKE_INSTALL_PREFIX=~/gromacs
>
>
>
> Note -DGMX_THREAD_MPI=OFF. That seems to work fine (see attached
> cmake_output.txt), particularly, it finds the MIC Intel MPI:
>
> -- Found MPI_C:
> /opt/intel/impi/4.1.1.036/mic/lib/libmpigf.so;/opt/intel/impi/4.1.1.036/mic/lib/libmpi.so;/opt/intel/impi/4.1.1.036/mic/lib/libmpigi.a;/usr/lib64/libdl.so;/usr/lib64/librt.so;/usr/lib64/libpthread.so
> -- Checking for MPI_IN_PLACE
> -- Performing Test MPI_IN_PLACE_COMPILE_OK
> -- Performing Test MPI_IN_PLACE_COMPILE_OK - Success
> -- Checking for MPI_IN_PLACE - yes
>
>
> When I run make everything trundles along fine until:
>
> [ 20%] Building C object
> src/gmxlib/CMakeFiles/gmx.dir/thread_mpi/errhandler.c.o
> [ 20%] Building C object
> src/gmxlib/CMakeFiles/gmx.dir/thread_mpi/tmpi_malloc.c.o
> [ 22%] Building C object src/gmxlib/CMakeFiles/gmx.dir/thread_mpi/atomic.c.o
> [ 22%] Building C object
> src/gmxlib/CMakeFiles/gmx.dir/thread_mpi/pthreads.c.o
> /tmp/iccQqtl2Vas_.s: Assembler messages:
> /tmp/iccQqtl2Vas_.s:1773: Error: `sfence' is not supported on `k1om'
> make[2]: *** [src/gmxlib/CMakeFiles/gmx.dir/thread_mpi/pthreads.c.o] Error 1
> make[1]: *** [src/gmxlib/CMakeFiles/gmx.dir/all] Error 2
> make: *** [all] Error 2
>
>
> Why is it still building thread_mpi given the -DGMX_THREAD_MPI=OFF at the
> cmake invocation above?

Because these days thread-MPI not only provides a threading-based MPI
implementation for GROMACS, but also some functionality independent
from this very feature, namely efficient atomic operations and thread
affinity settings.

>
> Any suggestions how best to work around this?

[ FTFY: "Any suggestions how to *fix* this?" ]

What seems to be causing the trouble here is the atomics support.
While x86 normally supports the memory fence instruction used here
(sfence), Xeon Phi apparently does not: the assembler rejects it for
the k1om target. If you look at src/gmxlib/thread_mpi/pthreads.c:633
you'll see a tMPI_Atomic_memory_barrier() call which, for x86, is
defined in include/thread_mpi/atomic/gcc_x86.h:105 as
#define tMPI_Atomic_memory_barrier() __asm__ __volatile__("sfence;" : : : "memory")
along with some other atomic operations, for icc among other compilers.
What's strange is that the build system checks whether it can compile
a dummy C file with the atomics stuff included (see
cmake/ThreadMPI.cmake). At first sight it seems that this check should
already fail at cmake time and disable the atomics, but apparently it
does not.
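
One possible reason, offered only as a guess: the error in the build
log comes from the assembler, not from the C front end, so a
configure-time check can catch it only if the test file is compiled
all the way to an object file and actually expands the barrier macro.
A hypothetical probe of that kind is sketched here. It is not the
contents of cmake/TestAtomics.c, probe_barrier() and probe_atomics()
are made-up names, and it assumes a GCC-compatible compiler (icc
included) that accepts this inline-asm syntax.

```c
/* Hypothetical configure-time probe, NOT the real cmake/TestAtomics.c. */
#if defined(__x86_64__) || defined(__i386__)
/* Same instruction as the x86 barrier quoted above. The assumption is
   that a k1om build also takes this branch, so the assembler would
   reject the file and the check would fail as intended. */
#define probe_barrier() __asm__ __volatile__("sfence;" : : : "memory")
#else
/* Generic full barrier from the GCC builtins, for other targets. */
#define probe_barrier() __sync_synchronize()
#endif

/* The macro must be expanded in real code: an unused #define never
   reaches the assembler and so can never trigger the error. */
int probe_atomics(void)
{
    volatile int flag = 0;

    flag = 1;
    probe_barrier();
    return flag;
}
```

In an actual check this function would be called from main() and the
file would be built with the same -mmic flags as the rest of the code.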

You have two options:
- Fix the problem by adding an #elif
MACRO_TO_CHECK_FOR_MIC_COMPILATION branch and implement an atomic
barrier using the appropriate MIC ASM instruction.
- Fix the atomics check such that the lack of atomics support in
thread-MPI on MIC is correctly reflected (see cmake/ThreadMPI.cmake:45
which compiles cmake/TestAtomics.c). More concretely, the cmake test
should fail for a MIC build, which should result in atomics support
being disabled (and hopefully no compile-time error).
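
As a rough illustration of the first option, the sketch below guards
the barrier definition with a MIC-specific branch. It is not the
GROMACS source: example_memory_barrier(), example_publish() and
example_barrier_kind() are hypothetical names, __MIC__ is assumed to
be the macro icc predefines under -mmic, and replacing sfence with a
compiler-only barrier assumes that MIC does not need a hardware store
fence at this point, which would have to be verified.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for tMPI_Atomic_memory_barrier(). */
#if defined(__MIC__)
/* Assumption: __MIC__ is predefined by icc under -mmic. k1om rejects
   sfence, so only stop the compiler from reordering memory accesses. */
#define example_memory_barrier() __asm__ __volatile__("" : : : "memory")
#define EXAMPLE_BARRIER_KIND "compiler-only"
#elif defined(__x86_64__) || defined(__i386__)
/* Regular x86: same instruction as the definition quoted above. */
#define example_memory_barrier() __asm__ __volatile__("sfence;" : : : "memory")
#define EXAMPLE_BARRIER_KIND "sfence"
#else
/* Other targets: generic full barrier from the GCC builtins. */
#define example_memory_barrier() __sync_synchronize()
#define EXAMPLE_BARRIER_KIND "full"
#endif

/* Write a payload, then raise a ready flag, with the barrier between
   the two stores so the payload store is issued first. */
void example_publish(int *payload, volatile int *ready, int value)
{
    assert(payload != NULL && ready != NULL);
    *payload = value;
    example_memory_barrier();
    *ready = 1;
}

/* Report which branch was selected at compile time. */
const char *example_barrier_kind(void)
{
    return EXAMPLE_BARRIER_KIND;
}
```

Checking example_barrier_kind() in a MIC build is a quick way to
confirm that the new branch is the one actually being compiled.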

I suspect that even the proper fix (first option) may be as simple as
a couple of lines' worth of changes. Regardless of which option you
pick, I would really appreciate it if you could upload your fix to
gerrit.gromacs.org. You could also open an issue on redmine.gromacs.org
if you want this problem to be trackable.

Cheers,
--
Szilárd

PS: I hope you know that we have neither SIMD intrinsics support nor
any reasonable accelerator-aware parallelization for MIC (yet), so
don't expect high performance.

>
> Thanks,
>
> Paul.
>
> cmake_output.txt
> <http://gromacs.5086.x6.nabble.com/file/n5011212/cmake_output.txt>
>
> --
> View this message in context: http://gromacs.5086.x6.nabble.com/Cross-compiling-GROMACS-4-6-3-for-native-Xeon-Phi-thread-mpi-problem-tp5011212.html
> Sent from the GROMACS Users Forum mailing list archive at Nabble.com.


