[gmx-users] Intel vs gcc compilers

Mark Abraham mark.j.abraham at gmail.com
Tue Jun 25 13:53:13 CEST 2013


On Tue, Jun 25, 2013 at 12:11 PM, Djurre de Jong-Bruinink
<djurredejong at yahoo.com> wrote:
> Dear Gromacs developers/users,
>
> After suggestions on this mailing list to use Intel rather than gcc compilers, we recently obtained the newest Intel compilers (2013.4.183). I was rather disappointed to find no speed-up at all when comparing gcc-compiled and Intel-compiled builds of GROMACS 4.6.2. Is this expected for our hardware, would the speed-up only become apparent when using specific features (OpenMP, group scheme), or am I doing something wrong in the compilation?
>
> An excerpt from both log files is below; the complete log files can be found here:
>
> http://md.chem.rug.nl/~djurre/logs/N6intel.log
> http://md.chem.rug.nl/~djurre/logs/N6gcc.log

You're running one real-MPI process per core, and you have six cores per
processor. The recommended procedure is to map cores to OpenMP
threads, and choose the number of MPI processes per processor (and
thus the number of OpenMP threads per MPI process) to maximize
performance. See
http://www.gromacs.org/Documentation/Acceleration_and_parallelization#Multi-level_parallelization.3a_MPI.2fthread-MPI_.2b_OpenMP
The correct balance will depend on the hardware, simulation, degree of
parallelism and compiler. The performance from default settings will
generally not be terrible, but getting the maximum performance will
certainly require some effort from the user (and in some cases you
won't see differences in compilers until you reach this regime).
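
As a rough illustration of the search space this implies, the sketch below
lists every way to split one six-core processor into MPI processes and
OpenMP threads so that all cores are used. The function name is
hypothetical and is not part of GROMACS; in 4.6 the thread count per
process is set with mdrun's -ntomp option, and each layout would still
have to be benchmarked on the actual simulation.

```python
def rank_thread_layouts(cores_per_processor):
    """Return all (mpi_ranks, omp_threads) pairs with ranks * threads == cores.

    Hypothetical helper for illustration only; it enumerates candidates
    and says nothing about which one performs best.
    """
    return [
        (ranks, cores_per_processor // ranks)
        for ranks in range(1, cores_per_processor + 1)
        if cores_per_processor % ranks == 0
    ]


# Six cores per processor, as on the Opteron 2435 in the logs.
for ranks, threads in rank_thread_layouts(6):
    print(f"{ranks} MPI process(es) per processor x {threads} OpenMP thread(s) each")
```

The layout currently in use corresponds to the last candidate (six
processes with one thread each).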

Cheers,

Mark

> Thanks in advance,
> Djurre de Jong
>
>
> Log file opened on Tue Jun 25 11:33:25 2013
> Host: node041  pid: 13590  nodeid: 0  nnodes:  72
> Gromacs version:    VERSION 4.6.2
> Precision:          single
> Memory model:       64 bit
> MPI library:        MPI
> OpenMP support:     enabled
> GPU support:        disabled
> invsqrt routine:    gmx_software_invsqrt(x)
> CPU acceleration:   SSE2
> FFT library:        fftw-3.3.2-sse2
> Large file support: enabled
> RDTSCP usage:       enabled
> Built on:           Mon Jun 24 20:54:51 CEST 2013
> Built by:           p238199 at login01 [CMAKE]
> Build OS/arch:      Linux 2.6.18-308.11.1.el5 x86_64
> Build CPU vendor:   AuthenticAMD
> Build CPU brand:    Six-Core AMD Opteron(tm) Processor 2435
> Build CPU family:   16   Model: 8   Stepping: 0
> Build CPU features: apic clfsh cmov cx8 cx16 htt lahf_lm misalignsse mmx msr nonstop_tsc pdpe1gb popcnt pse rdtscp sse2 sse3 sse4a
> C compiler:         /cm/shared/apps/openmpi/intel/64/1.4.5/bin/mpicc Intel icc (ICC) 13.1.2 20130514
> C compiler flags:   -msse2    -std=gnu99 -Wall   -ip -funroll-all-loops  -O3 -DNDEBUG
>
>
>                Core t (s)   Wall t (s)        (%)
>        Time:    10651.270      222.008     4797.7
>                  (ns/day)    (hour/ns)
> Performance:       77.836        0.308
>
>
> ######################################################################################
>
> Host: node041  pid: 13728  nodeid: 0  nnodes:  72
> Gromacs version:    VERSION 4.6.2
> Precision:          single
> Memory model:       64 bit
> MPI library:        MPI
> OpenMP support:     enabled
> GPU support:        disabled
> invsqrt routine:    gmx_software_invsqrt(x)
> CPU acceleration:   SSE2
> FFT library:        fftw-3.3.2-sse2
> Large file support: enabled
> RDTSCP usage:       enabled
> Built on:           Tue Jun 25 10:25:13 CEST 2013
> Built by:           p238199 at login01 [CMAKE]
> Build OS/arch:      Linux 2.6.18-308.11.1.el5 x86_64
> Build CPU vendor:   AuthenticAMD
> Build CPU brand:    Six-Core AMD Opteron(tm) Processor 2435
> Build CPU family:   16   Model: 8   Stepping: 0
> Build CPU features: apic clfsh cmov cx8 cx16 htt lahf_lm misalignsse mmx msr nonstop_tsc pdpe1gb popcnt pse rdtscp sse2 sse3 sse4a
> C compiler:         /cm/shared/apps/openmpi/gcc/64/1.4.5/bin/mpicc GNU gcc (GCC) 4.7.2
> C compiler flags:   -msse2    -Wextra -Wno-missing-field-initializers -Wno-sign-compare -Wall -Wno-unused -Wunused-value   -fomit-frame-pointer -funroll-all-loops -fexcess-precision=fast  -O3 -DNDEBUG
>
>
>                Core t (s)   Wall t (s)        (%)
>        Time:    10516.210      219.223     4797.0
>                  (ns/day)    (hour/ns)
> Performance:       78.825        0.304


