[gmx-users] AVX2 SIMD intrinsics speed boost

Bin Liu fdusuperstring at gmail.com
Thu Jul 11 18:01:18 CEST 2013

Hi all,

If my understanding is correct, the GROMACS parallelization and acceleration
page indicates that AVX2 SIMD intrinsics can offer a speed boost on a Haswell
CPU. I was wondering how much performance gain we can expect from them. In
other words, what is the approximate speedup if we run a simulation
with AVX2 SIMD intrinsics on a Haswell CPU (say, i7 4770K) compared with an Ivy
Bridge CPU at the same clock (say, i7 3770K) using the current AVX SIMD
intrinsics? And is there a timeline for the release of the AVX2 SIMD intrinsics?

This information is crucial if we want to assemble a machine with balanced
CPU and GPU performance. My current machine has an i7 3770K (3.5GHz, stock
frequency) and a GeForce 650 Ti (768 CUDA cores, 1032MHz). When I ran
simulations with rcoulomb=1.0 and rvdw=1.0, I got this at the end of the
log file:

*Force evaluation time GPU/CPU: 1.762 ms/1.150 ms = 1.531*
It seems I need a GPU with about 50% more CUDA cores. In the best case, if
AVX2 gives a 30% speed boost and I can successfully overclock the 4770K to
4.5GHz, I would need about 1965 CUDA cores (130% * (4.5GHz/3.5GHz) * 1.531 *
768 cores) at the same frequency to get balanced CPU and GPU performance.
That points to a GeForce GTX 780 (2304 CUDA cores at 863MHz, equivalent to
about 1927 CUDA cores at 1032MHz), which comes close. Since GROMACS is
largely insensitive to memory clock and latency, I hope this naive
arithmetic gives a good estimate of which graphics card I should purchase.
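
The estimate above can be written out as a short script. This is only a sketch of the same naive arithmetic: it assumes GPU throughput scales linearly with CUDA cores times core clock, treats the 30% AVX2 gain and the 4.5GHz overclock as hypothetical inputs, and the function names are my own, not anything from GROMACS.

```python
# Back-of-the-envelope GPU sizing from the figures quoted in this message.
# Assumption: GPU throughput scales linearly with (CUDA cores x core clock).


def required_cores(current_cores, gpu_cpu_ratio, simd_speedup, old_ghz, new_ghz):
    """CUDA cores, at the current GPU clock, needed to keep pace with a faster CPU."""
    # Hypothetical CPU speedup: SIMD gain multiplied by the clock increase.
    cpu_speedup = simd_speedup * (new_ghz / old_ghz)
    # The GPU is already gpu_cpu_ratio times slower than the CPU per step.
    return current_cores * gpu_cpu_ratio * cpu_speedup


def equivalent_cores(cores, clock_mhz, reference_mhz):
    """Rescale a core count to a reference clock, assuming linear scaling."""
    return cores * clock_mhz / reference_mhz


needed = required_cores(768, 1.531, 1.30, 3.5, 4.5)
gtx780 = equivalent_cores(2304, 863, 1032)

print(f"cores needed at 1032 MHz:       {needed:.0f}")
print(f"GTX 780 equivalent at 1032 MHz: {gtx780:.0f}")
```

Evaluated exactly, the product comes to roughly 1965 cores, and the GTX 780 rescales to roughly 1927 cores at 1032MHz, so under these assumptions the card lands within a few percent of the target.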



