[gmx-developers] Math function precision
erik.lindahl at gmail.com
Thu Sep 15 21:43:49 CEST 2016
On 15 September 2016 at 19:12:50, Schulz, Roland (roland.schulz at intel.com)
What precision do we expect from math functions in GROMACS? We use by
default relatively aggressive math options for many compilers but we
haven't documented what precision we demand from the compiler. This makes
it difficult to decide for new compilers / compiler versions what the
correct set of flags is.
How many ulps max relative error do we accept for standard math functions?
Should we use the same precision for non-simd math function as for simd
functions (GMX_SIMD_ACCURACY_BITS_SINGLE)? Our default (22) corresponds to
accepting 1ulp relative error, correct? Currently our default flags for ICC
don't specify accuracy and thus we use the default accuracy for O3 which is
4ulps. Also AFAIK gcc -ffast-math allows 2ulps errors. Where this matters
is in FunctionTest.ErfInvDouble which fails with ICC17 and -no-prec-div
(corresponds to -ffast-math) because the result is only correct to 6ulp but
the test requires 4ulp which I believe one cannot guarantee if the
intermediate results are only correct to 2ulp (because it includes a
difference of intermediate results).
To tell the truth I haven’t done any super-deep analysis. In my early
trials (in particular for the SIMD layer) I optimized entirely for
performance and accepted larger errors, but when I tested I realized we
could get to full performance with very little extra cost, and that has the
huge advantage we don’t need to worry about accuracy when we use our own
functions in new places (which in turn means we can use them almost
My reason for settling on 4ULPs is that the SIMD math functions I tested
(at least with gcc) typically achieved 1-2 ULP accuracy, and then I doubled
this to have some margin.
Personally I would be hesitant to increase this without a detailed analysis
of what goes wrong, and why it does not happen with other compilers.
If we accept 6ULP, why not 8? If we accept 8, why not 10 for another
How accurate results do we accept for non-standard input values such as
extremes, nans, infinites, denormals. For GCC we use -ffast-math which
AFAIU means it won't produce correct results for nans and infinites. I'm
not sure about extremes and denormals.
I don’t expect us to every need to handle NaN or Inf as input values.
Denormals can be clipped to zero.
An example where this matters: When compiling with ICC and allowing that
all 4 non-standard values don't have to produce correct results (using
-fast or -fp-model fast=2 or -fimf-domain-exclusion=common) than the
complex/nbnxn-ljpme-LB test fails because cr2 (nbnxn_kernel_ref_inner.h
line 241) gets very large and expf(cr2) should produce zero but produces
NaN. Our SIMD exp also doesn't support very large values but in our SIMD
kernel we mask out particles beyond the cut-off so that this cannot get
Spontaneously it sounds very dangerous to have the compiler assume that
every single argument for every single invocation of the exponential
function in a 3-million-line program has been checked so the arguments
never fall in the extreme range. Basically: There’s no way we’ve checked
that for old code, but new code might be better.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the gromacs.org_gmx-developers