[gmx-developers] Math function precision

Thu Sep 15 19:12:37 CEST 2016

Hi,

What precision do we expect from math functions in GROMACS? By default we use relatively aggressive math options for many compilers, but we haven't documented what precision we demand from the compiler. This makes it difficult to decide what the correct set of flags is for new compilers or compiler versions.

How many ulps of maximum relative error do we accept for standard math functions? Should we use the same precision for non-SIMD math functions as for SIMD functions (GMX_SIMD_ACCURACY_BITS_SINGLE)? Our default (22) corresponds to accepting 1 ulp relative error, correct? Currently our default flags for ICC don't specify accuracy, so we get the default accuracy for O3, which is 4 ulp. Also, AFAIK gcc -ffast-math allows 2 ulp errors. This matters in FunctionTest.ErfInvDouble, which fails with ICC17 and -no-prec-div (corresponds to -ffast-math): the result is only correct to 6 ulp but the test requires 4 ulp, which I believe one cannot guarantee if the intermediate results are only correct to 2 ulp (because the computation includes a difference of intermediate results).
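
To make the "N ulp" figures above concrete, here is a minimal sketch of one common way to count how many representable single-precision values lie between a computed result and a reference. The names orderedBits and ulpDistance are hypothetical and are not taken from the GROMACS test framework; the sketch assumes IEEE 754 binary32 floats and finite inputs.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: map a float onto an integer scale on which adjacent
// representable values differ by exactly 1. Negative floats are mirrored
// below zero, so +0.0f and -0.0f both map to 0.
static std::int64_t orderedBits(float x)
{
    std::uint32_t u;
    std::memcpy(&u, &x, sizeof(u)); // well-defined way to read the bit pattern
    if (u & 0x80000000u)
    {
        return -static_cast<std::int64_t>(u & 0x7fffffffu);
    }
    return static_cast<std::int64_t>(u);
}

// Hypothetical helper: number of representable floats between a and b.
// Assumes neither input is NaN (the check itself may be optimized away
// under fast-math style flags).
std::int64_t ulpDistance(float a, float b)
{
    assert(!std::isnan(a) && !std::isnan(b));
    const std::int64_t d = orderedBits(a) - orderedBits(b);
    return d < 0 ? -d : d;
}
```

A test tolerance of 4 ulp would then read as ulpDistance(result, reference) <= 4. Note that this compares against a reference that has itself been rounded to float, so it is only an approximation of the error relative to the exact mathematical value.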

How accurate do results need to be for non-standard input values such as extremes, NaNs, infinities, and denormals? For GCC we use -ffast-math, which AFAIU means it won't produce correct results for NaNs and infinities. I'm not sure about extremes and denormals.

An example where this matters: when compiling with ICC and allowing all 4 kinds of non-standard values to produce incorrect results (using -fast, -fp-model fast=2, or -fimf-domain-exclusion=common), the complex/nbnxn-ljpme-LB test fails because cr2 (nbnxn_kernel_ref_inner.h line 241) gets very large and expf(cr2) should produce zero but produces NaN. Our SIMD exp also doesn't support very large values, but in our SIMD kernel we mask out particles beyond the cut-off so that the argument cannot get that large.

Do we expect serial math functions to produce correct values for extreme inputs, so that the ref_inner usage is OK as is and ICC shouldn't use domain-exclusion=extremes? Or should ref_inner not rely on expf being OK for very large values?
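
For the second option, a rough sketch of what "not relying on expf" could look like is given below. It is not the actual ref_inner code: the helper name expNegGuarded and the threshold are assumptions for illustration, and it assumes the quantity being exponentiated is the negative of a non-negative magnitude x, since that is the case in which the exact result goes to zero.

```cpp
#include <cmath>

// Hypothetical helper: returns exp(-x) for x >= 0 without ever handing the
// math library an argument outside a range where the result is representable.
float expNegGuarded(float x)
{
    // exp(-104) is below 2^-150, i.e. less than half of the smallest positive
    // single-precision denormal (2^-149), so the correctly rounded result is 0.
    // If denormal results are also excluded by the compiler flags, a smaller
    // threshold (around 87, where exp(-x) reaches FLT_MIN) would be needed.
    const float maxArg = 104.0f;

    // Note: a NaN input fails the comparison and therefore also yields 0.
    return (x < maxArg) ? std::exp(-x) : 0.0f;
}
```

This costs one compare per call in the reference kernel; masking out pairs beyond the cut-off, as the SIMD kernel does, would be the alternative that avoids evaluating the exponential at all for those pairs.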

Roland