[gmx-developers] Inaccuracies / bug with continuation in first time step

Kutzner, Carsten ckutzne at mpinat.mpg.de
Wed Apr 6 16:27:55 CEST 2022


Dear devs,

I am investigating an issue that surfaced during FMM development.
A while ago, after merging in GROMACS master, one of our unit tests, which
checks that FMM and PME results are the same, started to fail.

The failing test system is a 3651 atom AKE protein with virtual sites.

The failure is caused by slight (but well above tolerance) differences in the
FMM vs. PME electrostatic energy in the first time step when continuation is
true. With the .mdp parameter continuation = no, or with a rerun, the tests
pass. Also, for the time steps after the first one, the differences are
negligible.

Example (double precision, serial):
Step 0 with E_PME = -68646.6 and E_FMM = -68638.3 (difference 8.3121)  <--- !!!
Step 1 with E_PME = -68651.3 and E_FMM = -68651.2 (difference 0.048572)
Step 2 with E_PME = -68663.8 and E_FMM = -68663.7 (difference 0.0724816)

With GMX_DD_SINGLE_RANK=0 set, the test passes again:
Step 0 with E_PME = -68638.3 and E_FMM = -68638.3 (difference 0.0195857)
Step 1 with E_PME = -68651.3 and E_FMM = -68651.3 (difference 0.0175216)
Step 2 with E_PME = -68663.8 and E_FMM = -68663.7 (difference 0.0181637)
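
A minimal sketch of the per-step check on the differences reported above. The
tolerance value and the function name are hypothetical (the actual test
tolerance is not stated here); any tolerance between the largest passing
difference and 8.3121 flags the same step.

```python
# Reported |E_PME - E_FMM| per step, copied from the two outputs above.
diffs_default = [8.3121, 0.048572, 0.0724816]                # serial, default settings
diffs_dd_single_rank_0 = [0.0195857, 0.0175216, 0.0181637]   # GMX_DD_SINGLE_RANK=0

# Hypothetical tolerance; the real unit-test tolerance is an assumption here.
TOLERANCE = 1.0


def failing_steps(diffs, tol):
    """Return the indices of the steps whose difference exceeds tol."""
    return [step for step, d in enumerate(diffs) if abs(d) > tol]


print(failing_steps(diffs_default, TOLERANCE))           # only step 0 fails
print(failing_steps(diffs_dd_single_rank_0, TOLERANCE))  # no step fails
```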

When I write the positions x for the PME cases above to file at the start of
do_force() in step 0, then sort and compare them, there are no differences.
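
The sort-and-compare step can be sketched as follows. The dump format (one
"x y z" line per atom) and both function names are assumptions made for
illustration, not what was actually written from do_force(). Sorting both lists
first makes the comparison independent of the order in which atoms were written.

```python
def parse_positions(text):
    """Parse lines of 'x y z' into a list of (x, y, z) tuples."""
    return [tuple(float(v) for v in line.split())
            for line in text.splitlines() if line.strip()]


def max_sorted_difference(pos_a, pos_b):
    """Sort both position lists, then return the largest coordinate difference."""
    if len(pos_a) != len(pos_b):
        raise ValueError("position lists differ in length")
    return max((abs(p - q)
                for a, b in zip(sorted(pos_a), sorted(pos_b))
                for p, q in zip(a, b)),
               default=0.0)


# Two hypothetical dumps holding the same atoms in a different order.
dump_a = "1.0 2.0 3.0\n0.5 0.1 0.2\n"
dump_b = "0.5 0.1 0.2\n1.0 2.0 3.0\n"

print(max_sorted_difference(parse_positions(dump_a), parse_positions(dump_b)))
```

A result of zero means the two runs started do_force() from identical
positions, apart from atom ordering.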


The differences also show up when changing from PME to a plain cutoff.
Snippets from the .xvg output generated from the .edr file:
@    xaxis  label "Time (ps)"
@ s0 legend "Coulomb-14"
@ s1 legend "Coulomb (SR)"

With continuation=yes, GMX_DD_SINGLE_RANK=1
    0.000000  37799.296875  -70127.710938
    0.002000  37800.109375  -70292.289062

With continuation=yes, GMX_DD_SINGLE_RANK=0
    0.000000  37799.296875  -70173.804688
    0.002000  37800.085938  -70262.812500

With continuation=no, the .xvg energy results do not depend on the setting of
GMX_DD_SINGLE_RANK.
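
The two continuation=yes snippets above can be compared column by column with a
small script. This is only a sketch: parse_xvg is a hypothetical helper that
skips the "#" and "@" header lines of an .xvg file, and the data rows are
pasted in from above rather than read from disk.

```python
def parse_xvg(text):
    """Return the numeric rows of .xvg text, skipping '#' and '@' lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#@":
            continue
        rows.append([float(v) for v in line.split()])
    return rows


# Columns: time (ps), Coulomb-14, Coulomb (SR)
single_rank_1 = """@    xaxis  label "Time (ps)"
@ s0 legend "Coulomb-14"
@ s1 legend "Coulomb (SR)"
    0.000000  37799.296875  -70127.710938
    0.002000  37800.109375  -70292.289062
"""
single_rank_0 = """@    xaxis  label "Time (ps)"
@ s0 legend "Coulomb-14"
@ s1 legend "Coulomb (SR)"
    0.000000  37799.296875  -70173.804688
    0.002000  37800.085938  -70262.812500
"""

for a, b in zip(parse_xvg(single_rank_1), parse_xvg(single_rank_0)):
    # Print time, then the difference in Coulomb-14 and in Coulomb (SR).
    print(f"t={a[0]:.3f}  dCoul14={a[1] - b[1]:.6f}  dCoulSR={a[2] - b[2]:.6f}")
```

At t = 0 the Coulomb-14 values are identical, while Coulomb (SR) differs by
about 46.09 between the two settings.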

The GROMACS results changed with commit 01b234357f0d68d5 "Also use DD partitioning in serial".
Coulomb energy for the AKE example using plain cutoffs:
Commit 01b234357f0d68d5: -70127.710938
Commit just before that: -70173.804688

Maybe someone has an idea what could be the cause of this behavior or what
I could do to shed some more light on this.

I can file a bug report and attach my test system if anyone is interested in
making some further checks.

Best,
  Carsten



--
Dr. Carsten Kutzner
Max Planck Institute for Multidisciplinary Sciences
Theoretical and Computational Biophysics
Am Fassberg 11, 37077 Göttingen, Germany
Tel. +49-551-2012313, Fax: +49-551-2012302
http://www.mpinat.mpg.de/grubmueller/kutzner





