[gmx-users] calculate cross correlation function in hydrogen bond dynamics, Luzar and Chandler analysis (gmx hbond)

Tue Oct 11 06:39:55 CEST 2016

Hi,

Probably this email should better go to the developer list, but I just
want some hydrogen bond experts to help me. I want to make sure my
understanding and modification to gmx_hbond.c is correct.

I found two issues in gmx_hbond.c. The first one is seg fault in
crosscorr.c (or .cpp in 2016 version). I am pretty sure it is a bug
and reported here with fixes. https://redmine.gromacs.org/issues/2057
The second one is the normalization factor of the cross correlation
function, which I am not quite sure.

I am following Luzar 2000 paper. (
http://scitation.aip.org/content/aip/journal/jcp/113/23/10.1063/1.1320826
  ) In the paper, the normalization factor of crosscorrelation
function is defined as <h> (actually 1/<h>), which is the same with
that of the autocorrelation function of h. (ght[] is the
crosscorrelation function in gmx_hbond.c; ht[] is the autocorrelation
function of h) But, in gmx_hbond.c, a different normalization factor
is used for ght[] in normalizeACF(). (it is nhb, total number of
h-bonds in all frames) This change was introduced in a bug fix. (
https://redmine.gromacs.org/issues/960   )  The normalization factor
of the autocorrelation function of h used in normalizeACF() is ht[0],
the value of the autocorrelation function at t=0. (BTW, this value is
not exactly the same as <h>, I think, there is a factor of N_hydrogen
difference. (?not very sure) Or in another way, ht[0] is the average
number of hbonds in one frame. But this difference doesn't matter if
both correlation functions have this factor)   nhb, the normalization
factor for crosscorrelation function ght[], is actually
nhb=N_frames*ht[0]. There is an additional N_frames factor here (total
number of frames in the analyzed trajectory) If we look at the old
crosscorrelation code in crosscorr.c, there is an (N-i) factor missing
for the ith output data, where N is the length of input data and i is
the index of output array. So, overall, the net effect is the output
ght[] has an additional factor (N-i)/N multiplied. This means that for
time near zero, the crosscorrelation function is very close to the
expected data, but if time gets close to dt*N_frames (the tail), the
crosscorrelation function is only about 1/N of the expected data. I
found this patch (   https://gerrit.gromacs.org/#/c/3471/   ) wanted
to fix this issue, though I don't think it does. I guess that is the
whole story....

At this time, I think using ht[0] as the normalization factor for both
ht[] and ght[] and adopting my fix for the crosscorr.c (
https://redmine.gromacs.org/issues/2057   ) should fix this issue.

Please comment whether I am right or not. I am still not very clear
how Luzar defined <h> in the paper. In Luzar and Chandler 1996 nature
paper, (   http://www.nature.com/nature/journal/v379/n6560/abs/379055a0.html
  ) they said 0.5*N(N-1)<h> is the average h-bond number in fluids. I
don't know how operator <> is defined here.... Just hope it does not
matter and can be cancelled out, otherwise the backward rate will be
very wired.

Thanks,
Jiannan Liu