[gmx-users] Cross correlation/Normalised covariance calculation problems

Thu Jul 3 18:25:56 CEST 2014

Dear all,

I've got a discrepancy between cross correlations that I previously calculated using Ran Freidman's amended version of g_covar and GROMACS version 3.3.3, and those I've done myself from using gromacs 4.6.4 and MATLAB that I'm bit baffled by. (I no longer have access to GROMACS 3.3.3 and don't really want to reinstall it.).

The discrepancy is that with Ran's code using the same trajectory (this was a sanity check for further calculations) I got about 1500 correlations above an absolute cross correlation coefficient of 0.7, whereas I get about 750 correlations using my code.

I only have the log file for Ran's GROMACS 3.3.3 g_covar output and this has the same value for the trace of covariance matrix as for the GROMACS 4.6.4 output, and the eigenvalues appear to be the same for both calculations. And I have the xpm plot of the cross correlation matrix from Ran's output, which visually looks the same as what I'm plotting in MATLAB for the whole cross correlation map from -1 to 1, but not for when I've filtered to correlations |r| >0.7. (I originally amended Ran's file to change the threshold from 0.5 to 0.7).

What I've done (from reading Tsjerk's messages in the archive) is output an ascii file from g_covar then reconstructed the covariance matrix and converted it to normalised covariance as per:

C_ij = <x_i * x_j> / sqrt ( <x_i ^2> * <x_j^2> )

And this was where I got lost following Ran's code. What I did next was to extract the 3 parts of the 3N matrix I got from the equation above, and sum them together and divide by 3 to get a N x N matrix and then  I've looped through to extract where |r|>0.7. Ran's code appears to output one third of the entire matrix, I may have this wrong, but I'm not au fait with C and was struggling with the code. Does what I've done make sense? I can post my code or email direct if that helps.

To compare the first 10 outputs, this is my MATLAB output:

Atom     Atom     Correlation
4              1              0.91
5              2              0.85
6              3              0.78
7              1              0.80
7              4              0.93
8              2              0.75
8              5              0.89
9              6              0.81
10           1              0.79
10           4              0.86

And this is the first 10 outputs from previous calculation using Ran's code:

Atom     Atom     Correlation
1              2              0.84
2              3              0.88
3              4              0.87
4              5              0.85
4              6              0.75
5              6              0.85
5              7              0.72
6              7              0.86
6              8              0.73
7              8              0.87

And as I mentioned, MATLAB gives me about 750 correlations, whilst Ran's gives me about 1500. So I'm a bit baffled, if anyone has an ideas as to where I'm going wrong? When I map them onto my protein, both solutions look plausible, but I'm not sure which is correct!

Many thanks,

Alistair