[gmx-users] calculating block average and error estimate by g_analyze tool in gromacs

mircial at sjtu.edu.cn mircial at sjtu.edu.cn
Tue May 6 02:21:38 CEST 2014

Hi All:

I am using g_analyze to analyze my data and I want to do error estimate by blocking averaging. However, I encountered some problems of understanding the output results.

THe command i used was: g_analyze -f input.xvg -ee output.xvg

The input files are like this:

0       6.37011          6.68738            0.6656147846
2       6.331272        6.646607        0.6575230749
4       6.347109        6.663233        0.6608166585
6       6.400918        6.719722        0.672068586
8       6.416399        6.735974        0.6753233881
.......
.......

Then, the program give the the following output:
---------------------------------------------------------------------------
std. dev.    relative deviation of
standard       ---------   cumulants from those of
set      average       deviation      sqrt(n-1)   a Gaussian distribition
cum. 3   cum. 4
SS1   6.295666e+00   5.914010e-02   3.740349e-04       0.079   -0.113
SS2   6.609228e+00   6.208563e-02   3.926640e-04       0.079   -0.113
SS3   6.502057e-01   1.222212e-02   7.729947e-05       0.093   -0.109

Back Off! I just backed up 303-parm1-block.xvg to ./#303-parm1-block.xvg.1#
Set   1:  err.est. 0.0279775  a 0.60107  tau1 382.424  tau2 13449.2
Set   2:  err.est. 0.0293711  a 0.601071  tau1 382.426  tau2 13449.3
Set   3:  err.est. 0.00570236  a 0.597252  tau1 377.834  tau2 12952.3
----------------------------------------------------------------------------------------

My first question is the err.est. value is the error estimated by block averaging right?
Generally speaking, the standard deviate underestimated the "real" error bars of the data because of the auto-correlation between the data. And this is the reason we need to do block averaging to estimate the error of the data, right?
But, why in my results, the standard deviation values are larger than the err.est. values?

THe output file is like this:

----------------------------------------------------
@    title "Error estimates"
@    xaxis  label "Block size (time)"
@    yaxis  label "Error estimate"
@TYPE xy
@ subtitle "using block averaging, total time 50000 (25001 points)"
@ view 0.15, 0.15, 0.75, 0.85
@ legend on
@ legend box on
@ legend loctype view
@ legend 0.78, 0.8
@ legend length 2
@ legend string 0 "av 6.295666"
@ legend string 1 "ee 0.0279775"
2 0.000374035 0.000373804
4 0.000527787 0.000528405
6 0.000644719 0.000646844
8 0.000742475 0.00074647
.....
---------------------------------------------------------------
TO my understanding the second column is the  std. dev./sqrt(m-1), where std.dev. is the standard deviation of the blocks and m is the number of the blocks. THe third column is the analytical block average curve. Do I understand the result right?

Thanks.

Regards,
RXG