[gmx-developers] Bug in g_cluster/gmx_cluster

Roman Affentranger roman.affentranger at bc.biol.ethz.ch
Wed Jun 23 11:47:30 CEST 2004

Dear gromacs developers

(All line numbers refer to gmx_cluster.c of gmx-3.2.1).
The cluster centers, defined as the structure with the lowest RMSD to all 
cluster members, are not correctly calculated in gmx_cluster (of 
gmx-3.2.1). The problem is that the function 'analyze_clusters', which 
determines the cluster centers, is called (line 1376) after the function 
'plot_clusters' (line 1364), where one half of the RMSD matrix is replaced 
by the cluster indices. Thus, not only the cluster centers are incorrectly 
determined, but also the average RMSD values reported in the log-file 
(average RMSD of all pairs of structures in cluster as well as average RMSD 
of all cluster members to cluster center).

As a solution to the problem one can simply shift the call of 
'plot_clusters' right after the call of 'analyze clusters'.

Another solution would be to replace the following line of the function 
'analyze_clusters' (line 846)

	r += rmsd[structure[i1]][structure[i]];


	if (i<i1)
	   r += rmsd[structure[i]][structure[i1]];
	   r += rmsd[structure[i1]][structure[i]];

Apart from that, I think that the clustering method denoted "linkage" does 
not represent a full linkage, but rather a single linkage clustering.

Yet another (though rather unimportant) detail, this time referring to 
gmx_rms.c: when calculating  an RMSD-matrix with -m, the reported average 
RMSD value is calculated incorrectly (except when using two trajectories 
with -f2).

line 672:	rmsd_avg /= tel_mat * tel_mat2;

should read	if (bFile2)
		   rmsd_avg /= tel_mat * tel_mat2;
		   rmsd_avg /= ( tel_mat * (tel_mat2 - 1) ) / 2;

Best regards


Roman Affentranger
Swiss Federal Institute of Technology Zurich
Institute of Biochemistry
Schafmattstrasse 18
ETH Hoenggerberg HPM G 9.3
8093 Zurich
Phone: +41 1 632 31 39

More information about the gromacs.org_gmx-developers mailing list