# [gmx-users] creating representative structures

Shyno Mathew sm3334 at columbia.edu
Tue Mar 1 00:52:57 CET 2016

```Dear all,

I was able to write a Tcl script to do cluster analysis; here I am using
the gromos method.
However, my script is giving slightly different results! I am testing my
script on a smaller trajectory (8 frames).
In the gromos method, the structure with the greatest number of neighbors
is considered as the center of the cluster.
What if more than one structure has the highest number of neighbors? For
example, in my calculation, frames 4 and 6 both have 6 neighbors (the
maximum neighbor count).
When I used g_cluster, the output file shows the cluster center as frame 6.
Does that mean that if more than one structure has the maximum number of
neighbors, I can choose the cluster center among them at random?

thanks,
Shyno

On Fri, Feb 26, 2016 at 11:47 AM, Shyno Mathew <sm3334 at columbia.edu> wrote:

> Dear Prof. Mark and Prof. Stephane,
>
> Thanks so much for your suggestions.
>
> thanks,
> Shyno
>
> On Mon, Feb 8, 2016 at 12:23 PM, Shyno Mathew <sm3334 at columbia.edu> wrote:
>
>> Dear all,
>>
>>
>> I have a few questions regarding creating representative structures. For
>> simplicity, let’s say I have a trajectory of 5 frames:
>>
>> 1.       g_rmsf: After reading previous posts, here is what I
>> understood. The average structure calculated using g_rmsf (by specifying
>> -ox) is literally the average of the x, y, z coordinates of each atom over
>> all 5 frames in my case. Would energy minimizing this averaged structure
>> give a meaningful structure?
>>
>> 2.       g_cluster: Here I am using the gromos method. I read the
>> reference paper (Daura et al.). Using the gromos method and specifying just
>> -cl (not -av), I get the middle structure of each cluster. Let’s say I get
>> 2 clusters using an RMSD cutoff of 0.16. Should the representative
>> structures of these two clusters (obtained in out.pdb) correspond exactly
>> to two frames in my original trajectory (the one with 5 frames)?
>>
>> 3.       My final question is about how exactly g_cluster works;
>> here is what I understand from Daura et al.
>>
>> If I use g_cluster with the gromos method, the code will look for the
>> neighbors of each frame within the specified cutoff.
>>
>> Assume the first time the code finds the following:
>>
>> frame 0 has two neighbors: frames 2, 3 within cutoff
>>
>> frame 1 has three neighbors: frames 2, 3, 4 within cutoff
>>
>> frame 2 has four neighbors: frames 0, 1, 3, 4 within cutoff
>>
>> frame 3 has three neighbors: frames 0, 1, 2 within cutoff
>>
>> frame 4 has two neighbors: frames 1, 2 within cutoff
>>
>>
>>
>> Since frame 2 has the highest number of neighbors, it is considered the
>> cluster center, and this frame along with its neighbors is removed. The
>> same calculation would be repeated on the remaining frames if I had more
>> frames.
>>
>>
>>
>>
>> Sincerely,
>>
>> Shyno
>>
>> --
>> Shyno Mathew
>> PhD Candidate
>> Department of Chemical Engineering
>> Office of Graduate Student Affairs
>> The Fu Foundation School Of Engineering and Applied Science
>> Columbia University
>>
>

--
Shyno Mathew
PhD Candidate
Department of Chemical Engineering