[gmx-users] g_anaeig -proj

Mon Jun 28 12:42:45 CEST 2010

Hi Chris, Carla,

Sorry I didn't reply before. To understand the projections, first
consider the following. Take a single atom with three coordinates
(x,y,z). These coordinates are the projections onto a set of (three,
Cartesian) axes. If you consider, e.g., the projection of the
coordinates onto the x-axis, you're left with a single number in
units of distance (nm in GROMACS). Now let's say your coordinates are
(1,1,1). If you then take the axis along x+y+z (normalized to unit
length), rather than one of the original ones, and project your
coordinates onto it, you get a projection of sqrt(3) nm. To describe
all of Cartesian space, you need two more axes, orthogonal to x+y+z
and to each other, but for this particular point the projections onto
both will be zero.
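
As a minimal sketch of the single-atom example, the projections can be
computed as plain dot products with unit-length axes. The two extra axes
used here are just one possible choice (any pair orthogonal to x+y+z and
to each other would do), and the helper names are only for illustration.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

point = [1.0, 1.0, 1.0]  # coordinates of the atom, in nm

# New axis along x+y+z, scaled to unit length.
axis1 = normalize([1.0, 1.0, 1.0])
# Two further axes (an arbitrary but valid choice), orthogonal to axis1
# and to each other.
axis2 = normalize([1.0, -1.0, 0.0])
axis3 = normalize([1.0, 1.0, -2.0])

# One projection (in nm) per axis.
projections = [dot(point, axis) for axis in (axis1, axis2, axis3)]
print(projections)
```

The first projection comes out as sqrt(3), roughly 1.732 nm, and the
other two are zero, so in the new axis system the point is fully
described by a single number.
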
Now for a complete configuration, it's a bit more complicated. It's
best to think of it as a point in 3N-dimensional space, having specific
projections or scores on 3N mutually orthogonal axes. These
projections still have the same units (nm). The aim of PCA is to find
new axes in this space that better describe a distribution of these
points, i.e. a set of configurations. Each configuration has one
projection on each axis. If you select only one eigenvector, the
projection gives a single number per configuration.
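
A small sketch of the same idea for whole configurations, under some
assumptions: the three two-atom frames and the axis below are made up
for illustration (a real axis would be an eigenvector from the
covariance analysis), and the projection is taken on the deviation from
the average structure.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Three hypothetical configurations of a two-atom system, each flattened
# to 3N = 6 numbers (x1, y1, z1, x2, y2, z2), in nm.
frames = [
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.2, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.8, 0.0, 0.0],
]

# Average structure, coordinate by coordinate.
n_frames = len(frames)
mean = [sum(frame[i] for frame in frames) / n_frames for i in range(6)]

# Hypothetical unit-length axis in the 6-dimensional space: the two
# atoms move in opposite directions along x.
s = 1.0 / math.sqrt(2.0)
axis = [-s, 0.0, 0.0, s, 0.0, 0.0]

# One number (in nm) per configuration: the score on the selected axis.
projections = []
for frame in frames:
    deviation = [x - m for x, m in zip(frame, mean)]
    projections.append(dot(deviation, axis))

print(projections)
```

Each frame yields exactly one number for the selected axis, which is
what ends up as one data point per frame in a projection plot.
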
Filtering is the next step. If you have a projection of a
configuration onto a selected number of components, you can consider
that as having the projections on all other axes set to zero. In other
words, the variance or noise associated with these other axes is
removed. If you project these projections back to the original space,
then you've effectively filtered out the variance/noise.
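
The filtering step can be sketched in the same way. This is only an
illustration with made-up frames and a made-up unit-length axis standing
in for the selected eigenvector: each frame is reduced to its score on
that axis and then rebuilt as average structure plus score times axis,
which is equivalent to setting the projections on all other axes to
zero.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical two-atom frames (x1, y1, z1, x2, y2, z2), in nm:
# a large motion of atom 2 along x plus small "noise" in y of atom 1.
frames = [
    [0.0, 0.00, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.01, 0.0, 1.2, 0.0, 0.0],
    [0.0, -0.01, 0.0, 0.8, 0.0, 0.0],
]

n_frames = len(frames)
mean = [sum(frame[i] for frame in frames) / n_frames for i in range(6)]

# Hypothetical selected axis (unit length): atom 2 moving along x.
axis = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]

filtered = []
for frame in frames:
    deviation = [x - m for x, m in zip(frame, mean)]
    score = dot(deviation, axis)  # projection on the selected axis
    # Back-projection: only the selected axis contributes, so all
    # motion along the other axes is dropped.
    filtered.append([m + score * a for m, a in zip(mean, axis)])

for frame in filtered:
    print(frame)
```

In the filtered frames the x coordinate of atom 2 still varies as
before, while the small y fluctuation of atom 1 is replaced by its
average value.
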

I hope that clarifies a bit.

> ###
>
> Tsjerk: First, great tutorial, I hadn't seen this before. Second, you might
> avoid this type of confusion by mentioning during the command:

Thanks :)

> g_anaeig -s ../topol.tpr -f ../traj.xtc -v eigenvectors.trr -eig
> eigenvalues.xvg -proj proj-ev1.xvg -extr ev1.pdb -rmsf rmsf-ev1.xvg -first 1
> -last 1
>
> that you are actually doing a few things at once... or perhaps actually
> break the command apart into separate g_anaeig -proj and g_anaeig -extr
> calls.
>
> ###

Agreed. It's probably best to have the different operations split.

> All: might it be a good idea to get the html addresses of nice tutorials
> into the output messages of the analysis tools?

That would only make sense if the tutorials are adopted and put on the
GROMACS site. Otherwise the tutorials/sites are probably too volatile.
Adoption raises questions regarding responsibility and maintenance
though.

Cheers,

Tsjerk

-- 
Tsjerk A. Wassenaar, Ph.D.

post-doctoral researcher
Molecular Dynamics Group
Groningen Institute for Biomolecular Research and Biotechnology
University of Groningen
The Netherlands