[gmx-users] Re: PCA eigenvalue normalization
Tsjerk Wassenaar
tsjerkw at gmail.com
Sat Apr 8 10:51:55 CEST 2006
Hi Tyler,
Non-centrality indeed arises from taking the deviation from a reference
structure which is not the average structure (including the "null
structure"). You hadn't actually mentioned that you took the deviation from
a reference, so I had just decided that my remark was inappropriate... until
this reply ;) You would be better off taking the deviation from the average.
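
A minimal sketch of the difference, using synthetic data rather than a real trajectory (the array sizes, noise level and the 0.5 nm offset of the hypothetical reference are all assumptions made for illustration). Taking deviations from a reference that is not the average adds the squared offset between average and reference to the total "variance", which inflates the leading eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a fitted trajectory: 500 frames, 5 atoms, x/y/z in nm.
n_frames, n_atoms = 500, 5
true_mean = rng.normal(size=n_atoms * 3)
frames = true_mean + 0.1 * rng.normal(size=(n_frames, n_atoms * 3))

# Hypothetical reference structure that is not the trajectory average.
reference = true_mean + 0.5


def second_moment(x, origin):
    """Second-moment matrix of the frames about `origin`, averaged over frames."""
    d = x - origin
    return d.T @ d / x.shape[0]


average = frames.mean(axis=0)
central = second_moment(frames, average)        # deviation from the average
non_central = second_moment(frames, reference)  # deviation from the reference

# Squared distance between the average and the reference structure.
offset = average - reference
extra = offset @ offset

print("total variance about the average :", np.trace(central))
print("total 'variance' about reference :", np.trace(non_central))
print("squared offset |average - ref|^2 :", extra)
print("largest eigenvalue, central      :", np.linalg.eigvalsh(central)[-1])
print("largest eigenvalue, non-central  :", np.linalg.eigvalsh(non_central)[-1])
```

The second trace equals the first plus the squared offset, so any per-atom standard deviation derived from the non-central matrix mixes real fluctuation with the distance to the reference.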
In some cases it does make sense to fit only on a subset of the atoms. This
exaggerates the motions of the rest of the atoms with respect to the group
of atoms used for fitting. In your case, this means both stretching motions
as well as bending motions. If your main interest is stretching, you'll
run into the problem that the bending motions have a linear projection on
the stretching mode, and you'll actually find a combination of these for the
first (?) eigenvector.
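
Fitting on a subset can be sketched as follows. This is a generic least-squares (Kabsch) superposition written with NumPy, not the GROMACS implementation; the function names and the synthetic coordinates are hypothetical. The rotation and translation are determined from the fit group only and then applied to every atom:

```python
import numpy as np


def kabsch_rotation(mobile, target):
    """Proper rotation R minimising |mobile @ R.T - target| for centred (n, 3) arrays."""
    h = mobile.T @ target
    u, _, vt = np.linalg.svd(h)
    # Flip the last axis if needed so that R is a rotation, not a reflection.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    return vt.T @ np.diag([1.0, 1.0, d]) @ u.T


def fit_on_subset(frame, reference, fit_idx):
    """Superimpose `frame` on `reference` using only the atoms in `fit_idx`.

    The transform found for the fit group is applied to all atoms.
    """
    frame_centre = frame[fit_idx].mean(axis=0)
    ref_centre = reference[fit_idx].mean(axis=0)
    rot = kabsch_rotation(frame[fit_idx] - frame_centre,
                          reference[fit_idx] - ref_centre)
    return (frame - frame_centre) @ rot.T + ref_centre


# Synthetic check: a rigidly rotated and shifted copy of a 10-atom structure.
rng = np.random.default_rng(1)
reference = rng.normal(size=(10, 3))
angle = 0.7
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle),  np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
frame = reference @ rot_z.T + np.array([1.0, -2.0, 0.5])

fitted = fit_on_subset(frame, reference, [0, 1, 2])
print("max deviation after fit:", np.abs(fitted - reference).max())
```

For a rigid copy the fit recovers the reference for all atoms. In a flexible chain, any small reorientation of the fit group is carried along the whole chain, so atoms far from the fit group pick up large displacements.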
I don't know whether PCA will be useful to compare flexibilities of chains
of different lengths. There you'll run into the problem that the
conformational spaces are completely different. You might be able to look at
the flexibilities of the stretch of amino acids which are comparable (in
position, not necessarily in terms of the side chain) and of equal number.
Otherwise you could possibly do something with the determinant of the
covariance matrix, which is a measure of the volume of conformational space
sampled.
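
One way this could be tried, as a sketch only: the determinant equals the product of the eigenvalues, so its logarithm is the sum of the log eigenvalues. A covariance matrix of fitted Cartesian coordinates can have eigenvalues at or near zero, which makes the full determinant collapse. Restricting the sum to the `n_modes` largest eigenvalues is an assumption made here, as are the function name and the toy matrix:

```python
import numpy as np


def log_volume(cov, n_modes):
    """Log of the product of the `n_modes` largest eigenvalues of `cov`.

    With n_modes equal to the matrix dimension this is the log-determinant.
    """
    eigvals = np.linalg.eigvalsh(cov)[::-1][:n_modes]  # descending order
    return float(np.sum(np.log(eigvals)))


# Toy covariance matrix with known eigenvalues 4, 1 and 0.25 (nm^2).
cov = np.diag([4.0, 1.0, 0.25])

print("log-determinant, all modes :", log_volume(cov, 3))
print("log-volume, two largest    :", log_volume(cov, 2))
```

To compare chains of different lengths, the same `n_modes` would have to be used for every system, and the values remain comparable only to the extent that the selected subspaces describe comparable motions.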
If more springs to mind, I'll let you know.
Hope it helps,
Tsjerk
On 4/7/06, Tyler Luchko <tluchko at ualberta.ca> wrote:
>
> Hi Tsjerk,
>
> The system I am working on is the C-terminal tails of tubulin. The
> structure of the tails is missing in all of the crystal structures of
> tubulin, likely due to the flexibility of the tails. Since tubulin
> is quite large, a heterodimer of almost 900 residues, it is not
> really possible for us to adequately sample the tails' configuration
> space by simulating the whole protein. What we have done is to
> simulate nine different isotypes of the C-terminal tail fragment
> (9-26 residues) using constant pressure REMD.
>
> Among the properties we are interested in is quantifying and
> comparing the flexibility of tails from the different isotypes. I
> have performed PCA twice on this: first fitting all the C-alphas and
> then fitting the backbone atoms of the first three residues to a
> reference structure. The motivation for the latter fitting procedure is
> that the fragments would be anchored to tubulin at the fragment's
> N-terminus. This is to look at how the fragments behave in the absence
> of tubulin's potential but as if they are still anchored. When
> performing the PCA we are fitting to a reference structure but still
> using the deviations from the mean and not from the reference
> structure. To compare the flexibility between the isotypes we had
> hoped to normalize the eigenvalues.
>
> I did note that I was inadvertently using the deviation from the
> reference structure rather than from the mean (-ref). Is this what
> you meant by a non-central covariance matrix? Using the deviation
> from the mean I obtained standard deviations more in line with what I
> originally expected. These are 0.8 nm at most if I normalize the
> variance by the number of atoms.
>
> My two questions are then:
>
> 1) does it make sense to fit the first three residues, as I described
> above, for the purposes of PCA?
>
> 2) is it possible to compare the relative flexibilities of the
> fragments using PCA?
>
> Thank you,
>
> Tyler
>
> > Hi Tyler,
> >
> > First, what question are you trying to answer? Your different peptides
> > have completely different conformational spaces, simply because of the
> > differences in degrees of freedom, so you can't compare the PCA results
> > from one system with the other. That is, unless you pick a subset from
> > each system, consisting of comparable particles, for which you can
> > safely assume that under equal circumstances they should give the same
> > eigenvectors and -values. From that assumption, you could try to
> > assess whether the behaviour differs between the systems.
> >
> > Also, since you're using only the first three residues for fitting, you
> > generate a non-central covariance matrix. That would be useful if you
> > would like to exaggerate certain motional features, right, but it makes
> > the interpretation of PCA results difficult. If it's for the purpose of
> > comparing things, I wouldn't go there if I were you. The non-centrality
> > is also the reason that your standard deviations end up high. You're not
> > subtracting the mean, so your standard deviation is sqrt( sum(x^2)/N )
> > rather than sqrt( sum((x-average)^2)/N ). Is this really what you want
> > to do? What are you expecting to get from this? I'd like to know the
> > question you're trying to answer and your assumptions on the nature of
> > the data...
> >
> > Cheers,
> >
> > Tsjerk
> >
> > On 4/7/06, Tyler Luchko <tluchko at ualberta.ca> wrote:
> >>
> >> Hello,
> >>
> >> Thank you for the previous responses. I still have some questions
> >> about the eigenvalues however.
> >>
> >> I should note that the frames of my trajectory have been fit to a
> >> reference structure using the backbone atoms of the first three
> >> residues. This is because the peptide is a fragment of a much larger
> >> protein.
> >>
> >> 1) If I wish to compare the eigenvalues of several peptides of
> >> different lengths how would I normalize the eigenvalues? Do I simply
> >> divide by the number of atoms used in the calculation?
> >>
> >> 2) If the eigenvalue represents the sum of the variances for each
> >> particle along the eigenvector then dividing the eigenvalue by the
> >> number of atoms used in the calculation should be the average
> >> variance. Likewise, the square root of this should be the average
> >> standard deviation per atom. In my case, the first eigenvector is a
> >> stretching in the length of the peptide. Shouldn't the average
> >> standard deviation per atom along this stretching motion be smaller
> >> than the standard deviation in the length of the entire peptide, or
> >> at least smaller than the extended length of the peptide?
> >>
> >> Thank you,
> >>
> >> Tyler
> >>
> >>> Hi Tyler,
> >>>
> >>> Note that the eigenvalue represents the sum of the variances for each
> >>> particle along the associated eigenvector. That seems quite reasonable
> >>> to me.
> >>>
> >>> Tsjerk
> >>>
> >>> On 4/6/06, Tyler Luchko <tluchko at ualberta.ca> wrote:
> >>>>
> >>>> Hello,
> >>>>
> >>>> I have performed PCA analysis, without mass weighting, on a peptide
> >>>> using g_covar and g_anaeig. The first principal component generally
> >>>> corresponds to the stretching of the peptide. I understand that each
> >>>> eigenvalue represents the variance in the motion along the associated
> >>>> eigenvector. However, the square root of the variance for the first
> >>>> eigenvalue is ~20 nm while the maximum extended length of any peptide
> >>>> is ~3 nm. I have tried normalizing the eigenvalues by the number of
> >>>> atoms used for the analysis (73) but this gives the standard
> >>>> deviation of the motion to be ~2.2 nm, still much too large. I would
> >>>> like to know how to normalize the eigenvalues to obtain reasonable
> >>>> standard deviations from the eigenvalues.
> >>>>
> >>>> Thank you,
> >>>>
> >>>> Tyler
> >>>>
> >>>>
> >>>> ________________________________________________________________
> >>>> (_ Tyler Luchko Ph.D. Candidate _)
> >>>> _) Department of Physics University of Alberta (_
> >>>> (_ Edmonton, Alberta, Canada _)
> >>>> _) 780-492-1063 tluchko at ualberta.ca (_
> >>>> (________________________________________________________________)
> >>>>
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> gmx-users mailing list gmx-users at gromacs.org
> >>>> http://www.gromacs.org/mailman/listinfo/gmx-users
> >>>> Please don't post (un)subscribe requests to the list. Use the
> >>>> www interface or send it to gmx-users-request at gromacs.org.
> >>>> Can't post? Read http://www.gromacs.org/mailing_lists/users.php
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>>
> >>> Tsjerk A. Wassenaar, M.Sc.
> >>> Groningen Biomolecular Sciences and Biotechnology Institute (GBB)
> >>> Dept. of Biophysical Chemistry
> >>> University of Groningen
> >>> Nijenborgh 4
> >>> 9747AG Groningen, The Netherlands
> >>> +31 50 363 4336
--
Tsjerk A. Wassenaar, M.Sc.
Groningen Biomolecular Sciences and Biotechnology Institute (GBB)
Dept. of Biophysical Chemistry
University of Groningen
Nijenborgh 4
9747AG Groningen, The Netherlands
+31 50 363 4336