[gmx-users] PCA problems

Tsjerk Wassenaar tsjerkw at gmail.com
Wed Jun 8 18:10:19 CEST 2016


Hi James,

That's silly! Ambiguous means that the same structure can have multiple
solutions in a fit. The fit to a single reference structure (with more than
three atoms) is never ambiguous. It cannot be, by definition!

Now if you have two reference structures at hand, and they differ (quite a
bit) in conformation, then fitting on one may give a different ensemble
from fitting on the other. The fit is not consistent between references,
and the inconsistency is worse for flexible molecules. Different ensembles
will mean different correlations, thus giving different principal components.
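
As a rough illustration of that reference dependence, here is a minimal 2D
sketch in plain Python. The four-atom coordinates and the helper names
(center, fit, msd) are made up for the example, and the closed-form 2D
rotation only stands in for the 3D least-squares fit that gmx performs. It
fits the same two frames on two different references and compares the
deviation between the fitted frames.

```python
import math

def center(xy):
    """Shift (x, y) points so their centroid sits at the origin."""
    n = len(xy)
    cx = sum(x for x, _ in xy) / n
    cy = sum(y for _, y in xy) / n
    return [(x - cx, y - cy) for x, y in xy]

def fit(mobile, ref):
    """Least-squares superposition of mobile onto ref (2D, closed form)."""
    p, q = center(mobile), center(ref)
    dot = sum(px * qx + py * qy for (px, py), (qx, qy) in zip(p, q))
    cross = sum(px * qy - py * qx for (px, py), (qx, qy) in zip(p, q))
    t = math.atan2(cross, dot)  # rotation angle that minimizes the RMSD
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y, s * x + c * y) for x, y in p]

def msd(a, b):
    """Mean squared atomic deviation between two already fitted frames."""
    return sum((ax - bx) ** 2 + (ay - by) ** 2
               for (ax, ay), (bx, by) in zip(a, b)) / len(a)

# Hypothetical conformations of a flexible four-atom chain.
ref_a = [(-1.5, 0.0), (-0.5, 0.0), (0.5, 0.0), (1.5, 0.0)]    # extended
frame_b = [(-1.5, 0.0), (-0.5, 0.0), (0.5, 0.0), (0.5, 1.0)]  # bent at one end
frame_c = [(-0.5, 1.0), (-0.5, 0.0), (0.5, 0.0), (1.5, 0.0)]  # bent at the other

# The same two frames, fitted once with ref_a and once with frame_b as reference.
msd_on_a = msd(fit(frame_b, ref_a), fit(frame_c, ref_a))
msd_on_b = msd(fit(frame_b, frame_b), fit(frame_c, frame_b))

print(f"MSD between fitted frames, reference A: {msd_on_a:.4f}")
print(f"MSD between fitted frames, reference B: {msd_on_b:.4f}")
```

The two values differ, so the atomic displacements that enter the
covariance matrix depend on which reference was chosen.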

Progressive fitting does not solve the problem. In fact, progressive
fitting _is_ ambiguous. Let's say we have a series of conformations ABCAC.
Then we fit C once to B, which was fitted to A, and later we have C fitted
to A, which was fitted to the previous C. Note that in practice the
situation will be much worse as we can approach a certain configuration
from many sides. Using B as reference will yield a fit that is different
from using A as reference, so the structure C will have two different
orientations in the resulting ensemble. Hence, the fit is ambiguous.
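
The path dependence can be sketched with a minimal 2D example in plain
Python. The coordinates and helper names are hypothetical, and a closed-form
2D rotation stands in for the 3D least-squares fit. Conformation C is
oriented once through the chain A, B, C and once by fitting it directly
on A.

```python
import math

def center(xy):
    """Shift (x, y) points so their centroid sits at the origin."""
    n = len(xy)
    cx = sum(x for x, _ in xy) / n
    cy = sum(y for _, y in xy) / n
    return [(x - cx, y - cy) for x, y in xy]

def fit_angle(mobile, ref):
    """Angle (radians) of the least-squares rotation of mobile onto ref."""
    p, q = center(mobile), center(ref)
    dot = sum(px * qx + py * qy for (px, py), (qx, qy) in zip(p, q))
    cross = sum(px * qy - py * qx for (px, py), (qx, qy) in zip(p, q))
    return math.atan2(cross, dot)

def fit(mobile, ref):
    """Return mobile centered at the origin and rotated onto ref."""
    t = fit_angle(mobile, ref)
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y, s * x + c * y) for x, y in center(mobile)]

# Hypothetical conformations of a flexible four-atom chain.
conf_a = [(-1.5, 0.0), (-0.5, 0.0), (0.5, 0.0), (1.5, 0.0)]  # extended
conf_b = [(-1.5, 0.0), (-0.5, 0.0), (0.5, 0.0), (0.5, 1.0)]  # bent at one end
conf_c = [(-0.5, 1.0), (-0.5, 0.0), (0.5, 0.0), (1.5, 0.0)]  # bent at the other

# Progressive route: B is fitted on A, then C is fitted on the fitted B.
b_fitted = fit(conf_b, conf_a)
c_via_b = fit(conf_c, b_fitted)

# Direct route: C is fitted on A, as when C immediately follows A.
c_direct = fit(conf_c, conf_a)

# Both results are the same conformation, so the rotation that maps one
# onto the other is purely a difference in orientation.
offset_deg = math.degrees(fit_angle(c_via_b, c_direct))
print(f"orientation offset of C between the two routes: {offset_deg:.2f} degrees")
```

A nonzero offset means the orientation assigned to C depends on the route
by which it was reached.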

For structured proteins, the difference will not matter much. However, in
long trajectories the small inconsistencies of a progressive fit may add up
to a spurious contribution: a drift of the overall orientation.

Hope it helps,

Tsjerk

On Wed, Jun 8, 2016 at 5:33 PM, <jkrieger at mrc-lmb.cam.ac.uk> wrote:

> Thanks Tsjerk,
>
> Isn't the progressive fit supposed to rotate everything back into the same
> orientation without having to worry about inferring that orientation from
> a reference structure that doesn't align well? Each configuration should
> in theory align well to its predecessor all the way back to the starting
> structure (which is what I'd usually take as a reference anyway).
>
> The original note I was thinking of says as follows:
>
> '''Before a PCA, all structures should be superimposed onto a common
> reference structure. This can be problematic for very flexible systems
> such as peptides, where the fit may be ambiguous, leading to artificial
> structural transitions. In certain cases, such problems may be alleviated
> by using a progressive fit, where each structure is superimposed onto the
> previous one. It is also important to note that when results of different
> PCAs are to be compared with each other, then each individual PCA should
> be based on the same reference structure used for superposition.'''
>
> Please could you explain further what it is I have misunderstood.
>
> Also would you say a progressive fit is a bad idea for more structured
> proteins?
>
> Many thanks
> James
>
> > Hi James,
> >
> > 'Spurious alignment' is the dependence of the resulting ensemble on the
> > reference structure. Unfortunately, that's not solved by a progressive
> > fit. Rather, in a progressive fit, the same configuration can have
> > multiple orientations, based on the previous structures, which is also
> > problematic when you're trying to understand spatial correlations between
> > atoms within their reference frame.
> >
> > Cheers,
> >
> > Tsjerk
> >
> > On Wed, Jun 8, 2016 at 9:27 AM, <jkrieger at mrc-lmb.cam.ac.uk> wrote:
> >
> >> Dear Teresa,
> >>
> >> That sounds like a periodic boundary issue to me. It could be fixed by
> >> using a tpr instead of a gro, as the gmx covar manual says: "All
> >> structures are fitted to the structure in the structure file. When this
> >> is not a run input file periodicity will not be taken into account."
> >> Alternatively, if you don't have a tpr, you could use gmx trjconv first
> >> with -pbc whole or -pbc nojump.
> >>
> >> I also remember reading (I think it was in the Hayward and de Groot
> >> review 2008) that fitting peptides to a reference structure can cause
> >> spurious alignments. I don't know if this is also related to what you're
> >> seeing, but it might be worth using gmx trjconv again with -fit
> >> progressive, then using -nofit in gmx covar.
> >>
> >> Best wishes
> >> James
> >>
> >> > Dear GROMACS community
> >> >
> >> > I am trying to complete a PCA analysis of my peptide adsorbed to a
> >> > surface. However, when I use:
> >> >
> >> > gmx covar -s trajectory.gro -f md_golp_vacuo.xtc
> >> >
> >> > and select the protein for both the least squares fit and covariance
> >> > calculation, followed by
> >> >
> >> >
> >> > gmx anaeig -s trajectory.gro -f md_golp_vacuo.trr -filt filter1.gro
> >> > -first 1 -last 1 -skip 100
> >> >
> >> > and I select the peptide for the least-squares fit and covariance
> >> > calculation, my peptide is now broken up into pieces. Is this right?
> >> >
> >> >
> >> >
> >> > Best
> >> > Teresa
> >> > --
> >> > Gromacs Users mailing list
> >> >
> >> > * Please search the archive at
> >> > http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before
> >> > posting!
> >> >
> >> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >> >
> >> > * For (un)subscribe requests visit
> >> > https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or
> >> > send a mail to gmx-users-request at gromacs.org.
> >> >
> >
> >
> >
> > --
> > Tsjerk A. Wassenaar, Ph.D.
>
>



-- 
Tsjerk A. Wassenaar, Ph.D.

