[gmx-users] Overlap between PC motions

Miguel Ángel Mompeán García mig.mompean at gmail.com
Tue Jun 18 20:00:49 CEST 2013

```Hi Tsjerk,

calculate the subspace overlap in the way you proposed.

I first repeated the analysis varying the starting point instead of the end
point, going back from 110 ns towards 0 ns in 10 ns steps:

Normalized  DT
0.402    110-100ns
0.456    110-90ns
0.472    110-80ns
0.485    110-70ns
0.570    110-60ns
0.585    110-50ns
0.594    110-40ns
0.595    110-30ns
0.598    110-20ns
0.602    110-10ns
1.000     Ttot

Analysis and visual inspection of the trajectory show that the system,
made up of six interacting monomers, is very stable. A hydrogen bond forms
about halfway through the simulation, when the N-terminal moiety of one
monomer moves towards another monomer. Based on your comments, I now think
my earlier interpretation was wrong: the system was relaxing towards that
conformation, and that is why I got the 1.000 values for the subspace
overlap in the first email I posted.

I have read in the literature that judging whether these values indicate
good convergence requires a comparison with similar systems. However, to
understand what the method means, I also did the covariance analysis on the
last part of the trajectory only, first varying the end point (70-75; 70-80;
70-85 ... 70-110) and then varying the starting point (110-105; 110-100 ...
110-70). I got the following values:

Normalized  DT
0.503    110-105ns
0.735    110-100ns
0.821    110-95ns
0.947    110-90ns
1.000    110-85ns
1.000    110-80ns
1.000    110-75ns
1.000    110-70ns

Normalized  DT
0.439    70-75 ns
0.564    70-80 ns
0.614    70-85 ns
0.639    70-90 ns
0.895    70-95 ns
1.000    70-100ns
1.000    70-105ns
1.000    70-110ns

I may be wrong, but the relaxation shows up when varying the end point
(0-10, 0-20, 0-30 ... 0-110), whereas going the reverse way (110-100,
110-90, 110-80 ... 110-0) the results seem to make more sense. However,
when the covariance analysis is restricted to the last part of the
trajectory, the covariance matrices become equal before the window covers
the whole analyzed time, and that confuses me. Is it expected to get so
many 1.000 values? I do not see why the covariance matrices should be equal
when varying the starting point.

Regards,
Miguel

2013/5/30 Tsjerk Wassenaar <tsjerkw at gmail.com>

> Hi Miguel,
>
> The fitting doesn't play a role; it's the dynamics of the system in the
> internal frame. Because the internal frame moves, you fit, so that any
> contribution due to the rigid body motion is removed.
>
> For the rest you have to look at it like this: You start out somewhere and
> walk (relax) towards some city, several kilometers away. You get there and
> get lost in the streets, making random walks. Now, what you do with
> covariance analysis is looking at the displacements around the mean
> position and the place where you ended up. The deviations from the mean
> will be dominated by vectors pointing from wherever you are in the city
> towards your starting point. That will be reflected in the covariance
> matrix, which doesn't change much, certainly not in terms of the directions
> of the first few eigenvectors.
> Probably you're actually more interested in the things you do in the city,
> your native configuration energy well. But to analyze that, you have to cut
> off the part of the journey that brought you there. So it makes sense to do
> the covariance analysis on the last half or quarter of the trajectory. It
> would also be interesting to check the convergence from the other side of
> the trajectory, varying the starting point of the analysis, rather than the
> end point.
>
>
> Hope it helps,
>
> Tsjerk
>
>
> On Thu, May 30, 2013 at 1:37 PM, Miguel Ángel Mompeán García <
> mig.mompean at gmail.com> wrote:
>
> > Hi Tsjerk,
> >
> > Thanks for the reply! So, let me see if I am getting the things right. The
> > same fitting structure is used for the overlap calculation. Since the
> > averaged structure is used for the covariance matrices, this is the reason
> > why the relaxation is included. Am I right?
> > The overall behavior of the system is that the structure keeps very
> > "compact" and one additional hydrogen bond between two residues is formed
> > at ~60ns, and stable till the end of the run, as computed with g_hbond.
> > That interaction is already present in the average structure. So if I
> > understood properly your comment, from the time window 0-70 up to the end,
> > there will not be significant contributions to the covariance matrix.
> >
> >
> >
> > 2013/5/30 Tsjerk Wassenaar <tsjerkw at gmail.com>
> >
> > > Hi Miguel,
> > >
> > > Sorry for not responding earlier, but the question isn't really simple :)
> > > What you do is determining the covariance matrix from the start up to a
> > > certain point and see for different end points what the overlap is with the
> > > covariance matrix from the whole. This means that in all cases, the
> > > relaxation of the system is included, and this is the dominant contribution
> > > to your covariance matrix. At some point, the system has reached the region
> > > around B and then stays there. All kinds of things may be happening, but
> > > they're overwhelmed by the changes associated with the relaxation. Hence,
> > > the part of the trajectory after 70 ns doesn't contribute significantly to
> > > the covariance matrix anymore.
> > >
> > > Hope it helps,
> > >
> > > Tsjerk
> > >
> > >
> > >
> > > On Thu, May 30, 2013 at 12:26 PM, Miguel Ángel Mompeán García <
> > > mig.mompean at gmail.com> wrote:
> > >
> > > > Dear all,
> > > >
> > > > I posted some days ago an issue regarding overlap values. If any of you is
> > > > experienced with this I would appreciate some comments. Please find below
> > > > the mentioned post:
> > > >
> > > >
> > > > I am doing PCA on a 110ns run.
> > > > When calculating the subspace overlap from independent PCA performed in
> > > > different time windows, I expect the overlap to be 1 only when the time
> > > > interval is equal to 110ns, since both covariance matrices are identical.
> > > > However, I found that from the interval 0-70 forwards (I'm increasing each
> > > > window in 10ns) the overlap reaches that value. Is this expected? From the
> > > > rmsd, cosine content and visual inspection of the trajectory I expected
> > > > small changes from this point till the total time, but not the two matrices
> > > > being equal... Am I doing something wrong or this may happen?
> > > >
> > > > Here is my data set of results:
> > > >
> > > > Normalized Shape DT
> > > > 0.611        0.638    0-10ns
> > > > 0.616        0.650    0-20ns
> > > > 0.646        0.671    0-30ns
> > > > 0.656        0.678    0-40ns
> > > > 0.659        0.683    0-50ns
> > > > 0.873        0.919    0-60ns
> > > > 1.000        1.000    0-70ns
> > > > 1.000        1.000    0-80ns
> > > > 1.000        1.000    0-90ns
> > > > 1.000        1.000    0-100ns
> > > > 1.000        1.000    0-110ns = Ttot
> > > >
> > > >
> > > > Regards,
> > > > Miguel
> > > > --
> > > > gmx-users mailing list    gmx-users at gromacs.org
> > > > http://lists.gromacs.org/mailman/listinfo/gmx-users
> > > > * Please search the archive at
> > > > http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> > > > * Please don't post (un)subscribe requests to the list. Use the
> > > > www interface or send it to gmx-users-request at gromacs.org.
> > > > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> > > >
> > >
> > >
> > >
> > > --
> > > Tsjerk A. Wassenaar, Ph.D.
> > >
> >
>
>
>
> --
> Tsjerk A. Wassenaar, Ph.D.
>

```
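The "walk to the city" argument in the thread can be reproduced on a toy trajectory. The sketch below is not GROMACS code and is not the posters' analysis. It assumes that the "Normalized" and "Shape" columns correspond to the covariance-overlap measure 1 - sqrt(tr((sqrt(C1) - sqrt(C2))^2)) / sqrt(tr(C1) + tr(C2)), with the shape variant computed after scaling each matrix to unit trace. The names `toy_trajectory`, `window_cov` and `covariance_overlap`, and all numerical parameters (frame count, drift size, noise level), are hypothetical choices made for illustration only.

```python
import numpy as np


def sqrtm_psd(c):
    """Matrix square root of a symmetric positive semi-definite matrix."""
    w, v = np.linalg.eigh(c)
    w = np.clip(w, 0.0, None)          # remove tiny negative rounding errors
    return (v * np.sqrt(w)) @ v.T      # V diag(sqrt(w)) V^T


def covariance_overlap(c1, c2):
    """Return (normalized, shape) overlap under the assumed definition."""
    def normalized(a, b):
        diff = sqrtm_psd(a) - sqrtm_psd(b)
        d = np.linalg.norm(diff)       # Frobenius norm = sqrt(tr(diff^2))
        return 1.0 - d / np.sqrt(np.trace(a) + np.trace(b))

    shape = normalized(c1 / np.trace(c1), c2 / np.trace(c2))
    return normalized(c1, c2), shape


def toy_trajectory(n_frames=1100, n_dim=6, relax_frames=600, seed=0):
    """Hypothetical trajectory: linear relaxation along one coordinate for
    the first relax_frames frames, then only small random fluctuations."""
    rng = np.random.default_rng(seed)
    traj = rng.normal(scale=0.1, size=(n_frames, n_dim))
    ramp = np.clip(np.arange(n_frames) / relax_frames, 0.0, 1.0)
    traj[:, 0] += 5.0 * (1.0 - ramp)   # the "journey" towards the city
    return traj


def window_cov(traj, start, stop):
    """Covariance matrix of frames start..stop-1 (rows are frames)."""
    return np.cov(traj[start:stop], rowvar=False)


traj = toy_trajectory()                # 1100 frames, read as 0.1 ns per frame
full = window_cov(traj, 0, len(traj))

print("varying the end point, reference = whole trajectory")
for stop in range(100, 1101, 100):
    norm, shape = covariance_overlap(window_cov(traj, 0, stop), full)
    print(f"  0-{stop // 10:3d} ns  normalized={norm:.3f}  shape={shape:.3f}")

print("varying the starting point, reference = whole trajectory")
for start in range(1000, -1, -100):
    norm, shape = covariance_overlap(window_cov(traj, start, len(traj)), full)
    print(f"  110-{start // 10:3d} ns  normalized={norm:.3f}  shape={shape:.3f}")
```

In this toy model, every window that contains the whole relaxation has a covariance matrix dominated by the drift coordinate, so its overlap with the full trajectory is close to 1 well before the window reaches the full length. Windows taken only from the equilibrated tail overlap poorly with the full matrix, which is the qualitative pattern discussed in the thread.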