[gmx-users] Principal Component Analysis

Mon Aug 13 18:37:52 CEST 2007

Hello,

I have a question concerning principal component analysis.

In principal component analysis (PCA) it is assumed that the coordinates along
each degree of freedom follow a Gaussian distribution. If the data do not
follow a normal distribution, PCA may fail to identify the correct principal
modes, since the directions of largest variance need not correspond to the
meaningful axes (e.g. J. Chem. Phys. (2006) 124, 024910).
However, PCA is frequently applied to systems involving significant anharmonic
motions. Even for native state simulations, anharmonic fluctuations are
identified when projected along the principal axes (e.g. Proteins (1993) 17,
412-425). Some researchers have applied the method to complete unfolding
trajectories (e.g. J. Mol. Biol. (1999) 290, 283-304). Especially in the case
of unfolding trajectories, I would expect that the coordinates along a given
degree of freedom do not follow a Gaussian distribution.
My question is: Why can we (successfully) apply PCA to MD (unfolding)
trajectories?
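
To make the concern concrete, here is a minimal sketch in Python with NumPy. It
uses synthetic data, not an MD trajectory: one hypothetical two-state (bimodal)
coordinate is mixed with two Gaussian coordinates, PCA is done by diagonalizing
the covariance matrix, and the excess kurtosis of each projection is used as a
rough check for non-Gaussian behaviour. The function names (pca,
excess_kurtosis), the well positions and the noise widths are all illustrative
assumptions.

```python
import numpy as np

def pca(coords):
    """Diagonalize the covariance matrix of coords (n_frames x n_dof).

    Returns eigenvalues (descending), matching eigenvectors as columns,
    and the mean-centered coordinates.
    """
    centered = coords - coords.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order], centered

def excess_kurtosis(x):
    """Excess kurtosis; close to 0 for Gaussian data, negative for bimodal."""
    x = x - x.mean()
    return np.mean(x**4) / np.mean(x**2) ** 2 - 3.0

rng = np.random.default_rng(0)
n_frames = 20000

# Hypothetical two-state coordinate: wells at -2 and +2 (assumed values).
state = rng.integers(0, 2, size=n_frames)
bimodal = np.where(state == 0, -2.0, 2.0) + 0.3 * rng.standard_normal(n_frames)

# Two ordinary Gaussian coordinates with smaller variance.
gaussian = 0.5 * rng.standard_normal((n_frames, 2))
raw = np.column_stack([bimodal, gaussian])

# Rotate so the two-state direction is not aligned with a coordinate axis.
rotation, _ = np.linalg.qr(rng.standard_normal((3, 3)))
coords = raw @ rotation.T
true_direction = rotation[:, 0]   # image of the bimodal axis after rotation

eigvals, eigvecs, centered = pca(coords)
projections = centered @ eigvecs

overlap = abs(eigvecs[:, 0] @ true_direction)
kurt_pc1 = excess_kurtosis(projections[:, 0])
kurt_pc2 = excess_kurtosis(projections[:, 1])

print("eigenvalues:", eigvals)
print("overlap of PC1 with two-state direction:", overlap)
print("excess kurtosis along PC1, PC2:", kurt_pc1, kurt_pc2)
```

In this toy case the covariance matrix is still well defined, and its leading
eigenvector picks out the two-state direction even though the projection onto
it is clearly non-Gaussian. Whether the largest-variance axes are also the
meaningful ones for a real unfolding trajectory is not something this sketch
can show.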

Thank you for your help.

-Matthias

--------------------------------------------------------------------------------
Matthias M. Waegele
Graduate Student
Gai Research Group http://gailab4.chem.upenn.edu/
Department of Chemistry
University of Pennsylvania
231 South 34th Street
Philadelphia, PA 19104-6323
--------------------------------------------------------------------------------