[gmx-users] How to make PCA effective?
Jinzhi Tan
jztan at mail.shcnc.ac.cn
Thu Nov 11 13:48:24 CET 2004
Dear gmx-users,
After running a conventional MD simulation for several nanoseconds, I want to do PCA, but I have encountered some problems.
Firstly, how long should I run the conventional MD simulation before doing PCA? As long as possible? I was told that the conformational space will be sampled well enough if the simulation is long enough. But I am not sure this works, because I found that some loops are mobile: they moved only during the first few hundred picoseconds and then held the new position for a long time. I wonder whether they would return to their original conformation if I ran a longer MD simulation. Another case is protein unfolding. Some papers report that a protein unfolds after a long MD time (several nanoseconds), but I wonder whether, if the time is long enough, the protein can fold again on its own. What do you think about the effect of the force field?
Secondly, which time should I select as the starting time for PCA? Should I start from the time when the RMSD of the protein levels off, after about two nanoseconds, or should I use the whole MD trajectory? In some papers, however, the authors ran only one nanosecond in total and then did PCA. Is that correct?
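To make the second question concrete, here is a minimal sketch of what I mean by the RMSD "levelling off". The function name plateau_start, the window size and the tolerance are hypothetical choices of mine, not anything GROMACS provides, and the numbers are made up for illustration rather than taken from my simulation. The sketch only picks the earliest time from which the RMSD stays within a tolerance of its average over the last few frames.

```python
def plateau_start(times, rmsd, window=5, tol=0.02):
    """Return the earliest time from which every RMSD value stays within
    `tol` of the mean of the last `window` values, or None if even the
    final value is outside that tolerance."""
    if len(times) != len(rmsd) or len(rmsd) < window:
        raise ValueError("need equally long series with at least `window` points")

    # Reference level: average RMSD over the last `window` frames.
    reference = sum(rmsd[-window:]) / window

    # Walk backwards from the end until the RMSD leaves the tolerance band.
    start = None
    for i in range(len(rmsd) - 1, -1, -1):
        if abs(rmsd[i] - reference) > tol:
            break
        start = i

    return None if start is None else times[start]


# Made-up RMSD series (time in ps, RMSD in nm), for illustration only.
times = [0, 500, 1000, 1500, 2000, 2500, 3000, 3500, 4000]
rmsd = [0.00, 0.10, 0.16, 0.20, 0.21, 0.22, 0.21, 0.22, 0.21]

print(plateau_start(times, rmsd))
```

Whether the frames before such a time should be discarded for PCA is exactly what I am asking; the sketch only shows how the cut-off could be chosen.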
Thirdly, I used two methods to analyze the first eigenvector and got different results, and I am not sure why they differ. If I use: g_anaeig -v eigenvec.trr -first 1 -last 1 -extr vec1_extreme.pdb, I got the following result:
1 eigenvectors selected for output: 1
Last frame 9445 time 9445.000
eigenvector        Minimum             Maximum
                value     time      value     time
     1      -6.273994    454.0   5.266299   9429.0
Writing 2 frames along eigenvector 1 to vec1_extreme.pdb
When I use: g_anaeig -v eigenvec.trr -first 1 -last 8 -extr vec18_extreme.pdb, I got:
8 eigenvectors selected for output: 1 2 3 4 5 6 7 8
Last frame 9445 time 9445.000
eigenvector        Minimum             Maximum
                value     time      value     time
     1      -6.273994    454.0   5.266299   9429.0
     2      -4.850856     11.0   4.864636   5113.0
     3      -2.722965   6113.0   2.619274   2238.0
     4      -2.837103   3826.0   2.447154   8460.0
     5      -3.493261   7502.0   2.076011    778.0
     6      -2.219512   5995.0   2.655742    489.0
     7      -1.916822   5302.0   2.395802   2613.0
     8      -2.154755     62.0   1.883655   7235.0
Writing 2 frames along eigenvector 1 to vec18_extreme1.pdb
Writing 2 frames along eigenvector 2 to vec18_extreme2.pdb
Writing 2 frames along eigenvector 3 to vec18_extreme3.pdb
Writing 2 frames along eigenvector 4 to vec18_extreme4.pdb
Writing 2 frames along eigenvector 5 to vec18_extreme5.pdb
Writing 2 frames along eigenvector 6 to vec18_extreme6.pdb
Writing 2 frames along eigenvector 7 to vec18_extreme7.pdb
Writing 2 frames along eigenvector 8 to vec18_extreme8.pdb
So what does "value" mean? Does the time correspond to the real simulation time? When I compare the snapshot at 454.0 ps with the minimum structure in vec1_extreme.pdb and with the minimum structure in vec18_extreme1.pdb, they are not the same! So what is the meaning of the time?
For the two results, the reported information for the first eigenvector is the same (as above), but vec1_extreme.pdb and vec18_extreme1.pdb are actually different. Should they be the same?
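To check my understanding of the "value" and "time" columns, here is a minimal sketch of what I assume the extreme-projection analysis does. This is my guess and is not taken from the GROMACS source. I assume that "value" is the projection of each frame's displacement from the average structure onto the eigenvector, that "time" is the time of the frame where the minimum or maximum projection occurs, and that each extreme structure is the average structure displaced along the eigenvector by that value. The function names and the toy two-coordinate "trajectory" are hypothetical.

```python
import math


def project(frames, average, unit_vec):
    """Projection of each frame's displacement from the average onto unit_vec."""
    return [
        sum((x - a) * u for x, a, u in zip(frame, average, unit_vec))
        for frame in frames
    ]


def extremes(times, frames, vec):
    """Return (value, time, structure) for the minimum and maximum projection.

    The returned structure is the average displaced along the eigenvector,
    which is an assumption about how the extreme frames are built.
    """
    n = len(frames)
    average = [sum(col) / n for col in zip(*frames)]

    # Normalise the eigenvector so projections are plain displacements.
    norm = math.sqrt(sum(v * v for v in vec))
    unit = [v / norm for v in vec]

    proj = project(frames, average, unit)
    i_min = min(range(n), key=proj.__getitem__)
    i_max = max(range(n), key=proj.__getitem__)

    def displaced(p):
        return [a + p * u for a, u in zip(average, unit)]

    return {
        "min": (proj[i_min], times[i_min], displaced(proj[i_min])),
        "max": (proj[i_max], times[i_max], displaced(proj[i_max])),
    }


# Toy data: four frames with two coordinates each (not real protein data).
times = [0, 1, 2, 3]
frames = [[0.0, 1.0], [2.0, 0.0], [4.0, 2.0], [-2.0, 3.0]]
result = extremes(times, frames, vec=[1.0, 0.0])

print(result["min"])  # projection value, frame time, displaced average
print(frames[3])      # the actual snapshot at that time
```

If this assumption is right, the extreme structure would not be identical to the snapshot at the reported time, because only the motion along the one eigenvector is kept. It would still not explain why vec1_extreme.pdb and vec18_extreme1.pdb differ from each other.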
I am not sure whether I am confused about the basic theory of PCA or have made some other mistake. I hope you can give me some advice. Thank you very much!
Best wishes,
Jinzhi Tan <tanjinzhi at hotmail.com>
2004-11-10