[gmx-developers] preserving residue numbers
Anton Feenstra
feenstra at few.vu.nl
Tue Mar 3 16:14:50 CET 2009
Berk Hess wrote:
> Hi,
>
> The past years we received a lot of requests for Gromacs to preserve the
> residue numbering of the original pdb file.
> Currently Gromacs simply renumbers all residues consecutively starting
> from one.
> I have now implemented free residue numbering (but not committed yet),
> including pdb insertion codes.
> This allows for more freedom, but it also means some choices need to be
> made.
>
> A limitation of Gromacs, which I think we would like to preserve, is
> that residue numbers are stored
> per molecule type and therefore one can not have a different numbering
> in different molecules
> of the same type.
Agreed. This should cover most cases anyway.
> My current implementation preserves the residue numbers for
> multiple-residue molecules,
> but for single residue molecules (for instance water and ions) the
> residues will continue
> numbering from the last residue in the last molecule before.
> For a protein in water this gives the most desirable behavior. Although
> even here there are small
> issues. For instance the choice of pdb2gmx writing a gro file with the
> original pdb res numbers
> (e.g. for 2lzm the last prot. resnr is 164, but the first water is 166),
> which is then not what Gromacs
> will produce later, because the tpr made from the top file will number
> the waters starting at 165.
> Should the pdb2gmx output keep the pdb resnrs for the water or use the
> Gromacs convention?
I'd say output of pdb2gmx could retain pdb numbering as close as
possible, even though subsequent processing might lead to more extensive
residue renumbering.
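To make the rule described above concrete, here is a minimal sketch in Python. It is not the Gromacs implementation: the `MoleculeType` class and the `assign_resnrs` function are hypothetical names used only for illustration, and the protein is assumed to be numbered 1 to 164 as in the 2lzm example. Multi-residue molecules keep the numbers stored with their molecule type, while single-residue molecules continue from the last number used before them.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MoleculeType:
    """Hypothetical stand-in for a molecule type; not a Gromacs structure."""
    name: str
    resnrs: List[int]  # residue numbers, stored once per molecule type


def assign_resnrs(system: List[Tuple[MoleculeType, int]]) -> List[Tuple[str, int]]:
    """Return (molecule name, residue number) for every residue in the system.

    `system` lists (molecule type, number of copies) in topology order.
    """
    out = []
    last = 0  # last residue number handed out so far
    for moltype, count in system:
        for _ in range(count):
            if len(moltype.resnrs) > 1:
                # Multi-residue molecule: keep the stored numbering.
                numbers = moltype.resnrs
            else:
                # Single-residue molecule: continue from the previous residue.
                numbers = [last + 1]
            for nr in numbers:
                out.append((moltype.name, nr))
            last = numbers[-1]
    return out


# Assumed example: one protein numbered 1..164 followed by three waters.
protein = MoleculeType("Protein", list(range(1, 165)))
water = MoleculeType("SOL", [1])
numbering = assign_resnrs([(protein, 1), (water, 3)])
print(numbering[-3:])  # [('SOL', 165), ('SOL', 166), ('SOL', 167)]
```

Under these assumptions the first water gets 165 rather than the 166 found in the pdb file, which is the mismatch discussed above. Two copies of a three-residue polymer would come out as 1, 2, 3, 1, 2, 3, because the numbering belongs to the molecule type and not to the individual molecule.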
> Another effect is that, for polymers for instance, there will be many
> residues with resnr 1 in the system.
> This is very convenient if you want to select all end groups, but less
> convenient if you want to select
> a particular residue, although this can be done by selecting chains or
> molecules.
>
> Finally several analysis tools write values as a function of residue
> number.
> For a single protein it might be very convenient to have the original
> residue numbers in the output.
> But if you have 4 chains, all with resnrs 1 to 200, things get messy.
>
> Do you have comments or suggestions?
Perhaps some options could control pdb2gmx behaviour here? Like in your
4-chain example you might want the options to preserve, or number
continuously, or add an offset, so chains start at 1, 1001, 2001, 3001?
Also, adding a 'renumber residues' option to an existing tool
(editconf?) might be convenient, to create numbering schemes for
different (display) purposes. I vaguely remember seeing such an option
somewhere, but could not locate it just now.
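As a rough sketch of what the three options could mean, here is some Python. The function name `renumber_chains`, the mode names and the `step` parameter are all made up for illustration; none of them are existing pdb2gmx or editconf options.

```python
from typing import Dict, List


def renumber_chains(chains: Dict[str, List[int]],
                    mode: str = "preserve",
                    step: int = 1000) -> Dict[str, List[int]]:
    """Renumber residues per chain (hypothetical helper, not a Gromacs API).

    mode "preserve":   keep the original numbers of every chain
    mode "continuous": number all residues consecutively from 1
    mode "offset":     add 0, step, 2*step, ... to successive chains
    """
    result = {}
    next_nr = 1  # only used in "continuous" mode
    for index, (chain_id, resnrs) in enumerate(chains.items()):
        if mode == "preserve":
            new = list(resnrs)
        elif mode == "continuous":
            new = list(range(next_nr, next_nr + len(resnrs)))
            next_nr += len(resnrs)
        elif mode == "offset":
            # Assumes every original number in a chain is below `step`,
            # otherwise the shifted ranges of two chains would overlap.
            new = [nr + index * step for nr in resnrs]
        else:
            raise ValueError(f"unknown mode: {mode}")
        result[chain_id] = new
    return result


# The 4-chain example: chains A-D, each numbered 1..200.
chains = {c: list(range(1, 201)) for c in "ABCD"}
offset = renumber_chains(chains, "offset")
print([offset[c][0] for c in "ABCD"])  # [1, 1001, 2001, 3001]
```

With "continuous" the same input would give chain D the numbers 601 to 800. The offset variant keeps the original number readable in the last digits, which might be convenient for per-residue analysis output.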
--
Groetjes,
Anton
_____________ _______________________________________________________
| | |
| _ _ ___,| K. Anton Feenstra |
| / \ / \'| | | IBIVU/Bioinformatics - Free University Amsterdam |
|( | )| | | De Boelelaan 1083A - 1081 HV Amsterdam - Netherlands |
| \_/ \_/ | | | Tel +31 20 59 87783 - Fax +31 20 59 87653 - Room P136 |
| | Feenstra at few.vu.nl - www.few.vu.nl/~feenstra/ |
| | "You Could Be a Shadow" (The Breeders) |
|_____________|_______________________________________________________|