[gmx-developers] preserving residue numbers

Anton Feenstra feenstra at few.vu.nl
Tue Mar 3 16:14:50 CET 2009


Berk Hess wrote:
> Hi,
> 
> Over the past years we have received a lot of requests for Gromacs to preserve the
> residue numbering of the original pdb file.
> Currently Gromacs simply renumbers all residues consecutively starting 
> from one.
> I have now implemented free residue numbering (but not committed yet), 
> including pdb insertion codes.
> This allows for more freedom, but it also means some choices need to be
> made.
> 
> A limitation of Gromacs, which I think we would like to preserve, is 
> that residue numbers are stored
> per molecule type and therefore one can not have a different numbering 
> in different molecules
> of the same type.

Agreed. This should cover most cases anyway.

> My current implementation preserves the residue numbers for 
> multiple-residue molecules,
> but for single-residue molecules (for instance water and ions) the
> residues will continue
> numbering from the last residue of the preceding molecule.
> For a protein in water this gives the most desirable behavior. Although 
> even here there are small
> issues. For instance the choice of pdb2gmx writing a gro file with the 
> original pdb res numbers
> (e.g. for 2lzm the last prot. resnr is 164, but the first water is 166), 
> which is then not what Gromacs
> will produce later, because the tpr made from the top file will number 
> the waters starting at 165.
> Should the pdb2gmx output keep the pdb resnrs for the water or use the 
> Gromacs convention?

I'd say the output of pdb2gmx could retain the pdb numbering as closely as
possible, even though subsequent processing might lead to more extensive
residue renumbering.
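
To make the scheme under discussion concrete, here is a minimal Python sketch of the rule as Berk describes it: multiple-residue molecules keep their own residue numbers, and single-residue molecules continue counting from the last residue of the preceding molecule. The function name assign_resnrs and the list-of-lists input are hypothetical and only for illustration. This is not the Gromacs implementation, and it ignores insertion codes and the per-molecule-type storage.

```python
def assign_resnrs(molecules):
    """Assign residue numbers following the rule described in the thread.

    molecules: list of non-empty lists, each holding the original (pdb)
    residue numbers of one molecule, in system order (hypothetical layout).
    """
    assigned = []
    last = 0  # last residue number handed out so far
    for resnrs in molecules:
        if not resnrs:
            raise ValueError("each molecule needs at least one residue")
        if len(resnrs) > 1:
            # Multiple-residue molecule: keep the original numbering.
            new = list(resnrs)
        else:
            # Single-residue molecule (water, ion): continue from the
            # last residue of the preceding molecule.
            new = [last + 1]
        assigned.append(new)
        last = new[-1]
    return assigned


# The 2lzm case from the thread: last protein residue is 164 and the first
# water in the pdb file is 166. The 1..164 range and the three waters are
# assumptions made only for this illustration.
protein = list(range(1, 165))
waters = [[166], [167], [168]]
result = assign_resnrs([protein] + waters)
print(result[0][-1], result[1:])
```

In this sketch the waters come out as 165, 166, 167, which is the mismatch with the pdb2gmx gro file (first water 166) that the question above is about.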

> Another effect is that for polymers, for instance, there will be many
> residues with resnr 1 in the system.
> This is very convenient if you want to select all end groups, but less 
> convenient if you want to select
> a particular residue, although this can be done by selecting chains or 
> molecules.
> 
> Finally several analysis tools write values as a function of residue 
> number.
> For a single protein it might be very convenient to have the original 
> residue numbers in the output.
> But if you have 4 chains, all with resnrs 1 to 200, things get messy.
> 
> Do you have comments or suggestions?

Perhaps some options could control pdb2gmx behaviour here? For instance, in
your 4-chain example you might want options to preserve the numbering, to
number continuously, or to add an offset, so chains start at 1, 1001, 2001, 3001.
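
As a rough illustration of those three choices, here is a small Python sketch. The function renumber_chains, its mode names and the stride parameter are hypothetical; they are not existing pdb2gmx options. The offset mode simply adds a fixed stride per chain, so it assumes every chain's residue numbers lie between 1 and the stride.

```python
def renumber_chains(chains, mode="preserve", stride=1000):
    """Renumber residues per chain (illustrative sketch only).

    chains: list of lists of original residue numbers, one list per chain.
    mode:   "preserve"   keeps the original numbers,
            "continuous" numbers all residues consecutively from 1,
            "offset"     adds i * stride to chain i (i = 0, 1, 2, ...).
    """
    out = []
    next_nr = 1  # next free number in "continuous" mode
    for i, resnrs in enumerate(chains):
        if mode == "preserve":
            new = list(resnrs)
        elif mode == "continuous":
            new = list(range(next_nr, next_nr + len(resnrs)))
            next_nr += len(resnrs)
        elif mode == "offset":
            # Numbers outside 1..stride would collide with another chain.
            if any(nr < 1 or nr > stride for nr in resnrs):
                raise ValueError("residue number outside 1..stride")
            new = [nr + i * stride for nr in resnrs]
        else:
            raise ValueError("unknown mode: " + mode)
        out.append(new)
    return out


# Berk's example: 4 chains, each with resnrs 1 to 200.
chains = [list(range(1, 201)) for _ in range(4)]
offset_starts = [c[0] for c in renumber_chains(chains, "offset")]
continuous_last = renumber_chains(chains, "continuous")[-1][-1]
print(offset_starts, continuous_last)
```

With the offset mode the original number stays readable in the last three digits, which might be the most useful variant for per-residue analysis output.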

Also, adding a 'renumber residues' option to an existing tool
(editconf?) might be convenient, to create numbering schemes for
different (display) purposes. I vaguely remember seeing such an option
somewhere, but could not locate it just now.

-- 
Groetjes,

Anton
  _____________ _______________________________________________________
|             |                                                       |
|  _   _  ___,| K. Anton Feenstra                                     |
| / \ / \'| | | IBIVU/Bioinformatics - Free University  Amsterdam     |
|(   |   )| | | De Boelelaan 1083A - 1081 HV Amsterdam - Netherlands  |
| \_/ \_/ | | | Tel +31 20 59 87783 - Fax +31 20 59 87653 - Room P136 |
|             | Feenstra at few.vu.nl - www.few.vu.nl/~feenstra/         |
|             | "You Could Be a Shadow" (The Breeders)                |
|_____________|_______________________________________________________|


