R: Re: [gmx-users] numbering of .gro file

Mon May 16 11:55:37 CEST 2011

Dear gmx-users,
first let me summarize my attempts on this question.

1) Using:
pdb2gmx -f my_protein.pdb -o my_protein.gro -p my_protein.top
 I obtained a .gro file in which the numbering correctly runs from 21 to
379 in both chains, but no chain ID is present, so I cannot distinguish
residues of chain A from residues of chain B.

2) Using:
pdb2gmx -f my_protein.pdb -o my_protein.gro -p my_protein.top -renum
 I obtained a .gro file in which the numbering runs from 1 to 359 in both
chains (i.e. the second chain does not continue the numbering from 360),
but no chain ID is present, so again I cannot distinguish residues of
chain A from residues of chain B.

3) Using:
pdb2gmx -f my_protein.pdb -o my_protein.gro -p my_protein.top -chainsep id
 I obtained a .gro file in which the numbering runs from 21 to 379 in both
chains (i.e. the second chain does not continue the numbering of the first
chain), but no chain ID is present, so again I cannot distinguish residues
of chain A from residues of chain B.

4) Using:
pdb2gmx -f my_protein.pdb -o my_protein.pdb -p my_protein.top
 I obtained an output .pdb file in which the numbering and the chain ID of
the input .pdb file are kept (and so I can distinguish between the residues
of chain A and of chain B). However, when I continue with editconf+genbox
and add 22586 water molecules, I run into the problem of having more than
9999 residues, which is not compatible with the .pdb format, so the
numbering of the residues restarts twice from zero.

My question is: can I distinguish chain A and B of my protein (either with a
chain ID or with a consecutive numbering of the two chains) AND have a file
format compatible with more than 9999 residues using Gromacs only?
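
For reference, one possible workaround outside Gromacs is sketched below in Python. It assumes the standard fixed-column .gro layout (residue number in the first five columns) and treats every drop in residue number as the start of a new chain, which fits a file like the one from attempt 1, where both chains are numbered 21 to 379. The function name chain_index_per_atom and the four-atom example are made up for illustration; in a solvated system any other numbering restart (for instance after the protein) would also be counted, so the result is only meaningful for the protein atoms.

```python
def chain_index_per_atom(gro_text):
    """Return one 0-based chain index per atom of a .gro file.

    Assumption: a new chain starts wherever the residue number is
    lower than that of the previous atom.
    """
    lines = gro_text.splitlines()
    n_atoms = int(lines[1])          # second line holds the atom count
    chains = []
    chain = 0
    prev_resnum = None
    for line in lines[2:2 + n_atoms]:
        resnum = int(line[0:5])      # columns 1-5: residue number
        if prev_resnum is not None and resnum < prev_resnum:
            chain += 1               # numbering restarted: next chain
        chains.append(chain)
        prev_resnum = resnum
    return chains


# Hypothetical four-atom file: two chains, both numbered from 21.
example = "\n".join([
    "two short chains, both numbered from 21",
    "4",
    "   21ALA      N    1   0.000   0.000   0.000",
    "   22GLY      N    2   0.100   0.000   0.000",
    "   21ALA      N    3   1.000   0.000   0.000",
    "   22GLY      N    4   1.100   0.000   0.000",
    "   5.00000   5.00000   5.00000",
])
print(chain_index_per_atom(example))
```

The per-atom chain indices could then be used to build separate atom groups for chain A and chain B.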

Thank you very much
Anna

Date: Fri, 13 May 2011 09:54:38 -0400
From: "Justin A. Lemkul" <jalemkul at vt.edu>
Subject: Re: [gmx-users] numbering of .gro file
To: Discussion list for GROMACS users <gmx-users at gromacs.org>
Message-ID: <4DCD381E.1020500 at vt.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Anna Marabotti wrote:
> Dear Mark,
> thank you also for your suggestion, indeed using the nvt.gro file with the
> sequential numbering I was able to distinguish the contributions from both
> chains, instead of seeing them superimposed.
> Now I have another question. I used pdb2gmx to prepare another file for
> simulation (it is the same protein as above, with a mutation). The pdb file
> contains two identical chains numbered starting from 21 to 379, marked with
> chain ID A and B.
> Using the command line:
> pdb2gmx -f my_protein.pdb -o my_protein.gro -p my_protein.top
> I obtained a .gro file in which the numbering correctly starts from 21 to
> 379 in both chains, but no chain ID is present. I also tried to use
> -chainsep, but nothing changed. So my (last) question is: is there any way
> to avoid renumbering the file, but without obtaining a superposition of
> residue numbers in both subunits? In other words: is there a possibility to
> leave some form of "chain identifier" in the .gro file? Or is the only way to
> obtain unambiguous identification of each residue to renumber the file?

There are no chain identifiers in .gro format.  If you want them, use .pdb 
instead.  You can make use of a variety of formats for just about all the 
Gromacs tools; there is no requirement for .gro.  If your input .pdb file has
chain identifiers, so too should the output structure.

-Justin