[gmx-users] help with chromophore of a GFP

Thu Mar 21 11:46:12 CET 2013

On Wed, Mar 20, 2013 at 6:01 PM, Anna MARABOTTI <amarabotti at unisa.it> wrote:

> Dear gmx-users,
>
> it's about two weeks that I'm trying to solve this problem, and I can't,
> so I'm asking your help.
>
> I want to do some MD simulations on a protein of the family of green
> fluorescent protein. This protein, as you know, has a chromophore (CFY)
> derived from four residues of the protein (F64-C65-Y66-G67) and covalently
> bound to the rest of the protein chain. How to parametrize this object,
> since it is not recognized by pdb2gmx? I looked at the gmx-users list and
> the suggestion was to create a new entry in the .rtp file of the selected
> forcefield.

Indeed, this kind of problem is most easily solved by making a new
"residue" that contains the whole chromophore, such that it links to its
neighbours with normal peptide links.

> I decided to use Amber99SB since it seemed the better for my scope, then
> I start trying to parameterize it. This is what I did:
>
> * I used Pymol to add H to my pdb file, since I want to use an all H
>   forcefield and since Antechamber (see below) does not work without H
> * I extracted the segment V63-CFY-H68 from my .pdb file. I did this
>   since, when I extracted CFY only, I had problems with the terminals
> * Following the Antechamber tutorial, I used Antechamber (using the
>   traditional Amber force field, not GAFF) to calculate charges and to
>   assign atom types to this segment.
> * I used these calculated parameters in order to add the CFY residue to
>   aminoacids.rtp in amber99sb.ff directory.
> * I tried to modify also aminoacids.hdb, but since it seemed too
>   complicated to me, I decided to keep it unchanged, and to give pdb2gmx
>   the protein with H already present
> * No need to add new atom/bond types to ffbonded.itp and
>   ffnonbonded.itp: they seem all present. Since CFY is bound to the rest
>   of protein with common peptide bonds, I did not change specbond.dat
>   either.
> * I added CFY in residuetypes.dat with the specification "Protein"
>
> In my opinion, all was ready to go, instead...
>
> When I launched pdb2gmx to my protein with H added by PyMol, I got
> immediately an error:
>
> Fatal error:
> Atom H01 in residue SER 3 was not found in rtp entry NSER with 13 atoms
> while sorting atoms.
>
> For a hydrogen, this can be a different protonation state, or it
> might have had a different number in the PDB file and was rebuilt
> (it might for instance have been H3, and we only expected H1 & H2).
> Note that hydrogens might have been added to the entry for the N-terminus.
> Remove this hydrogen or choose a different protonation state to solve it.
> Option -ignh will ignore all hydrogens in the input.
> For more information and tips for troubleshooting, please check the GROMACS
> website at http://www.gromacs.org/Documentation/Errors
>
> From this error I understand that:
>
> * the code for H in PyMol is different from the code for H in Amber
>   (read from aminoacids.rtp); in order to correct this error, I should
>   add -ignh in order to ignore H in input.
>

pdb2gmx has to be able to make sense of the atom naming. There are lots of
different conventions for how to name atoms, particularly hydrogen atoms.
pdb2gmx can't possibly encode the logic to convert all of those
conventions. So the path of least resistance can be to ignore hydrogens and
regenerate them according to the generation rules.

However, you can just rename them in the input file so that pdb2gmx
understands your meaning. The NSER entry in the .rtp file shows you the
names pdb2gmx expects. If you edit the names of those hydrogen atoms
(probably H01, H02, H03) in your input coordinate file accordingly (to H1,
H2, H3), things will be fine. Be sure you don't break the required column
formatting of the coordinate file!
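
As a rough illustration of that renaming step, here is a minimal Python
sketch that renames atoms in one residue while keeping the fixed PDB column
layout intact. The function name rename_atoms, the example records and the
H01/H02/H03 mapping are assumptions for illustration only; check the names
actually present in your file and in the NSER entry before using anything
like this.

```python
def rename_atoms(pdb_lines, resname, resseq, mapping):
    """Return PDB lines with atom names in one residue renamed.

    Only the atom-name field (columns 13-16) is rewritten, so every
    other column keeps its position.
    """
    out = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 26:
            name = line[12:16].strip()
            same_residue = (line[17:20].strip() == resname
                            and int(line[22:26]) == resseq)
            if same_residue and name in mapping:
                new = mapping[name]
                # Names shorter than 4 characters conventionally start
                # in column 14, so pad with one leading space.
                field = new if len(new) == 4 else (" " + new).ljust(4)
                line = line[:12] + field + line[16:]
        out.append(line)
    return out


# Hypothetical records; coordinates are made up.
example = [
    "ATOM      1  N   SER A   3      11.104  13.207   2.100  1.00  0.00",
    "ATOM      2  H01 SER A   3      10.500  13.900   2.500  1.00  0.00",
]
fixed = rename_atoms(example, "SER", 3,
                     {"H01": "H1", "H02": "H2", "H03": "H3"})
for line in fixed:
    print(line)
```

Slicing by column rather than splitting on whitespace is the point of the
sketch: it is what keeps the rest of each record where pdb2gmx expects it.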

> * If I add -ignh, all the H of CFY will be ignored too, and I will not be
>   able to add them since I did not modify aminoacids.hdb
> * since I made calculations on CFY with H added by PyMol, probably also
>   my codes for H will be wrong.
>

Your atom names for CFY in the .rtp and the input coordinate file will have
to match. How you want to achieve that is up to you.
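
One way to find the mismatches before running pdb2gmx is to compare the two
sets of names directly. The following Python sketch does that under some
assumptions: the helper names are invented here, the .rtp fragment and its
charges are placeholders (not real CFY parameters), and the PDB records are
truncated to the columns the check needs.

```python
SUBSECTIONS = {"atoms", "bonds", "angles", "dihedrals", "impropers",
               "exclusions", "cmap"}


def rtp_atom_names(rtp_text, resname):
    """Collect atom names from the [ atoms ] block of one .rtp entry."""
    names = []
    in_residue = in_atoms = False
    for raw in rtp_text.splitlines():
        line = raw.split(";")[0].strip()  # drop comments
        if not line:
            continue
        if line.startswith("["):
            header = line.strip("[] ")
            if in_residue and header in SUBSECTIONS:
                in_atoms = (header == "atoms")
            else:
                in_residue = (header == resname)
                in_atoms = False
            continue
        if in_atoms:
            names.append(line.split()[0])
    return names


def pdb_atom_names(pdb_lines, resname):
    """Collect atom names (columns 13-16) for one residue name."""
    return [l[12:16].strip() for l in pdb_lines
            if l.startswith(("ATOM", "HETATM"))
            and l[17:20].strip() == resname]


def compare_names(rtp_names, pdb_names):
    """Return (names only in the PDB, names only in the .rtp)."""
    only_pdb = sorted(set(pdb_names) - set(rtp_names))
    only_rtp = sorted(set(rtp_names) - set(pdb_names))
    return only_pdb, only_rtp


# Placeholder entry: atom types and charges are NOT real CFY parameters.
RTP = """
[ CFY ]
 [ atoms ]
     N    N     -0.4157   1
     H    H      0.2719   2
    CA    CT     0.0000   3
    HA    H1     0.0000   4
 [ bonds ]
     N    H
"""
# Truncated hypothetical records (coordinates omitted for brevity).
PDB = [
    "HETATM  501  N   CFY A  65",
    "HETATM  502  H01 CFY A  65",
    "HETATM  503  CA  CFY A  65",
    "HETATM  504  HA  CFY A  65",
]
only_pdb, only_rtp = compare_names(rtp_atom_names(RTP, "CFY"),
                                   pdb_atom_names(PDB, "CFY"))
print("only in PDB:", only_pdb)
print("only in rtp:", only_rtp)
```

Any name listed on either side needs to be reconciled, either by editing
the .rtp entry or by renaming atoms in the coordinate file.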

> * If I use "reduce" (the Amber tool to add H, as suggested by the
>   tutorial) to add H to my protein, it does not add H to CFY because it
>   complains that the residue is not in HETATM connection database (but
>   the record CONECT is present in .pdb file). If I add H to CFY alone, I
>   have problems with the terminals.
>
> My question is, obviously: how can I parameterize this chromophore
> correctly? Please give me, if possible, some step-by-step indications on
> what to do. I made dozens of trials, ALL with errors, and I really do not
> know how to do.
>

You're very much on the right track.

Your decision to use PyMOL to generate main chain hydrogens rather than
teach pdb2gmx how to generate CFY hydrogens had consequences that you are
now dealing with. In a subsequent email you said:

> I am talking not only about the problem of obtaining parameters for this
> particular chromophore, mine is a more general question: how to deal with a
> "HETATM" entry which is not a ligand, but it's a part of the protein chain?
> I tried to follow indications to make a new .rtp entry in the GROMACS
> HowTo's, probably my problem would be solved if I would be able to modify
> the aminoacids.hdb file, but this is not a simple modification of a residue
> (eg. an oxidised Met or a methylation of a Lys), this is a profound
> modification of four residues, so how can I deal with this? I had a look at
> the .hdb file, but hydrogens I can see are typical for amino acids residues
> and I cannot find any suggestions on how to treat hydrogens that are bound
> to a "residue" which is so different from classic standard residues. Has
> anyone made this before (I am sure yes)? Could you please give some
> suggestions?

Generating hydrogen atoms requires that a program make guesses or receive
information about heavy-atom hybridization, hydrogen multiplicity and/or
hydrogen-atom naming. That's all that the .hdb format is doing, and manual
section 5.6.4 describes it. That the available examples are for simple
residues is not really an issue. If you need a fused-ring example, TRP side
chain will serve. The generated structures might not exactly fit the minima
for the model you have parameterized, but in-vacuo minimization will sort
that out easily (so long as your model doesn't have other larger problems!).
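
To make the format less mysterious, here is a small Python sketch that
reads one .hdb entry and spells out what each line asks for. Treat it as an
illustration under assumptions: the ALA entry is written in the style of the
shipped aminoacids.hdb and should be compared against your own force field
directory, the method descriptions paraphrase the table in manual section
5.6.4, and parse_hdb_entry is a name invented here.

```python
# Paraphrased descriptions of the hydrogen-generation methods; check them
# against the table in manual section 5.6.4.
METHODS = {
    1: "one planar hydrogen (e.g. rings or peptide bond)",
    2: "one single hydrogen (e.g. hydroxyl)",
    3: "two planar hydrogens",
    4: "two or three tetrahedral hydrogens (e.g. -CH3)",
    5: "one tetrahedral hydrogen",
    6: "two tetrahedral hydrogens (e.g. -CH2-)",
}


def parse_hdb_entry(text):
    """Parse one .hdb entry into (residue name, list of rules).

    Header line: residue name and number of rule lines.
    Rule line:   H count, method, H base name, control atoms.
    """
    lines = [l.split() for l in text.strip().splitlines() if l.strip()]
    resname, n_rules = lines[0][0], int(lines[0][1])
    rules = []
    for fields in lines[1:1 + n_rules]:
        count, method, base = int(fields[0]), int(fields[1]), fields[2]
        # When several hydrogens are generated, a number is appended
        # to the base name.
        if count == 1:
            names = [base]
        else:
            names = [f"{base}{i}" for i in range(1, count + 1)]
        rules.append({"names": names, "method": method,
                      "controls": fields[3:]})
    return resname, rules


# Illustrative entry; a leading "-" refers to an atom in the previous
# residue.
ALA = """
ALA 3
1 1 H  N  -C CA
1 5 HA CA N  CB C
3 4 HB CB CA N
"""
resname, rules = parse_hdb_entry(ALA)
for rule in rules:
    print(resname, rule["names"], "on", rule["controls"][0], "->",
          METHODS.get(rule["method"], "see manual"))
```

An entry for CFY would follow the same pattern, one line per heavy atom
that carries hydrogens, with ring hydrogens using the planar method.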

If you have your CFY hydrogens named like the .rtp entry, then you don't
need to get involved with .hdb at all. But if you do have feedback on how
we might improve the description in 5.6.4, we're all ears :-)

Mark