[gmx-users] help with chromophore of a GFP

Thu Mar 21 16:40:32 CET 2013

On Thu, Mar 21, 2013 at 11:30 AM, Anna MARABOTTI <amarabotti at unisa.it> wrote:

>
>
> Dear Mark,
>
> thank you for your message. I'm happy to be on the right track;
> unfortunately, the end point still seems to be very far away...
>
> I tried to make both the CFY hydrogens and the protein hydrogens match
> the aminoacids.rtp entries, in order to avoid dealing with
> aminoacids.hdb. This is what I did:
>
> - starting from the pdb file of the protein, I removed the CFY entry
>   (prot_noCFY.pdb)
>
> - I used pdb2gmx to add H to the protein only:
>   pdb2gmx -f prot_noCFY.pdb -o prot_noCFY_H.pdb -p topol.top
>
> - I inserted CFY_H.pdb (obtained in a previous step, in which I added H
>   with Pymol to the whole protein, including CFY) into
>   prot_noCFY_H.pdb, obtaining prot_CFY_H.pdb.
>
> In this way, the H atoms bound to "regular" residues have been added
> using Amber99SB, so they are compatible with this ff, and the atoms of
> CFY (previously added with Pymol) follow the same naming convention as
> the CFY entry in aminoacids.rtp (which I edited using the atom types,
> charges etc. calculated with Antechamber on this molecule coming from
> Pymol). Obviously, the atom numbering is not sequential: the last atom
> of V63 (the last "regular" residue before CFY) is numbered 938, the
> first atom of H68 (the first "regular" residue after CFY) is numbered
> 939, and the atoms of CFY66 are numbered from 1 to 70. Moreover, since
> the order of atoms in aminoacids.rtp is not the same as in the
> coordinates of CFY (I adapted the order of atoms following the format
> of other residues in aminoacids.rtp), the numbering of CFY in
> prot_CFY_H.pdb is not ordered (1-2-3-....-69-70) but
> disordered (19-54-20-55...49-50-24-25).
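
The manual splicing step described above could also be scripted. What follows is only a minimal Python sketch, assuming a single chain and standard fixed-column PDB records (atom serial in columns 7-11, residue number in columns 23-26). The function names are hypothetical; the file names in the usage comment are the ones from this thread. It inserts the CFY records before the first atom whose residue number is greater than 63 and then renumbers the atom serials sequentially. It does not reorder atoms within CFY and does not touch atom names.

```python
def is_atom_record(line):
    """True for ATOM/HETATM coordinate records."""
    return line.startswith(("ATOM", "HETATM"))


def residue_number(line):
    """Residue sequence number, PDB columns 23-26."""
    return int(line[22:26])


def splice_residue(protein_lines, residue_lines, after_resnum):
    """Insert the atom records of residue_lines before the first atom
    whose residue number exceeds after_resnum (single chain assumed)."""
    merged, inserted = [], False
    for line in protein_lines:
        if (not inserted and is_atom_record(line)
                and residue_number(line) > after_resnum):
            merged.extend(l for l in residue_lines if is_atom_record(l))
            inserted = True
        merged.append(line)
    if not inserted:
        raise ValueError(f"no residue found after residue {after_resnum}")
    return merged


def renumber_serials(lines):
    """Rewrite atom serial numbers (PDB columns 7-11) as 1, 2, 3, ..."""
    renumbered, serial = [], 0
    for line in lines:
        if is_atom_record(line):
            serial += 1
            line = f"{line[:6]}{serial:5d}{line[11:]}"
        renumbered.append(line)
    return renumbered


# Usage with the file names from this thread (residue 63 precedes CFY):
#   protein = open("prot_noCFY_H.pdb").read().splitlines()
#   cfy = open("CFY_H.pdb").read().splitlines()
#   merged = renumber_serials(splice_residue(protein, cfy, 63))
#   open("prot_CFY_H.pdb", "w").write("\n".join(merged) + "\n")
```

Everything outside the serial-number field is left as it was, so the column layout of each record is preserved.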
>
> - At this stage, I used pdb2gmx again to create the topol.top file
>   with all coordinates correct:
>
>   pdb2gmx -f prot_CFY_H.pdb -o prot_complete.gro -p topol.top
>
>   (selecting the amber99sb force field and tip3p for water, as the
>   recommended option)
>
> This is the error message from pdb2gmx:
>
> Read 'FLUORESCENT PROTEIN', 3346 atoms
> Analyzing pdb file
> Splitting PDB chains based on TER records or changing chain id.
> There are 1 chains and 0 blocks of water and 218 residues with 3346 atoms
>
>   chain  #res #atoms
>   1 'A'   213   3346
>
> All occupancies are one
> Opening force field file ./amber99sb.ff/atomtypes.atp
> Atomtype 1
> Reading residue database... (amber99sb)
> Opening force field file ./amber99sb.ff/aminoacids.rtp
> Residue 94
> Sorting it all out...
> Opening force field file ./amber99sb.ff/dna.rtp
> Residue 110
> Sorting it all out...
> Opening force field file ./amber99sb.ff/rna.rtp
> Residue 126
> Sorting it all out...
> Opening force field file ./amber99sb.ff/aminoacids.hdb
> Opening force field file ./amber99sb.ff/dna.hdb
> Opening force field file ./amber99sb.ff/rna.hdb
> Opening force field file ./amber99sb.ff/aminoacids.n.tdb
> Opening force field file ./amber99sb.ff/aminoacids.c.tdb
>
> Processing chain 1 'A' (3346 atoms, 213 residues)
> There are 327 donors and 319 acceptors
> There are 539 hydrogen bonds
> Will use HISE for residue 22
> Will use HISD for residue 38
> Will use HISE for residue 62
> Will use HISE for residue 68
> Will use HISD for residue 109
> Will use HISE for residue 119
> Will use HISE for residue 172
> Will use HISH for residue 193
> Will use HISH for residue 197
> Will use HISE for residue 217
> Identified residue SER3 as a starting terminus.
> Identified residue SER218 as a ending terminus.
> 8 out of 8 lines of specbond.dat converted successfully
> Special Atom Distance matrix:
>                     MET9   MET11   MET15   HIS22   HIS38   MET41   MET47
>                    SD110   SD149   SD232  NE2317  NE2549   SD596   SD700
>   MET11    SD149   0.807
>   MET15    SD232   2.279   1.627
>   HIS22   NE2317   3.707   2.983   1.466
>   HIS38   NE2549   1.401   0.928   2.127   3.254
>   MET41    SD596   1.458   0.665   1.144   2.384   1.001
>   MET47    SD700   3.059   2.324   0.995   0.801   2.656   1.761
>   MET53    SD777   2.786   1.999   0.990   1.171   2.160   1.373   0.603
>   HIS62   NE2917   2.340   1.733   0.833   1.797   1.988   1.236   1.583
>   HIS68  NE21002   0.884   0.597   1.466   2.916   1.356   0.885   2.347
>  HIS109  NE21638   2.061   1.886   1.380   2.614   2.661   1.862   2.279
>  HIS119  NE21803   1.459   0.967   0.923   2.372   1.617   0.812   1.870
>  MET135   SD2041   3.480   2.751   1.316   0.606   2.919   2.121   0.993
>  MET162   SD2439   2.521   1.976   1.656   2.412   1.855   1.543   2.264
>  HIS172  NE22588   3.632   2.949   1.894   1.657   2.872   2.338   1.945
>  CYS174   SG2623   2.968   2.372   1.452   1.861   2.428   1.848   1.924
>  MET189   SD2891   2.167   2.379   2.736   4.000   2.754   2.569   3.722
>  HIS193  NE22942   2.003   2.001   2.490   3.686   2.049   2.075   3.396
>  HIS197  NE23011   2.012   1.634   1.830   2.896   1.554   1.426   2.614
>  HIS217  NE23329   2.545   2.376   2.831   3.805   2.039   2.305   3.575
>                    MET53   HIS62   HIS68  HIS109  HIS119  MET135  MET162
>                    SD777  NE2917 NE21002 NE21638 NE21803  SD2041  SD2439
>   HIS62   NE2917   1.363
>   HIS68  NE21002   2.107   1.482
>  HIS109  NE21638   2.365   1.568   1.372
>  HIS119  NE21803   1.688   0.976   0.584   1.078
>  MET135   SD2041   1.057   1.365   2.661   2.490   2.119
>  MET162   SD2439   1.878   0.871   1.805   2.246   1.520   1.861
>  HIS172  NE22588   1.721   1.401   2.829   2.860   2.359   1.067   1.342
>  CYS174   SG2623   1.694   0.725   2.140   2.152   1.681   1.297   0.745
>  MET189   SD2891   3.547   2.310   1.858   1.893   1.980   3.627   2.290
>  HIS193  NE22942   3.076   1.890   1.639   2.197   1.760   3.221   1.547
>  HIS197  NE23011   2.229   1.149   1.407   2.078   1.323   2.401   0.676
>  HIS217  NE23329   3.146   2.112   2.205   2.935   2.272   3.263   1.402
>                   HIS172  CYS174  MET189  HIS193  HIS197
>                  NE22588  SG2623  SD2891 NE22942 NE23011
>  CYS174   SG2623   0.826
>  MET189   SD2891   3.417   2.599
>  HIS193  NE22942   2.831   2.079   1.020
>  HIS197  NE23011   2.011   1.324   1.766   0.939
>  HIS217  NE23329   2.629   2.068   1.936   0.946   1.003
> Opening force field file ./amber99sb.ff/aminoacids.arn
> Opening force field file ./amber99sb.ff/dna.arn
> Opening force field file ./amber99sb.ff/rna.arn
> Checking for duplicate atoms....
> Now there are 3345 atoms. Deleted 1 duplicates.
> Now there are 213 residues with 3345 atoms
> Making bonds...
> Warning: Long Bond (988-989 = 0.453624 nm)
>
> WARNING: atom O1 is missing in residue CFY 66 in the pdb file
>
> -------------------------------------------------------
> Program pdb2gmx_d, VERSION 4.5.4
> Source code file: pdb2top.c, line: 1463
>
> Fatal error:
> There were 1 missing atoms in molecule Protein_chain_A, if you want to
> use this incomplete topology anyhow, use the option -missing
> For more information and tips for troubleshooting, please check the
> GROMACS website at http://www.gromacs.org/Documentation/Errors
>
> The strange thing is that I checked for this error, but atom O1 in
> residue CFY66 is present BOTH in the starting .pdb file (the one I used
> for pdb2gmx) AND in the aminoacids.rtp file! I checked 4 or 5 times,
> each time erasing the old file and checking the file IMMEDIATELY BEFORE
> submitting it to pdb2gmx. All atoms present in aminoacids.rtp for the
> CFY residue are also present in the .pdb file and vice versa, and I am
> sure I did not make the stupid error of naming the atom 01 (zero-one)
> instead of O1 (o-one).
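
A name-by-name comparison like the one described above can be automated so that it does not depend on reading two files by eye. This is a minimal Python sketch, assuming the usual .rtp layout (a "[ RESNAME ]" header followed by an "[ atoms ]" subsection whose first column is the atom name) and fixed-column PDB records. The function names and the subsection list are assumptions made for illustration, not part of GROMACS.

```python
# Subsection headers that can follow "[ atoms ]" inside a residue entry
# (assumed list; extend it if your force field uses others).
SUBSECTIONS = {"bonds", "angles", "dihedrals", "impropers", "exclusions", "cmap"}


def rtp_atom_names(rtp_text, residue):
    """Atom names listed under '[ atoms ]' of one residue entry in an .rtp file."""
    names, in_residue, in_atoms = [], False, False
    for raw in rtp_text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop ';' comments
        if not line:
            continue
        if line.startswith("["):
            section = line.strip("[] \t")
            if section == "atoms":
                in_atoms = in_residue
            elif section in SUBSECTIONS:
                in_atoms = False
            else:  # a new top-level entry such as "[ CFY ]"
                in_residue, in_atoms = (section == residue), False
            continue
        if in_atoms:
            names.append(line.split()[0])
    return names


def pdb_atom_names(pdb_lines, resname, resnum):
    """Atom names (PDB columns 13-16) of one residue in a PDB file."""
    return [l[12:16].strip() for l in pdb_lines
            if l.startswith(("ATOM", "HETATM"))
            and l[17:20].strip() == resname and int(l[22:26]) == resnum]


def compare_names(rtp_names, pdb_names):
    """Return (names only in the rtp entry, names only in the pdb residue)."""
    return (sorted(set(rtp_names) - set(pdb_names)),
            sorted(set(pdb_names) - set(rtp_names)))
```

If both returned lists are empty for CFY 66, the atom names themselves agree, and the "missing atom" message has to come from how a name is interpreted rather than from an atom that is really absent.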
>
> I suspect that this atom is the one that is deleted because it is
> recognized as a duplicate, but I'm not sure about it and I don't
> know how to check it. I am sure there are no duplicated atoms in CFY.
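
One way to check the input for duplicates independently of pdb2gmx is to count atom names per residue. The sketch below is an assumption-laden illustration: it treats a "duplicate" as the same atom name occurring more than once within one residue (same chain, residue number and residue name), which may not be exactly the criterion pdb2gmx applies. The function name is hypothetical.

```python
from collections import Counter, defaultdict


def duplicate_atom_names(pdb_lines):
    """Map (chain, resnum, resname) -> sorted atom names that occur more than once."""
    per_residue = defaultdict(Counter)
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            key = (line[21], int(line[22:26]), line[17:20].strip())
            per_residue[key][line[12:16].strip()] += 1
    duplicates = {}
    for key, counts in per_residue.items():
        repeated = sorted(name for name, n in counts.items() if n > 1)
        if repeated:
            duplicates[key] = repeated
    return duplicates


# Usage with the file name from this thread:
#   lines = open("prot_CFY_H.pdb").read().splitlines()
#   for residue, names in duplicate_atom_names(lines).items():
#       print(residue, names)
```

An empty result means that no residue in the input repeats an atom name, so whatever pdb2gmx deleted would have to be explained by something other than a repeated name in the file.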
>
>
> I feel like this is a "fake" error message (i.e., there is an error in
> my files, but it is not the one reported in the message: probably a
> problem occurs around this atom, but not exactly ON this atom).
> However, I am not able to find any errors.
>
>
This is indeed a false error. It comes from the fact that pdb2gmx
interprets anything named O1 or O2 as C-terminal atoms. Use any other name
you like aside from O1 or O2.
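
The rename can be done in a text editor, but a small script makes it harder to shift the fixed columns by accident. This is a minimal Python sketch, assuming standard PDB records with the atom name in columns 13-16 and names shorter than four characters starting in column 14. The function name and the replacement name OX1 are hypothetical. The same new name has to be used in the CFY entry of aminoacids.rtp (the atom line and every bonded entry that mentions O1), otherwise the two files no longer match.

```python
def rename_atom(pdb_lines, resname, old, new):
    """Rename atom `old` to `new` in every residue called `resname`,
    keeping the fixed-column layout of each record intact."""
    if not 1 <= len(new) <= 4:
        raise ValueError("PDB atom names must be 1 to 4 characters long")
    # Names shorter than 4 characters conventionally start in column 14.
    field = new if len(new) == 4 else f" {new:<3s}"
    renamed = []
    for line in pdb_lines:
        if (line.startswith(("ATOM", "HETATM"))
                and line[17:20].strip() == resname
                and line[12:16].strip() == old):
            line = line[:12] + field + line[16:]
        renamed.append(line)
    return renamed


# Usage with the file name from this thread (OX1 is only an example name):
#   lines = open("prot_CFY_H.pdb").read().splitlines()
#   lines = rename_atom(lines, "CFY", "O1", "OX1")
#   open("prot_CFY_H.pdb", "w").write("\n".join(lines) + "\n")
```

Only the four-character name field is rewritten, so coordinates and residue fields stay in their original columns.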

-Justin

> BTW the "long bond" in the other warning message does not involve
> residue CFY.
>
> Any help is welcome.
>
> Thank you so much.
>
> Anna
>
> On 21.03.2013 12:00, gmx-users-request at gromacs.org wrote:
>
> >> Dear gmx-users, for about two weeks I have been trying to solve this
> >> problem, and I can't, so I'm asking for your help. I want to do some
> >> MD simulations on a protein of the green fluorescent protein family.
> >> This protein, as you know, has a chromophore (CFY) derived from four
> >> residues of the protein (F64-C65-Y66-G67) and covalently bound to
> >> the rest of the protein chain. How can I parametrize this object,
> >> since it is not recognized by pdb2gmx? I looked at the gmx-users
> >> list and the suggestion was to create a new entry in the .rtp file
> >> of the selected force field.
> >
> > Indeed, this kind of problem is most easily solved by making a new
> > "residue" that contains the whole chromophore, such that it links to
> > its neighbours with normal peptide links.
> ------------------------------
> Message: 5
> Date: Thu, 21 Mar 2013 11:46:12 +0100
> From: Mark Abraham <mark.j.abraham at gmail.com [2]>
> Subject: Re: [gmx-users] help with chromophore of a GFP
> To: Discussion list for GROMACS users <gmx-users at gromacs.org [3]>
> Message-ID:
> <CAMNuMASicyMGiVb_x5sY1YB44th8VKNioQVhzDqq-tAm9TnRqQ at mail.gmail.com [4]>
> Content-Type: text/plain; charset=ISO-8859-1
>
> On Wed, Mar 20, 2013 at 6:01 PM, Anna MARABOTTI
> <amarabotti at unisa.it [5]> wrote:
> >
> >> I decided to use Amber99SB since it seemed the best for my purpose,
> >> then I started trying to parameterize the chromophore. This is what
> >> I did:
> >>
> >> * I used Pymol to add H to my pdb file, since I want to use an all-H
> >>   force field and since Antechamber (see below) does not work
> >>   without H
> >> * I extracted the segment V63-CFY-H68 from my .pdb file. I did this
> >>   because, when I extracted CFY only, I had problems with the termini
> >> * Following the Antechamber tutorial, I used Antechamber (using the
> >>   traditional Amber force field, not GAFF) to calculate charges and
> >>   to assign atom types to this segment.
> >> * I used these calculated parameters to add the CFY residue to
> >>   aminoacids.rtp in the amber99sb.ff directory.
> >> * I also tried to modify aminoacids.hdb, but since it seemed too
> >>   complicated to me, I decided to keep it unchanged and to give
> >>   pdb2gmx the protein with H already present
> >> * No need to add new atom/bond types to ffbonded.itp and
> >>   ffnonbonded.itp: they all seem to be present. Since CFY is bound to
> >>   the rest of the protein with common peptide bonds, I did not change
> >>   specbond.dat either.
> >> * I added CFY to residuetypes.dat with the specification "Protein"
> >>
> >> In my opinion, all was ready to go, instead... When I ran pdb2gmx on
> >> my protein with H added by PyMol, I immediately got an error:
> >>
> >> Fatal error: Atom H01 in residue SER 3 was not found in rtp entry
> >> NSER with 13 atoms while sorting atoms. For a hydrogen, this can be
> >> a different protonation state, or it might have had a different
> >> number in the PDB file and was rebuilt (it might for instance have
> >> been H3, and we only expected H1 & H2). Note that hydrogens might
> >> have been added to the entry for the N-terminus. Remove this
> >> hydrogen or choose a different protonation state to solve it. Option
> >> -ignh will ignore all hydrogens in the input. For more information
> >> and tips for troubleshooting, please check the GROMACS website at
> >> http://www.gromacs.org/Documentation/Errors [1]
> >>
> >> From this error I understand that:
> >> * the naming of H in PyMol is different from the naming of H in
> >>   Amber (read from aminoacids.rtp); in order to correct this error,
> >>   I should add -ignh to ignore the H in the input.
> >
> > pdb2gmx has to be able to make sense of the atom naming. There are
> > lots of different conventions for how to name atoms, particularly
> > hydrogen atoms. pdb2gmx can't possibly encode the logic to convert
> > all of those conventions. So the path of least resistance can be to
> > ignore hydrogens and regenerate them according to the generation
> > rules.
> >
> > However, you can just rename them in the input file so that pdb2gmx
> > understands your meaning. The NSER entry in the .rtp file shows you
> > the names pdb2gmx expects. If you edit the names of those hydrogen
> > atoms (probably H01, H02, H03) in your input coordinate file
> > accordingly (to H1, H2, H3), things will be fine. Be sure you don't
> > break the required column formatting of the coordinate file!
> >
> > *
>
>
>
> Links:
> ------
> [1] http://www.gromacs.org/Documentation/Errors
> [2] mailto:mark.j.abraham at gmail.com
> [3] mailto:gmx-users at gromacs.org
> [4] mailto:CAMNuMASicyMGiVb_x5sY1YB44th8VKNioQVhzDqq-tAm9TnRqQ at mail.gmail.com
> [5] mailto:amarabotti at unisa.it
>

-- 

========================================

Justin A. Lemkul, Ph.D.
Research Scientist
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin

========================================