MBE Advance Access published online on March 30, 2007
Molecular Biology and Evolution, doi:10.1093/molbev/msm064
© 2007 The Authors
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Research Article
An Empirical Codon Model for Protein Sequence Evolution

# EMBL-European Bioinformatics Institute, Hinxton, U.K.
Department of Bioengineering, University of California, Berkeley, USA
* Current address and corresponding author: Carolin Kosiol, Department of Biological Statistics and Computational Biology, 169 Biotechnology Building, Cornell University, Ithaca, NY 14853, USA. tel: +1-607-255 7430, fax: +1-607-255 4698, e-mail: ck285{at}cornell.edu
Received for publication November 16, 2006. Revision received March 16, 2007. Accepted for publication March 19, 2007.
In the past, two kinds of Markov models have been considered to describe protein sequence evolution. Codon-level models have been mechanistic, with a small number of parameters designed to take into account features such as transition-transversion bias, codon frequency bias and synonymous-nonsynonymous amino acid substitution bias. Amino acid models have been empirical, attempting to summarize the replacement patterns observed in large quantities of data without explicitly considering the distinct factors that shape protein evolution. We have estimated the first empirical codon model. Previous codon models assume that protein evolution proceeds only by successive single nucleotide substitutions, but our results indicate that model accuracy is significantly improved by incorporating instantaneous doublet and triplet changes. We also find that the affiliations between codons, the amino acid each encodes and the physico-chemical properties of the amino acids are main factors driving the process of codon evolution. Neither multiple nucleotide changes, nor the strong influence of the genetic code, nor amino acids' physico-chemical properties forms part of standard mechanistic models and their views of how codon evolution proceeds. We have implemented the empirical codon model for likelihood-based phylogenetic analysis, and an assessment of its ability to describe protein evolution shows that it consistently outperforms comparable mechanistic codon models. We point out the biological interpretation of our empirical codon model and possible consequences for studies of selection.
Key Words: protein evolution; codon models; Markov models; maximum likelihood; phylogenetic inference
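To make the contrast between single-nucleotide and multiple-nucleotide changes concrete, the following Python sketch counts how many pairs of sense codons differ at one, two or three positions. It is an illustration only and not the authors' implementation: it assumes the standard genetic code with stop codons TAA, TAG and TGA, and the function names (`sense_codons`, `n_differences`, `count_pairs_by_differences`) are hypothetical.

```python
from collections import Counter
from itertools import combinations, product

BASES = "TCAG"
# Assumption: standard genetic code, whose three stop codons are excluded
# from the codon state space.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def sense_codons():
    """Return the sense (non-stop) codons of the assumed genetic code."""
    codons = ("".join(triplet) for triplet in product(BASES, repeat=3))
    return [codon for codon in codons if codon not in STOP_CODONS]


def n_differences(codon_a, codon_b):
    """Number of nucleotide positions at which two codons differ (1 to 3)."""
    return sum(x != y for x, y in zip(codon_a, codon_b))


def count_pairs_by_differences():
    """Count unordered sense-codon pairs by number of differing positions."""
    pairs = combinations(sense_codons(), 2)
    return dict(Counter(n_differences(a, b) for a, b in pairs))


if __name__ == "__main__":
    counts = count_pairs_by_differences()
    for k in sorted(counts):
        print(f"{k} nucleotide difference(s): {counts[k]} codon pairs")
```

Under these assumptions, 263 of the 1830 unordered codon pairs differ at a single position, 784 at two positions and 783 at three. A model restricted to single nucleotide substitutions therefore assigns a nonzero instantaneous rate to only a small fraction of codon pairs, whereas a model that allows doublet and triplet changes can assign one to any pair.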