MBE Advance Access originally published online on March 30, 2007
Molecular Biology and Evolution 2007 24(7):1464-1479; doi:10.1093/molbev/msm064
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Research Articles |
An Empirical Codon Model for Protein Sequence Evolution

* European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, United Kingdom
Department of Bioengineering, University of California, Berkeley, Cambridge
E-mail: ck285{at}cornell.edu.
Accepted for publication March 19, 2007.
In the past, 2 kinds of Markov models have been considered to describe protein sequence evolution. Codon-level models have been mechanistic with a small number of parameters designed to take into account features, such as transitiontransversion bias, codon frequency bias, and synonymousnonsynonymous amino acid substitution bias. Amino acid models have been empirical, attempting to summarize the replacement patterns observed in large quantities of data and not explicitly considering the distinct factors that shape protein evolution. We have estimated the first empirical codon model (ECM). Previous codon models assume that protein evolution proceeds only by successive single nucleotide substitutions, but our results indicate that model accuracy is significantly improved by incorporating instantaneous doublet and triplet changes. We also find that the affiliations between codons, the amino acid each encodes and the physicochemical properties of the amino acids are main factors driving the process of codon evolution. Neither multiple nucleotide changes nor the strong influence of the genetic code nor amino acids' physicochemical properties form a part of standard mechanistic models and their views of how codon evolution proceeds. We have implemented the ECM for likelihood-based phylogenetic analysis, and an assessment of its ability to describe protein evolution shows that it consistently outperforms comparable mechanistic codon models. We point out the biological interpretation of our ECM and possible consequences for studies of selection.
Key Words: protein evolution codon models Markov models maximum likelihood phylogenetic inference
1 Present address: Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York.
Arndt von Haeseler, Associate Editor
![]()
CiteULike
Connotea
Del.icio.us What's this?
This article has been cited by other articles:
![]() |
S. L. Kosakovsky Pond, A. F.Y. Poon, A. J. Leigh Brown, and S. D.W. Frost A Maximum Likelihood Method for Detecting Directional Evolution in Protein Sequences and Its Application to Influenza A Virus Mol. Biol. Evol., September 1, 2008; 25(9): 1809 - 1824. [Abstract] [Full Text] [PDF] |
||||
![]() |
L. Barquist and I. Holmes xREI: a phylo-grammar visualization webserver Nucleic Acids Res., July 1, 2008; 36(suppl_2): W65 - W69. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Q. Le and O. Gascuel An Improved General Amino Acid Replacement Matrix Mol. Biol. Evol., July 1, 2008; 25(7): 1307 - 1320. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. T. Saunders and P. Green Insights from Modeling Protein Evolution with Context-Dependent Mutation and Asymmetric Amino Acid Selection Mol. Biol. Evol., December 1, 2007; 24(12): 2632 - 2647. [Abstract] [Full Text] [PDF] |
||||

