MBE Advance Access originally published online on January 29, 2008
Molecular Biology and Evolution 2008 25(5):842-858; doi:10.1093/molbev/msn018
Research Articles
A Site- and Time-Heterogeneous Model of Amino Acid Replacement
Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier, UMR 5506, CNRS-Université de Montpellier 2, Montpellier, France
E-mail: samuel.blanquart{at}lirmm.fr.
Accepted for publication January 20, 2008.
We combined the category (CAT) mixture model (Lartillot N, Philippe H. 2004) and the nonstationary break point (BP) model (Blanquart S, Lartillot N. 2006) into a new model, CAT–BP, which accounts for variations of the evolutionary process both along the sequence and across lineages. As in CAT, the model implements a mixture of distinct Markovian substitution processes distributed among sites, thus accommodating site-specific selective constraints induced by protein structure and function. Furthermore, as in BP, these processes are nonstationary, and their equilibrium frequencies are allowed to change along lineages in a correlated way, through discrete shifts in global amino acid composition distributed along the phylogenetic tree. We implemented the CAT–BP model in a Bayesian Markov chain Monte Carlo framework and compared its predictions with those of 3 simpler models: BP, CAT, and the site- and time-homogeneous general time–reversible (GTR) model. The comparison used a concatenation of 4 mitochondrial proteins from 20 arthropod species. In contrast to GTR, BP, and CAT, which all display a phylogenetic reconstruction artifact positioning the bees Apis mellifera and Melipona bicolor among chelicerates, the CAT–BP model recovers the monophyly of insects. Using posterior predictive tests, we further show that the CAT–BP combination better predicts site- and taxon-specific amino acid frequencies and better accounts for the homoplasies responsible for the artifact. Altogether, our results show that jointly modeling heterogeneities across sites and along time yields a synergistic improvement in phylogenetic inference, indicating that it is essential to disentangle the combined effects of both sources of heterogeneity in order to overcome systematic errors in protein phylogenetic analyses.
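As a minimal illustration of the two sources of heterogeneity described above, the following Python sketch follows the expected state composition along a single lineage when sites are drawn from a mixture of processes and each process undergoes a discrete shift in equilibrium frequencies at a break point. It is a toy under loudly stated assumptions, not the authors' implementation or the actual CAT–BP parameterization: a 4-state alphabet stands in for the 20 amino acids, each process is assumed to be F81-like (fully determined by its equilibrium frequencies), a single lineage replaces the tree, and all names (`f81_transition`, `tip_composition`, `mixture_tip_composition`) and numerical values are hypothetical.

```python
import math


def f81_transition(pi, t):
    """Transition matrix after time t for an F81-like process (assumed form):
    P[i][j] = exp(-t) * [i == j] + (1 - exp(-t)) * pi[j]."""
    decay = math.exp(-t)
    n = len(pi)
    return [
        [decay * (1.0 if i == j else 0.0) + (1.0 - decay) * pi[j] for j in range(n)]
        for i in range(n)
    ]


def propagate(dist, matrix):
    """Push a state distribution through one transition matrix."""
    n = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def tip_composition(root_dist, segments):
    """Expected composition at the tip of one lineage.

    segments is a list of (pi, t) pairs; every new pair models a break point
    after which the process has a different equilibrium pi (nonstationarity)."""
    dist = list(root_dist)
    for pi, t in segments:
        dist = propagate(dist, f81_transition(pi, t))
    return dist


def mixture_tip_composition(weights, root_dists, lineages):
    """Average the tip composition over mixture categories (site heterogeneity)."""
    n = len(root_dists[0])
    total = [0.0] * n
    for w, root, segments in zip(weights, root_dists, lineages):
        tip = tip_composition(root, segments)
        total = [acc + w * x for acc, x in zip(total, tip)]
    return total


# Two hypothetical site categories, each with its own profile before and
# after a single break point placed halfway along the lineage.
before = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]]
after = [[0.1, 0.7, 0.1, 0.1], [0.1, 0.1, 0.7, 0.1]]
weights = [0.5, 0.5]
lineages = [[(b, 0.5), (a, 0.5)] for b, a in zip(before, after)]

composition = mixture_tip_composition(weights, before, lineages)
print(composition)
```

Without a break point, a category that starts at its own equilibrium keeps that composition, so any departure in `composition` from the weighted average of the `before` profiles is produced by the shift alone; this is the kind of lineage-specific compositional drift that a time-homogeneous model cannot represent.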
Key Words: phylogeny; MCMC; nonstationary; mixture; posterior predictive; model violation; LBA
Andrew Roger, Associate Editor


