MBE Advance Access published online on January 29, 2008
Molecular Biology and Evolution, doi:10.1093/molbev/msn018
Research Article
A Site- and Time-Heterogeneous Model of Amino-Acid Replacement
Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier, UMR 5506, CNRS-Université de Montpellier 2, 161, rue Ada, 34392 Montpellier Cedex 5, France
1 Corresponding author: samuel.blanquart{at}lirmm.fr
Received for publication August 14, 2007. Revision received December 17, 2007. Revision received January 10, 2008. Accepted for publication January 20, 2008.
We combined the CAT mixture model (Lartillot and Philippe 2004) and the non-stationary BP model (Blanquart and Lartillot 2006) into a new model, CAT-BP, accounting for variations of the evolutionary process both along the sequence and across lineages. As in CAT, the model implements a mixture of distinct Markovian processes of substitution distributed among sites, thus accommodating site-specific selective constraints induced by protein structure and function. Furthermore, as in BP, these processes are non-stationary, and their equilibrium frequencies are allowed to change along lineages in a correlated way, through discrete shifts in global amino acid composition distributed along the phylogenetic tree.
We implemented the CAT-BP model in a Bayesian Markov Chain Monte Carlo framework, and compared its predictions with those of three simpler models, BP, CAT, and the site- and time-homogeneous GTR model, on a concatenation of four mitochondrial proteins of 20 arthropod species. In contrast to GTR, BP and CAT, which all display a phylogenetic reconstruction artefact positioning the bees Apis m. and Melipona b. among chelicerates, the CAT-BP model is able to recover the monophyly of insects. Using posterior predictive tests, we further show that the CAT-BP combination better predicts site- and taxon-specific amino acid frequencies, and that it better accounts for the homoplasies that are responsible for the artefact.
Altogether, our results show that the joint modelling of heterogeneities across sites and along time results in a synergistic improvement of the phylogenetic inference, indicating that it is essential to disentangle the combined effects of both sources of heterogeneity, in order to overcome systematic errors in protein phylogenetic analyses.
Key Words: phylogeny, MCMC, non-stationary, mixture, posterior predictive, model violation, LBA
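The posterior predictive tests mentioned in the abstract follow a general recipe: compute a summary statistic on the observed alignment, compute the same statistic on alignments simulated under parameter values drawn from the posterior, and report how often the simulated value is at least as extreme as the observed one. The following is a minimal sketch of that recipe only. The statistic shown (largest squared deviation of any taxon's amino-acid composition from the global composition) and the function names `composition`, `max_taxon_deviation` and `posterior_predictive_p` are illustrative assumptions, not the statistics or code used by the authors, and the simulation of replicates under CAT-BP is not shown.

```python
from collections import Counter


def composition(seqs):
    """Relative amino-acid frequencies pooled over the given sequences."""
    counts = Counter("".join(seqs))
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def max_taxon_deviation(alignment):
    """Hypothetical statistic: largest squared deviation of any taxon's
    composition from the global composition.

    alignment: dict mapping taxon name -> aligned sequence (string).
    """
    global_freq = composition(alignment.values())
    worst = 0.0
    for seq in alignment.values():
        taxon_freq = composition([seq])
        dev = sum((taxon_freq.get(aa, 0.0) - f) ** 2
                  for aa, f in global_freq.items())
        worst = max(worst, dev)
    return worst


def posterior_predictive_p(observed, replicates, statistic):
    """Fraction of replicates whose statistic is >= the observed value.

    replicates: alignments assumed to be simulated elsewhere, one per
    posterior sample of the model parameters.
    """
    obs = statistic(observed)
    hits = sum(1 for rep in replicates if statistic(rep) >= obs)
    return hits / len(replicates)
```

A value close to 0 would indicate that the model rarely reproduces compositional heterogeneity as strong as that seen in the data, which is the kind of model violation such a test is meant to expose.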