MBE Advance Access originally published online on March 10, 2004
Mol. Biol. Evol. 21(6):1095-1109. 2004
DOI: 10.1093/molbev/msh112
© 2004 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038
A Bayesian Mixture Model for Across-Site Heterogeneities in the Amino-Acid Replacement Process
Canadian Institute for Advanced Research, Département de Biochimie, Université de Montréal, Montréal, Québec, Canada
E-mail: nicolas.lartillot{at}lirmm.fr
Most current models of sequence evolution assume that all sites of a protein evolve under the same substitution process, characterized by a 20 x 20 substitution matrix. Here, we propose to relax this assumption by developing a Bayesian mixture model that allows the amino-acid replacement pattern at different sites of a protein alignment to be described by distinct substitution processes. Our model, named CAT, assumes the existence of distinct processes (or classes) differing by their equilibrium frequencies over the 20 residues. Through the use of a Dirichlet process prior, the total number of classes and their respective amino-acid profiles, as well as the affiliations of each site to a given class, are all free variables of the model. In this way, the CAT model is able to adapt to the complexity actually present in the data, and it yields an estimate of the substitutional heterogeneity through the posterior mean number of classes. We show that a significant level of heterogeneity is present in the substitution patterns of proteins, and that the standard one-matrix model fails to account for this heterogeneity. By evaluating the Bayes factor, we demonstrate that the standard model is outperformed by CAT on all of the data sets which we analyzed. Altogether, these results suggest that the complexity of the pattern of substitution of real sequences is better captured by the CAT model, offering the possibility of studying its impact on phylogenetic reconstruction and its connections with structure-function determinants.
Key Words: phylogeny, Bayes, Dirichlet process mixtures, amino-acid replacement, Bayes factor, posterior predictive resampling
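As a rough illustration of how a Dirichlet process prior can leave the number of classes free, the following sketch draws site-to-class affiliations with the Chinese restaurant process and gives each class a profile over the 20 amino acids. It is not the authors' implementation: the function names, the fixed concentration parameter `alpha`, and the flat Dirichlet used as the base distribution for the profiles are all assumptions made here for illustration, and the abstract does not specify them.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 residues


def sample_dirichlet(rng, concentration, size):
    """Draw one point from a symmetric Dirichlet via normalized gamma variates."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def sample_cat_prior(n_sites, alpha, rng):
    """Sample affiliations and class profiles from a Dirichlet process prior.

    Chinese restaurant process: each site joins an existing class with
    probability proportional to that class's current size, or opens a new
    class with probability proportional to alpha (assumed fixed here).
    """
    allocations = []   # class index of each site
    class_sizes = []   # number of sites currently in each class
    profiles = []      # equilibrium frequencies of each class
    for _ in range(n_sites):
        weights = class_sizes + [alpha]  # last entry = "open a new class"
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(class_sizes):
            class_sizes.append(0)
            # Assumption: flat Dirichlet base distribution over profiles.
            profiles.append(sample_dirichlet(rng, 1.0, len(AMINO_ACIDS)))
        class_sizes[k] += 1
        allocations.append(k)
    return allocations, profiles


rng = random.Random(0)
allocations, profiles = sample_cat_prior(n_sites=200, alpha=2.0, rng=rng)
print("number of classes drawn from the prior:", len(profiles))
```

Under this prior the number of classes is not fixed in advance; it grows with the number of sites and with `alpha`. In a full analysis the affiliations and profiles would be sampled from the posterior given the alignment, which this sketch does not attempt.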