Molecular Biology and Evolution 19:2294-2307 (2002)
© 2002 Society for Molecular Biology and Evolution
Combining Multiple Data Sets in a Likelihood Analysis: Which Models are the Best?


The Institute of Statistical Mathematics, Tokyo, Japan;
Molecular Evolution Laboratory, Faculty of Bioscience and Biotechnology, Tokyo Institute of Technology, Japan
Until recently, phylogenetic analyses were routinely based on homologous sequences of a single gene. Given the vast number of gene sequences now available, phylogenetic studies increasingly rely on the analysis of multiple genes, so it has become necessary to devise statistical methods for combining multiple molecular data sets. Here, we compare several models for combining different genes for the purpose of evaluating the likelihood of tree topologies. Three methods of branch length estimation were studied: assuming that all genes have the same branch lengths (concatenate model), assuming that branch lengths are proportional among genes (proportional model), and assuming that each gene has a separate set of branch lengths (separate model). We also compared three models of among-site rate variation: the homogeneous model, a model that assumes one gamma parameter for all genes, and a model that assumes one gamma parameter for each gene. On the basis of two nuclear and one mitochondrial amino acid data sets, our results suggest that, depending on the data set chosen, either the separate model or the proportional model is the most appropriate for branch length estimation. For all the data sets examined, one gamma parameter for each gene is the best model of among-site rate variation. Using these models, we analyzed alternative mammalian tree topologies, and we describe the effect of the assumed model on the maximum likelihood tree. We show that the choice of model affects which phylogeny is inferred to be the best.
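The three branch length models and the three rate variation models differ mainly in how many free parameters they fit, which is what makes a penalized comparison necessary. The following is a minimal sketch of that bookkeeping, not the authors' implementation. It assumes an unrooted bifurcating tree (2n - 3 branches for n taxa), ignores any substitution-matrix parameters, and uses the Akaike Information Criterion purely as an illustrative criterion, since the abstract does not name the one used. The function names and model labels are hypothetical.

```python
def branch_length_params(model: str, n_taxa: int, n_genes: int) -> int:
    """Free branch-length parameters for an unrooted bifurcating tree."""
    n_branches = 2 * n_taxa - 3
    if model == "concatenate":
        # One set of branch lengths shared by all genes.
        return n_branches
    if model == "proportional":
        # One shared set plus a relative rate for each gene except the first.
        return n_branches + (n_genes - 1)
    if model == "separate":
        # An independent set of branch lengths for every gene.
        return n_branches * n_genes
    raise ValueError(f"unknown branch length model: {model}")


def gamma_params(model: str, n_genes: int) -> int:
    """Free gamma shape parameters for among-site rate variation."""
    if model == "homogeneous":
        return 0          # no rate variation among sites
    if model == "shared":
        return 1          # one gamma parameter for all genes
    if model == "per_gene":
        return n_genes    # one gamma parameter for each gene
    raise ValueError(f"unknown rate variation model: {model}")


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion (assumed criterion; lower is better)."""
    return -2.0 * log_likelihood + 2.0 * n_params
```

Under these assumptions, a combined model's parameter count is the sum of the two functions, and the summed log-likelihood over genes would be passed to `aic` to compare, for example, the proportional and separate models on the same topology.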



