
Molecular Biology and Evolution 19:2294-2307 (2002)
© 2002 Society for Molecular Biology and Evolution

Combining Multiple Data Sets in a Likelihood Analysis: Which Models are the Best?

Tal Pupko*,2, Dorothée Huchon†, Ying Cao*, Norihiro Okada† and Masami Hasegawa*

*The Institute of Statistical Mathematics, Tokyo, Japan;
†Molecular Evolution Laboratory, Faculty of Bioscience and Biotechnology, Tokyo Institute of Technology, Japan

Until recently, phylogenetic analyses were routinely based on homologous sequences of a single gene. Given the vast number of gene sequences available, phylogenetic studies are now based on the analysis of multiple genes. Thus, it has become necessary to devise statistical methods for combining multiple molecular data sets. Here, we compare several models for combining different genes when evaluating the likelihood of tree topologies. Three methods of branch length estimation were studied: assuming that all genes have the same branch lengths (concatenate model), assuming that branch lengths are proportional among genes (proportional model), or assuming that each gene has a separate set of branch lengths (separate model). We also compared three models of among-site rate variation: the homogeneous model, a model that assumes one gamma parameter for all genes, and a model that assumes one gamma parameter for each gene. On the basis of two nuclear and one mitochondrial amino acid data sets, our results suggest that, depending on the data set chosen, either the separate model or the proportional model represents the most appropriate method for branch length analysis. For all the data sets examined, one gamma parameter for each gene represents the best model for among-site rate variation. Using these models we analyzed alternative mammalian tree topologies, and we describe the effect of the assumed model on the maximum likelihood tree. We show that the choice of the model has an impact on the best phylogeny obtained.
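
The trade-off among the three branch length models and the three rate variation models is largely one of parameter count against fit. The following is a minimal sketch in Python, not the authors' implementation. It assumes an unrooted, fully resolved tree (2n - 3 branches for n taxa), a proportional model parameterized with one scale factor per gene with the first fixed to 1, and the Akaike information criterion (AIC) as the comparison criterion; the abstract itself does not state which criterion was used. Substitution model parameters are ignored, and all function and model names are hypothetical.

```python
def branch_length_params(model: str, n_taxa: int, n_genes: int) -> int:
    """Free branch length parameters, assuming an unrooted binary tree."""
    n_branches = 2 * n_taxa - 3
    if model == "concatenate":
        # One shared set of branch lengths for all genes.
        return n_branches
    if model == "proportional":
        # One shared set plus a rate multiplier per gene;
        # the first multiplier is fixed to 1 (assumed parameterization).
        return n_branches + (n_genes - 1)
    if model == "separate":
        # An independent set of branch lengths for each gene.
        return n_genes * n_branches
    raise ValueError(f"unknown branch length model: {model}")


def gamma_params(model: str, n_genes: int) -> int:
    """Free gamma shape parameters for among-site rate variation."""
    if model == "homogeneous":
        return 0          # no rate variation among sites
    if model == "shared_gamma":
        return 1          # one shape parameter for all genes
    if model == "gene_gamma":
        return n_genes    # one shape parameter per gene
    raise ValueError(f"unknown rate variation model: {model}")


def total_params(branch_model: str, rate_model: str,
                 n_taxa: int, n_genes: int) -> int:
    """Total free parameters counted in this sketch."""
    return (branch_length_params(branch_model, n_taxa, n_genes)
            + gamma_params(rate_model, n_genes))


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion; lower values indicate a better model."""
    return -2.0 * log_likelihood + 2.0 * n_params


if __name__ == "__main__":
    # Hypothetical dimensions: 10 taxa, 3 genes.
    for branch_model in ("concatenate", "proportional", "separate"):
        for rate_model in ("homogeneous", "shared_gamma", "gene_gamma"):
            k = total_params(branch_model, rate_model, n_taxa=10, n_genes=3)
            print(f"{branch_model:12s} {rate_model:12s} k = {k}")
```

With 10 taxa and 3 genes, the concatenate, proportional, and separate models have 17, 19, and 51 branch length parameters respectively. Under these assumptions, a richer model is preferred only when its maximized log-likelihood improves enough to offset the penalty of 2 per added parameter in `aic`.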




