MBE Advance Access originally published online on March 2, 2005
Molecular Biology and Evolution 2005 22(5):1337-1344; doi:10.1093/molbev/msi121
Research Article
More Genes or More Taxa? The Relative Contribution of Gene Number and Taxon Number to Phylogenetic Accuracy
Howard Hughes Medical Institute and Laboratory of Molecular Biology, University of Wisconsin-Madison
E-mail: sbcarrol{at}wisc.edu.
The relative contribution of taxon number and gene number to accuracy in phylogenetic inference is a major issue in phylogenetics and of central importance to the choice of experimental strategies for the successful reconstruction of a broad sketch of the tree of life. Maximization of the number of taxa sampled is the strategy favored by most phylogeneticists, although its necessity remains the subject of debate. Vast increases in gene number are now possible due to advances in genomics, but large numbers of genes will be available for only modest numbers of taxa, raising the question of whether such genome-scale phylogenies will be robust to the addition of taxa. To examine the relative benefit of increasing taxon number or gene number to phylogenetic accuracy, we have developed an assay that utilizes the symmetric difference tree distance as a measure of phylogenetic accuracy. We have applied this assay to a genome-scale data matrix containing 106 genes from 14 yeast species. Our results show that increasing taxon number correlates with a slight decrease in phylogenetic accuracy. In contrast, increasing gene number has a significant positive effect on phylogenetic accuracy. Analyses of an additional taxon-rich data matrix from the same yeast clade show that taxon number does not have a significant effect on phylogenetic accuracy. The positive effect of gene number and the lack of effect of taxon number on phylogenetic accuracy are also corroborated by analyses of two data matrices from mammals and angiosperm plants, respectively. We conclude that, for typical data sets, the number of genes utilized may be a more important determinant of phylogenetic accuracy than taxon number.
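The abstract names the symmetric difference tree distance as its accuracy measure but does not spell out how it is computed. The following is a minimal sketch under the usual definition: count the non-trivial bipartitions (splits) found in one unrooted tree but not the other. The nested-tuple tree encoding, the function names, and the five-taxon example are assumptions made for illustration; this is not the authors' assay and applies no normalization.

```python
from typing import FrozenSet, Set, Tuple, Union

# Hypothetical encoding: a leaf is a taxon name, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of taxon names at the tips of the tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    result: FrozenSet[str] = frozenset()
    for child in tree:
        result |= leaves(child)
    return result


def splits(tree: Tree) -> Set[FrozenSet[str]]:
    """Return the non-trivial bipartitions of the tree, treated as unrooted.

    Each split is stored as the side that does NOT contain an anchor taxon,
    so the same bipartition always gets the same representation.
    """
    all_taxa = leaves(tree)
    anchor = min(all_taxa)  # arbitrary but deterministic reference taxon
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = all_taxa - clade if anchor in clade else clade
        # Keep only splits with at least two taxa on each side.
        if 1 < len(side) < len(all_taxa) - 1:
            found.add(side)
        return clade

    visit(tree)
    return found


def symmetric_difference(tree_a: Tree, tree_b: Tree) -> int:
    """Count splits present in exactly one of the two trees."""
    if leaves(tree_a) != leaves(tree_b):
        raise ValueError("trees must share the same taxon set")
    return len(splits(tree_a) ^ splits(tree_b))


# Illustrative five-taxon trees (made-up labels, not data from the study).
reference = ((("A", "B"), "C"), ("D", "E"))
estimate = ((("A", "C"), "B"), ("D", "E"))
rerooted = (("A", "B"), ("C", ("D", "E")))

print(symmetric_difference(reference, estimate))  # one split differs in each tree
print(symmetric_difference(reference, rerooted))  # same unrooted topology
```

A distance of zero means the two trees share every internal split; larger values mean more conflicting groupings. In an accuracy assay of this kind, an inferred tree would be compared against a reference topology.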
Key Words: phylogenetics; taxon number; gene number; phylogenetic accuracy; tree of life; genomics