MBE Advance Access originally published online on November 9, 2006
Molecular Biology and Evolution 2007 24(2):412-426; doi:10.1093/molbev/msl170
Research Articles
Bayesian Estimation of Concordance among Gene Trees





* Department of Statistics, University of Wisconsin
Department of Botany, University of Wisconsin
Department of Biology, Duke University
Microbial Sequencing Center, The Broad Institute of MIT and Harvard
E-mail: ane{at}wisc.edu.
Accepted for publication November 3, 2006.
Multigene sequence data have great potential for elucidating important and interesting evolutionary processes, but statistical methods for extracting information from such data remain limited. Although various biological processes may cause different genes to have different genealogical histories (and hence different tree topologies), we may also expect that the number of distinct topologies among a set of genes is relatively small compared with the number of possible topologies. Therefore, evidence about the tree topology for one gene should influence our inference of the tree topology for a different gene, but to what extent? In this paper, we present a new approach for modeling and estimating concordance among a set of gene trees given aligned molecular sequence data. Our approach introduces a one-parameter probability distribution to describe the prior distribution of concordance among gene trees. We describe a novel 2-stage Markov chain Monte Carlo (MCMC) method that first obtains independent Bayesian posterior probability distributions for individual genes using standard methods. These posterior distributions are then used as input for a second MCMC procedure that estimates a posterior distribution of gene-to-tree maps (GTMs). The posterior distribution of GTMs can then be summarized to provide revised posterior probability distributions for each gene (taking account of concordance) and to allow estimation of the proportion of the sampled genes for which any given clade is true (the sample-wide concordance factor). Further, under the assumption that the sampled genes are drawn randomly from a genome of known size, we show how one can obtain an estimate, with credibility intervals, of the proportion of the entire genome for which a clade is true (the genome-wide concordance factor). We demonstrate the method on a set of 106 genes from 8 yeast species.
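To make the sample-wide concordance factor concrete, the following minimal Python sketch computes it from a set of sampled gene-to-tree maps. It is an illustration under stated assumptions, not the authors' implementation: the function names (`sample_wide_cf`, `posterior_mean_cf`), the representation of a tree as a set of clades, and the toy 4-taxon data are all hypothetical, and the second-stage MCMC that would actually produce the GTM samples is not shown.

```python
from typing import FrozenSet, Sequence

# Assumed representation: a clade is a set of taxon names, and a tree
# topology is summarised by the set of clades it contains.
Clade = FrozenSet[str]
Topology = FrozenSet[Clade]


def sample_wide_cf(gtm: Sequence[Topology], clade: Clade) -> float:
    """Proportion of sampled genes whose assigned tree contains `clade`.

    `gtm` is one gene-to-tree map: entry i is the topology assigned to gene i.
    """
    return sum(clade in tree for tree in gtm) / len(gtm)


def posterior_mean_cf(gtm_samples: Sequence[Sequence[Topology]], clade: Clade) -> float:
    """Average the sample-wide concordance factor over sampled GTMs."""
    values = [sample_wide_cf(gtm, clade) for gtm in gtm_samples]
    return sum(values) / len(values)


# Toy data (invented for illustration only): three rooted topologies on taxa A-D.
AB: Clade = frozenset({"A", "B"})
BC: Clade = frozenset({"B", "C"})
CD: Clade = frozenset({"C", "D"})
ABC: Clade = frozenset({"A", "B", "C"})

T1: Topology = frozenset({AB, ABC})  # ((A,B),C),D
T2: Topology = frozenset({AB, CD})   # (A,B),(C,D)
T3: Topology = frozenset({BC, ABC})  # (A,(B,C)),D

# Two hypothetical GTM samples for four genes.
gtm_samples = [
    [T1, T1, T2, T3],
    [T1, T2, T2, T2],
]

cf_first = sample_wide_cf(gtm_samples[0], AB)
cf_mean = posterior_mean_cf(gtm_samples, AB)
print(cf_first, cf_mean)
```

In this toy example, clade {A, B} appears in the trees assigned to 3 of 4 genes in the first map and to all 4 genes in the second, so the two values printed are 0.75 and 0.875. A credibility interval could be read off the same per-sample values in place of the mean.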
Key Words: gene genealogy; concordance; Bayesian phylogenetics; total evidence; consensus methods; Dirichlet process
Edward Holmes, Associate Editor





