MBE Advance Access published online on May 7, 2007
Molecular Biology and Evolution, doi:10.1093/molbev/msm081
Research Article
Fair-Balance Paradox, Star-Tree Paradox and Bayesian Phylogenetics
Department of Biology, Galton Laboratory, University College London, Darwin Building, Gower Street, London WC1E 6BT, England
Correspondence to: Ziheng Yang, Department of Biology, University College London, Darwin Building, Gower Street, London WC1E 6BT, England, Tel: +44 (20) 7679 4379, Fax: +44 (20) 7679 7096, Email: z.yang{at}ucl.ac.uk
Received for publication November 7, 2006. Revision received March 16, 2007. Accepted for publication April 18, 2007.
The star-tree paradox refers to the conjecture that the posterior probabilities for the three unrooted trees for four species (or the three rooted trees for three species if the molecular clock is assumed) do not approach 1/3 when the data are generated using the star tree and when the amount of data approaches infinity. It reflects the more general phenomenon of high and presumably spurious posterior probabilities for trees or clades produced by the Bayesian method of phylogenetic reconstruction, and is perceived to be a manifestation of the deeper problem of the extreme sensitivity of Bayesian model selection to the prior on parameters. Analysis of the star-tree paradox has been hampered by the intractability of the integrals involved. In this paper, I use Laplacian expansion to approximate the posterior probabilities for the three rooted trees for three species using binary characters evolving at a constant rate. The approximation enables calculation of posterior tree probabilities for arbitrarily large data sets. Both theoretical analysis of the analogous fair-coin and fair-balance problems and computer simulation for the tree problem confirmed the existence of the star-tree paradox. When the data size n → ∞, the posterior tree probabilities do not converge to 1/3 each, but vary among data sets according to a statistical distribution. This distribution is characterized. Two strategies for resolving the star-tree paradox are explored: (i) a nonzero prior probability for the degenerate star tree and (ii) an increasingly informative prior forcing the internal branch length towards zero. Both appear to be effective in resolving the paradox, while the latter is simpler to implement. The posterior tree probabilities are found to be very sensitive to the prior.
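The non-convergence described above can be illustrated with a small simulation of a fair-balance-style problem. This is only a sketch under assumptions that are not taken from the article: observations are N(μ, 1) with true μ = 0, the prior is μ ~ N(0, prior_sd²), and the two competing hypotheses are μ < 0 and μ > 0. The function names and the 0.95 cutoff are likewise hypothetical. Under these assumptions the posterior probability of μ > 0 has a closed form, and across replicate data sets it stays spread over (0, 1) rather than settling at 1/2 as n grows.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def posterior_prob_positive(xbar, n, prior_sd=1.0):
    """P(mu > 0 | data) for x_i ~ N(mu, 1) and prior mu ~ N(0, prior_sd^2).

    The posterior of mu is normal with precision n + 1/prior_sd^2
    and mean n * xbar / precision (standard conjugate result).
    """
    precision = n + 1.0 / prior_sd ** 2
    post_mean = n * xbar / precision
    post_sd = 1.0 / math.sqrt(precision)
    return normal_cdf(post_mean / post_sd)


def simulate(n, replicates=2000, prior_sd=1.0, seed=1):
    """Posterior probabilities for replicate data sets generated with mu = 0."""
    rng = random.Random(seed)
    probs = []
    for _ in range(replicates):
        # The mean of n draws from N(0, 1) is N(0, 1/n), so sample it directly.
        xbar = rng.gauss(0.0, 1.0 / math.sqrt(n))
        probs.append(posterior_prob_positive(xbar, n, prior_sd))
    return probs


def fraction_extreme(probs, cutoff=0.95):
    """Share of data sets in which one hypothesis gets posterior > cutoff."""
    return sum(p > cutoff or p < 1.0 - cutoff for p in probs) / len(probs)


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        probs = simulate(n)
        print(n, round(fraction_extreme(probs), 3))
```

In this toy setting the share of data sets with strong support for one side does not shrink as n increases, which mirrors the behaviour the abstract reports for the three trees. It does not reproduce the article's Laplacian approximation or its tree model.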
Key Words: Lindley's paradox; fair-balance paradox; star-tree paradox; prior clade probabilities



