
Mol. Biol. Evol. 13(7):999-1011. 1996
© 1996 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038

On the Interpretation of Bootstrap Trees: Appropriate Threshold of Clade Selection and Induced Gain

Vincent Berry and Olivier Gascuel

LIRMM, UMR 9928 Universite Montpellier II/CNRS, 161, Rue Ada, 34392 Montpellier cedex 5, France

E-mail: gascuel{at}lirmm.fr.

In this study we address the problem of interpreting a bootstrap tree. The main issue is choosing the threshold of clade selection that separates reliable clades from unreliable ones on the basis of their bootstrap proportion. This threshold depends on the chosen error measure. We investigate error measures that stem from a generalization of Robinson and Foulds' (1981) distance, used to quantify the divergence between the true phylogeny and the estimated trees. We propose two analytical approximations of the optimum threshold of clade selection to interpret (i.e., reduce) the bootstrap tree. We performed extensive simulations along the lines of Kuhner and Felsenstein (1994) using the neighbor-joining and the maximum-parsimony methods. These simulations show that our approximations cause only small losses in quality when compared to the optimum threshold resulting from empirical observation. Next, we measured the error reduction achieved when the true phylogeny is estimated by the properly reduced bootstrap tree rather than by the complete original tree obtained with a classical tree-building method. Our simulations on short sequences show that an error reduction of 39% is achieved with the parsimony method and an error reduction of 33% is achieved with the distance method when the error is measured with the standard Robinson and Foulds distance. The observed error reduction is shown to originate from a substantial decrease in Type I error (wrong inferences), while Type II error (omitted correct clades) is only slightly increased. Greater error reduction is achieved when shorter sequences are used, and when more importance is given to Type I error than to Type II error. To investigate the causes of error from another point of view, we propose a general decomposition of the error expectation into two bias terms and one variance term. Results for these terms show that no fundamental bias is introduced by the bootstrap process, the only source of bias being structural (lack of resolution). Moreover, the variance in the estimations is greatly reduced, providing another explanation for the better results of the reduced bootstrap tree compared with the original tree estimate.
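
The notions used above (reducing a bootstrap tree at a threshold, then counting Type I and Type II errors against the true phylogeny) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the weights `w1` and `w2`, the five-taxon example, and its bootstrap proportions are all hypothetical, and the weighted sum is only one plausible reading of a "generalization" of the Robinson and Foulds distance, whose exact form the abstract does not give. With `w1 = w2 = 1` the measure reduces to the usual symmetric-difference count.

```python
from typing import Dict, FrozenSet, Set, Tuple

Clade = FrozenSet[str]  # a clade is represented by the set of taxa it contains


def reduce_bootstrap_tree(support: Dict[Clade, float], threshold: float) -> Set[Clade]:
    """Keep only the clades whose bootstrap proportion reaches the threshold.

    Note: this sketch does not check that the kept clades are mutually
    compatible; clades with support above 0.5 always are.
    """
    return {clade for clade, proportion in support.items() if proportion >= threshold}


def clade_errors(true_clades: Set[Clade], estimated: Set[Clade]) -> Tuple[int, int]:
    """Return (Type I, Type II): wrong clades inferred, correct clades omitted."""
    type_one = len(estimated - true_clades)
    type_two = len(true_clades - estimated)
    return type_one, type_two


def weighted_error(true_clades: Set[Clade], estimated: Set[Clade],
                   w1: float = 1.0, w2: float = 1.0) -> float:
    """Assumed weighted error: w1 * Type I + w2 * Type II."""
    type_one, type_two = clade_errors(true_clades, estimated)
    return w1 * type_one + w2 * type_two


# Hypothetical true phylogeny (((A,B),C),(D,E)) given by its non-trivial clades.
TRUE_CLADES: Set[Clade] = {
    frozenset("AB"), frozenset("ABC"), frozenset("DE"),
}

# Hypothetical bootstrap proportions for clades seen in the replicates.
SUPPORT: Dict[Clade, float] = {
    frozenset("AB"): 0.95,
    frozenset("CD"): 0.60,   # a wrong clade with moderate support
    frozenset("ABC"): 0.35,
    frozenset("DE"): 0.38,
}

for threshold in (0.5, 0.7):
    kept = reduce_bootstrap_tree(SUPPORT, threshold)
    t1, t2 = clade_errors(TRUE_CLADES, kept)
    print(threshold, t1, t2, weighted_error(TRUE_CLADES, kept))
```

In this toy example, raising the threshold from 0.5 to 0.7 drops the wrong clade {C, D} without omitting any additional correct clade, which is the kind of trade-off between Type I and Type II error that the choice of threshold controls.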

Key Words: bootstrap method • threshold of clade selection • topological distance • Type I and Type II error • bias/variance compromise • maximum parsimony • neighbor joining • computer simulations




