Skip Navigation


MBE Advance Access originally published online on April 17, 2009
Molecular Biology and Evolution 2009 26(7):1641-1650; doi:10.1093/molbev/msp077
This Article
Right arrow Full Text Freely available
Right arrow FREE Full Text (PDF) Freely available
Right arrow Supplementary Data
Right arrowOA All Versions of this Article:
26/7/1641    most recent
msp077v1
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Google Scholar
Right arrow Articles by Price, M. N.
Right arrow Articles by Arkin, A. P.
PubMed
Right arrow PubMed Citation
Right arrow Articles by Price, M. N.
Right arrow Articles by Arkin, A. P.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

© 2009 The Authors
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.


Research Articles

FastTree: Computing Large Minimum Evolution Trees with Profiles instead of a Distance Matrix

Morgan N. Price*,{dagger}

Paramvir S. Dehal*,{dagger}

Adam P. Arkin*,{dagger},{ddagger}

* Physical Biosciences Division, Lawrence Berkeley National Laboratory
{dagger} Virtual Institute of Microbial Stress and Survival, Lawrence Berkeley National Laboratory
{ddagger} Department of Bioengineering, University of California, Berkeley

E-mail: morgannprice{at}yahoo.com.

Accepted for publication April 11, 2009.

Gene families are growing rapidly, but standard methods for inferring phylogenies do not scale to alignments with over 10,000 sequences. We present FastTree, a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, FastTree stores sequence profiles of internal nodes in the tree. FastTree uses these profiles to implement Neighbor-Joining and uses heuristics to quickly identify candidate joins. FastTree then uses nearest neighbor interchanges to reduce the length of the tree. For an alignment with N sequences, L sites, and a different characters, a distance matrix requires O(N2) space and O(N2L) time, but FastTree requires just O(NLa + NFormula ) memory and O(NFormula log (N)La) time. To estimate the tree's reliability, FastTree uses local bootstrapping, which gives another 100-fold speedup over a distance matrix. For example, FastTree computed a tree and support values for 158,022 distinct 16S ribosomal RNAs in 17 h and 2.4 GB of memory. Just computing pairwise Jukes–Cantor distances and storing them, without inferring a tree or bootstrapping, would require 17 h and 50 GB of memory. In simulations, FastTree was slightly more accurate than Neighbor-Joining, BIONJ, or FastME; on genuine alignments, FastTree's topologies had higher likelihoods. FastTree is available at http://microbesonline.org/fasttree.

Key Words: minimum evolution • Neighbor-Joining • large phylogenies


While this paper was under review, we implemented tree-comparison in O(N) space and approximately O(N) time(http://www.microbesonline.org/fasttree/ treecmp.html). This makes it possible to use the traditional bootstrap with tens of thousands of sequences.

Koichiro Tamura, Associate Editor


Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?


This article has been cited by other articles:


Home page
Nucleic Acids ResHome page
R. D. Finn, J. Mistry, J. Tate, P. Coggill, A. Heger, J. E. Pollington, O. L. Gavin, P. Gunasekaran, G. Ceric, K. Forslund, et al.
The Pfam protein families database
Nucleic Acids Res., November 17, 2009; (2009) gkp985v1.
[Abstract] [Full Text] [PDF]



Disclaimer: Please note that abstracts for content published before 1996 were created through digital scanning and may therefore not exactly replicate the text of the original print issues. All efforts have been made to ensure accuracy, but the Publisher will not be held responsible for any remaining inaccuracies. If you require any further clarification, please contact our Customer Services Department.