MBE Advance Access originally published online on May 21, 2004

Mol. Biol. Evol. 21(9):1629-1642. 2004
DOI: 10.1093/molbev/msh159
© 2004 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038

On Inconsistency of the Neighbor-Joining, Least Squares, and Minimum Evolution Estimation When Substitution Processes Are Incorrectly Modeled

Edward Susko*, Yuji Inagaki† and Andrew J. Roger†

* Genome Atlantic, Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
† Genome Atlantic, Canadian Institute for Advanced Research, Program in Evolutionary Biology, Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, Canada

E-mail: susko@mathstat.dal.ca.

Using analytical methods, we show that under a variety of model misspecifications, Neighbor-Joining, minimum evolution, and least squares estimation procedures are statistically inconsistent. Failure to correctly account for differing rates-across-sites processes, failure to correctly model rate matrix parameters, and failure to adjust for parallel rates-across-sites changes (a rates-across-subtrees process) are all shown to lead to a "long branch attraction" form of inconsistency. In addition, failure to account for rates-across-sites processes is shown to result in underestimation of evolutionary distances for a wide variety of substitution models, generalizing an earlier analytical result for the Jukes-Cantor model reported in Golding (Mol. Biol. Evol. 1:125–142, 1983) and a similar bias result for the GTR or REV model in Kelly and Rice (1996). Although standard rates-across-sites models can be employed in many of these cases to restore consistency, current models cannot account for other kinds of misspecification. We examine an idealized but biologically relevant case in which parallel changes in rates at sites across subtrees are shown to give rise to inconsistency. This changing rates-across-subtrees type of model misspecification cannot be adjusted for with conventional methods or without carefully considering the rate variation in the larger tree. The results are presented for four-taxon trees, but the expectation is that they have implications for larger trees as well. To illustrate this, a simulated 42-taxon example is given in which the microsporidia, an enigmatic group of eukaryotes, are incorrectly placed at the archaebacteria-eukaryotes split because of incorrectly specified pairwise distances. The analytical nature of the results lends insight into the reasons that long branch attraction tends to be a common form of inconsistency and the reasons that other forms of inconsistency, like "long branches repel," can arise in some settings.
In many of the cases of inconsistency presented, a particular incorrect topology is estimated with probability converging to one, the implication being that measures of uncertainty like bootstrap support will be unable to detect that there is a problem with the estimation. The focus is on distance methods, but previous simulation results suggest that the zones of inconsistency for distance methods contain the zones of inconsistency for maximum likelihood methods as well.
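
The two effects described above, distance underestimation and long branch attraction, can be illustrated with a small numerical sketch. It is not taken from the article. It assumes the simplest setting: sites evolve under the Jukes-Cantor model with gamma-distributed rates (mean 1, shape alpha), while distances are estimated with the equal-rates Jukes-Cantor formula. The branch lengths, the shape value, and all function names are hypothetical choices made for illustration, and the four-point comparison of pairwise sums is used here as a stand-in for the distance criteria on four taxa.

```python
import math

def jc_gamma_expected_p(d, alpha):
    """Expected proportion of differing sites at true distance d under
    Jukes-Cantor with gamma-distributed site rates (mean 1, shape alpha)."""
    return 0.75 * (1.0 - (1.0 + 4.0 * d / (3.0 * alpha)) ** (-alpha))

def jc_distance(p):
    """Equal-rates Jukes-Cantor distance (the misspecified estimator)."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def limiting_estimate(d, alpha):
    """Value the equal-rates estimate converges to as sequence length grows."""
    return jc_distance(jc_gamma_expected_p(d, alpha))

def four_taxon_distances(long_len, short_len, internal_len):
    """True path lengths on the tree ((A,B),(C,D)), with A and C on long
    terminal branches and B and D on short ones (illustrative values)."""
    leaf = {"A": long_len, "B": short_len, "C": long_len, "D": short_len}
    side = {"A": 0, "B": 0, "C": 1, "D": 1}
    dist = {}
    for x in "ABCD":
        for y in "ABCD":
            if x < y:
                crosses = internal_len if side[x] != side[y] else 0.0
                dist[(x, y)] = leaf[x] + leaf[y] + crosses
    return dist

def best_split(dist):
    """Four-point comparison: pick the pairing with the smallest distance sum."""
    sums = {
        "AB|CD": dist[("A", "B")] + dist[("C", "D")],
        "AC|BD": dist[("A", "C")] + dist[("B", "D")],
        "AD|BC": dist[("A", "D")] + dist[("B", "C")],
    }
    return min(sums, key=sums.get)

if __name__ == "__main__":
    alpha = 0.5  # hypothetical gamma shape; smaller means stronger rate variation
    true_dist = four_taxon_distances(long_len=2.0, short_len=0.05, internal_len=0.05)
    est_dist = {pair: limiting_estimate(d, alpha) for pair, d in true_dist.items()}
    for pair in sorted(true_dist):
        print(pair, round(true_dist[pair], 4), round(est_dist[pair], 4))
    print("split from true distances:     ", best_split(true_dist))
    print("split from estimated distances:", best_split(est_dist))
```

Under these assumptions the limiting estimate has the closed form (3/4) alpha ln(1 + 4d/(3 alpha)), which lies below d for every d > 0 and falls further below it as d grows. Because long paths are shortened disproportionately, the pairing that groups the two long branches can attain the smallest sum even though the true distances favor the correct split.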

Key Words: inconsistency • rates across sites • distance methods • Neighbor-Joining • molecular evolution • phylogenetics


This article has been cited by other articles:

J. Kim and M. J. Sanderson
Penalized Likelihood Phylogenetic Inference: Bridging the Parsimony-Likelihood Gap
Syst Biol, October 1, 2008; 57(5): 665-674.

H. Birin, Z. Gal-Or, I. Elias, and T. Tuller
Inferring horizontal transfers in the presence of rearrangements by the minimum evolution criterion
Bioinformatics, March 15, 2008; 24(6): 826-832.

M. Spencer, D. Bryant, and E. Susko
Conditioned Genome Reconstruction: How to Avoid Choosing the Conditioning Genome
Syst Biol, February 1, 2007; 56(1): 25-43.

V. Ruano-Rubio and M. A. Fares
Artifactual Phylogenies Caused by Correlated Distribution of Substitution Rates among Sites and Lineages: The Good, the Bad, and the Ugly
Syst Biol, February 1, 2007; 56(1): 68-82.

H.-C. Wang, M. Spencer, E. Susko, and A. J. Roger
Testing for Covarion-like Evolution in Protein Sequences
Mol. Biol. Evol., January 1, 2007; 24(1): 294-305.

K. Shalchian-Tabrizi, M. Skanseng, F. Ronquist, D. Klaveness, T. R. Bachvaroff, C. F. Delwiche, A. Botnen, T. Tengs, and K. S. Jakobsen
Heterotachy Processes in Rhodophyte-Derived Secondhand Plastid Genes: Implications for Addressing the Origin and Evolution of Dinoflagellate Plastids
Mol. Biol. Evol., August 1, 2006; 23(8): 1504-1515.

A. J. Roger and L. A. Hug
The origin and diversification of eukaryotes: problems with molecular phylogenetics and molecular clock estimation
Phil Trans R Soc B, June 29, 2006; 361(1470): 1039-1054.

P. Lockhart, P. Novis, B. G. Milligan, J. Riden, A. Rambaut, and T. Larkum
Heterotachy and Tree Building: A Case Study with Plastids and Eubacteria
Mol. Biol. Evol., January 1, 2006; 23(1): 40-45.

E. Belda, A. Moya, and F. J. Silva
Genome Rearrangement Distances and Gene Order Phylogeny in γ-Proteobacteria
Mol. Biol. Evol., June 1, 2005; 22(6): 1456-1467.

M. Spencer, E. Susko, and A. J. Roger
Likelihood, Parsimony, and Heterogeneous Evolution
Mol. Biol. Evol., May 1, 2005; 22(5): 1161-1164.
