Molecular Biology and Evolution, Vol 13, 1255-1265, Copyright © 1996 by Society for Molecular Biology and Evolution
A Rzhetsky and T Sitnikova
The choice of an "optimal" mathematical model for computing evolutionary
distances from real sequences is not currently supported by easy-to-use
software applicable to large data sets, and an investigator frequently
selects one of the simplest models available. Here we study properties of
the observed proportion of differences (p-distance) between sequences as
an estimator of evolutionary distance for tree-making. We show that
p-distances allow for consistent tree-making with any of the popular
methods working with evolutionary distances if evolution of sequences obeys
a "molecular clock" (more precisely, if it follows a stationary
time-reversible Markov model of nucleotide substitution). Next, we show
that p-distances seem to be efficient in recovering the correct tree
topology under a "molecular clock," but produce "statistically supported"
wrong trees when substitution rates vary among evolutionary lineages.
Finally, we outline a practical approach for selecting an "optimal" model
of nucleotide substitution in a real data analysis, and obtain a crude
estimate of a "prior" distribution of the expected tree branch lengths
under the Jukes-Cantor model. We conclude that the use of a model that is
obviously oversimplified is inadvisable unless it is justified by a
preliminary analysis of the real sequences.
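For readers unfamiliar with the two quantities named in the abstract, the following is a minimal sketch of how a p-distance and its Jukes-Cantor correction are commonly computed for a pair of aligned nucleotide sequences. It is not the authors' software: the function names `p_distance` and `jukes_cantor_distance` and the example sequences are hypothetical, and the choice to skip sites containing gaps or ambiguous characters is an assumption made here for illustration.

```python
import math


def p_distance(seq_a, seq_b):
    """Observed proportion of differing sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: only sites where both sequences carry A, C, G or T are compared.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    differences = sum(a != b for a, b in pairs)
    return differences / len(pairs)


def jukes_cantor_distance(p):
    """Jukes-Cantor estimate d = -(3/4) * ln(1 - 4p/3), defined only for p < 3/4."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical aligned sequences differing at 2 of 10 sites.
p = p_distance("ACGTACGTAC", "ACGTTCGTAA")
d = jukes_cantor_distance(p)
print(f"p-distance = {p:.3f}, Jukes-Cantor distance = {d:.3f}")
```

The corrected distance is always at least as large as the p-distance, because the correction accounts for multiple substitutions at the same site; the gap between the two widens as sequences diverge.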
ORIGINAL ARTICLE
When is it safe to use an oversimplified substitution model in tree-making?
Institute of Molecular Evolutionary Genetics, Pennsylvania State University, University Park 16802, USA.


