Molecular Biology and Evolution, Vol 14, 287-298, Copyright © 1997 by Society for Molecular Biology and Evolution
NJ Tourasse and M Gouy
The rate of evolution of macromolecules such as ribosomal RNAs and proteins
varies along the molecule because structural and functional constraints
differ between sites. Many studies have shown that ignoring this variation
in computing evolutionary distances leads to severe underestimation of
sequence divergences, and thus can lead to misleading evolutionary tree
inferences. We propose here a new parsimony-based method for computing
evolutionary distances between pairs of sequences that takes into account
this variation and estimates it from the data. This method applies to the
number of substitutions per site in ribosomal RNA genes as well as to the
number of nonsynonymous substitutions per codon for protein-coding genes
and is especially suitable when large data sets (≥ 100 sequences)
are analyzed. First, starting from a phylogeny constructed with usual
distances, the maximum-parsimony method is used to infer the distribution
of the number of substitutions that have occurred at each site (or codon)
along this tree. This distribution is then fitted to an "invariant +
truncated negative binomial" distribution that allows for invariant sites.
Maximum-likelihood fitting of this distribution to different data sets
showed that it agreed very well with real data. Notably, allowing for
invariant sites seemed to be very important. Finally, two distance
estimates were developed by introducing the distribution of site
variability into the substitution models of Jukes and Cantor and of Kimura.
The use of different numbers of aligned sequences (up to 1,000 rRNA
sequences) showed that the parameters of the model are very sensitive to
the number of sequences used to estimate them. However, if at least 100
sequences are considered, the two new distance estimates are quite stable
with respect to the number of sequences used to fit the distribution. This
stability is true for low as well as for high evolutionary distances. These
new distances appeared to be much better estimates of the number of
substitutions per site than the classical distances of Jukes and Cantor and
of Kimura, which both greatly underestimate this number, so that they can
serve as indexes to detect saturation. We conclude that the new distances
are particularly suitable for phylogenetic analysis when very distantly
related species and relatively large data sets are considered. Trees
reconstructed using these distances are generally different from those
constructed by means of the classical estimates. Using this new method, we
showed that the mean evolutionary distance between Prokaryotes and
Eukaryotes is substantially higher for the small-subunit than for the
large-subunit rRNAs. This suggests that the former might have experienced a
drastic change during the early evolution of Eukaryotes.
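
The effect of site-to-site rate variation on a Jukes-Cantor distance can be illustrated with a minimal sketch. The abstract does not give the authors' exact formulas, so the code below uses the standard textbook correction for gamma-distributed rates plus a fraction of invariant sites, which is the continuous counterpart of a negative binomial distribution of substitution counts. The function names and the parameters `alpha` (gamma shape) and `theta` (fraction of invariant sites) are assumptions made for illustration; in the method described above these quantities would be estimated from the parsimony-inferred distribution rather than supplied by hand.

```python
import math


def jc_distance(p):
    """Classical Jukes-Cantor distance from the observed proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the JC correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc_distance_gamma_inv(p, alpha, theta):
    """JC distance assuming gamma-distributed rates (shape alpha, mean 1)
    at variable sites and a fraction theta of invariant sites.

    This is the standard gamma + invariant-sites correction, used here as an
    illustrative stand-in; it is not taken from the paper itself.
    """
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    if not 0.0 <= theta < 1.0:
        raise ValueError("theta must lie in [0, 1)")
    # Observed differences can only occur at variable sites.
    p_var = p / (1.0 - theta)
    if not 0.0 <= p_var < 0.75:
        raise ValueError("sequences are saturated under these parameters")
    # Expected substitutions per variable site under gamma rate variation.
    d_var = 0.75 * alpha * ((1.0 - 4.0 * p_var / 3.0) ** (-1.0 / alpha) - 1.0)
    # Average over all sites, including the invariant ones.
    return (1.0 - theta) * d_var


if __name__ == "__main__":
    p = 0.3  # hypothetical observed proportion of differing sites
    print("classical JC:", jc_distance(p))
    print("gamma + invariant:", jc_distance_gamma_inv(p, alpha=0.5, theta=0.2))
```

With these hypothetical inputs the classical estimate is about 0.38 substitutions per site, whereas the corrected estimate is about 0.9, which illustrates the underestimation described above when rate variation is ignored.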
ORIGINAL ARTICLE
Evolutionary distances between nucleotide sequences based on the distribution of substitution rates among sites as estimated by parsimony
Laboratoire de Biométrie, Génétique et Biologie des Populations, Université Claude Bernard, France.




