MBE Advance Access published online on December 8, 2004

Molecular Biology and Evolution, doi:10.1093/molbev/msi065

Molecular Biology and Evolution © Society for Molecular Biology and Evolution 2004; all rights reserved.
Accepted December 6, 2004

Research Article

The Impact of Taxon Sampling on the Estimation of Rates of Evolution at Sites

C. Blouin 1*, D. Butt 2, and A.J. Roger 3

1 Genome Atlantic, Dept. of Biochemistry and Molecular Biology, Dalhousie University, Canada, 5850 University Ave., B3H 1X5; Faculty of Computer Science, Dalhousie University, Canada, 6050 University Ave., B3H 1W5
2 Faculty of Computer Science, Dalhousie University, Canada, 6050 University Ave., B3H 1W5
3 Genome Atlantic, Dept. of Biochemistry and Molecular Biology, Dalhousie University, Canada, 5850 University Ave., B3H 1X5; Canadian Institute for Advanced Research, Program in Evolutionary Biology


Abstract

The function of individual sites within a protein influences their rate of accepted point mutations. During the computation of phylogenetic likelihoods, rate heterogeneity can be modeled on a site-by-site basis with relative rates drawn from a discretized Γ-distribution. Site-rate estimates (e.g., the rate of highest posterior probability given the data at a site) can then be used as a measure of the evolutionary constraints imposed by function. However, if sequence availability is limited, the estimation of rates is subject to sampling error. This paper presents a simulation study that evaluates the robustness of evolutionary site-rate estimates for both small and phylogenetically unbalanced samples. The sampling error on rate estimates was first evaluated for alignments of 5 to 45 sequences, sampled by jackknifing from a master alignment containing 968 sequences. We observed that the potentially enhanced resolution among site rates gained by including a larger number of rate categories is negated by the difficulty of correctly estimating intermediate rates. This effect is marked for datasets with fewer than 30 sequences. Although the computation of likelihood theoretically accounts for phylogenetic distances through branch lengths, the introduction of a single long-branch outlier sequence had a significant negative effect on site-rate estimates. Finally, a shift in the rate of evolution between related lineages can be diagnostic of a gain or loss of function within a protein family. Our analyses indicate that detecting these rate shifts is a harder problem than estimating rates, partly because the difference in rates depends on two rate estimates, each with its own intrinsic uncertainty. The performance of four methods for detecting these site-rate shifts is evaluated and compared. Guidelines are suggested for preparing datasets that are minimally influenced by the error introduced by sequence sampling.

Keywords: protein; evolutionary rate; functional divergence; maximum likelihood; simulation.
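
The jackknife sub-sampling step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the placeholder sequence identifiers, the number of replicates, and the step between sample sizes are all assumptions made for illustration. Only the master alignment size (968) and the sample-size range (5 to 45) come from the abstract.

```python
import random


def jackknife_subsample(sequence_ids, n_keep, rng):
    """Draw n_keep sequence IDs without replacement from the master set."""
    if not 0 < n_keep <= len(sequence_ids):
        raise ValueError("n_keep must be between 1 and the number of sequences")
    return rng.sample(sequence_ids, n_keep)


def jackknife_replicates(sequence_ids, sizes, n_replicates, seed=0):
    """Build n_replicates independent sub-samples for each sample size."""
    rng = random.Random(seed)  # fixed seed so the sub-samples are reproducible
    return {
        size: [jackknife_subsample(sequence_ids, size, rng)
               for _ in range(n_replicates)]
        for size in sizes
    }


# Hypothetical identifiers standing in for the 968 sequences of the master alignment.
master_ids = [f"seq{i:03d}" for i in range(968)]

# Sample sizes 5, 15, 25, 35, 45; the step of 10 and the 3 replicates are assumptions.
replicates = jackknife_replicates(master_ids, sizes=range(5, 50, 10), n_replicates=3)
```

Each sub-sample would then be used to extract the corresponding rows of the master alignment before site rates are re-estimated, so that the spread of the estimates across replicates reflects the sampling error for that sample size.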