Molecular Biology and Evolution 18:481-490 (2001)
© 2001 Society for Molecular Biology and Evolution
ARTICLE
Models of Sequence Evolution for DNA Sequences Containing Gaps
Department of Applied Statistics, University of Reading, Reading, England
Most evolutionary tree estimation methods for DNA sequences ignore or inefficiently use the phylogenetic information contained within shared patterns of gaps. This is largely due to the computational difficulties in implementing models for insertions and deletions. A simple way to incorporate this information is to treat a gap as a fifth character (with the four nucleotides being the other four) and to incorporate it within a Markov model of nucleotide substitution. This idea has been dismissed in the past, since it treats a multiple-site insertion or deletion as a sequence of independent events rather than a single event. While this is true, we have found that under many circumstances it is better to incorporate gap information inadequately than to ignore it, at least for topology estimation. We propose an extension to a class of nucleotide substitution models to incorporate the gap character and show that, for data sets (both real and simulated) with short and medium gaps, these models do lead to effective use of the information contained within insertions and deletions. We also implement an ad hoc method in which the likelihood at columns containing multiple-site gaps is downweighted in order to avoid giving them undue influence. The precision of the estimated tree, assessed using Markov chain Monte Carlo techniques to find the posterior distribution over tree space, improves under these five-state models compared with standard methods which effectively ignore gaps.
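The five-state idea can be illustrated with a minimal sketch. It assumes the simplest possible case, an equal-rate (Jukes-Cantor-style) model in which every change between the five states, gap included, occurs at the same rate `mu`. This is an illustrative assumption, not the model class extended in the article. The function names, the rate `mu`, and the two-sequence likelihood are hypothetical choices made only for this example.

```python
import math

# Four nucleotides plus the gap, treated as a fifth character state.
STATES = "ACGT-"


def transition_prob(i, j, t, mu=1.0):
    """P(state j after time t | state i at time 0) under an equal-rate model.

    Assumption: every off-diagonal rate equals mu, so the closed form for a
    k-state symmetric Markov chain applies (here k = 5).
    """
    if i not in STATES or j not in STATES:
        raise ValueError("states must be one of " + STATES)
    k = len(STATES)
    decay = math.exp(-k * mu * t)
    if i == j:
        return 1.0 / k + (k - 1.0) / k * decay
    return 1.0 / k - decay / k


def pairwise_log_likelihood(seq_a, seq_b, t, mu=1.0):
    """Log-likelihood of two aligned sequences separated by branch length t.

    Each column is treated as independent, so a gap spanning several sites
    is counted as several separate events, which is the simplification the
    abstract acknowledges.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    stationary = 1.0 / len(STATES)  # uniform under the equal-rate assumption
    return sum(
        math.log(stationary * transition_prob(a, b, t, mu))
        for a, b in zip(seq_a, seq_b)
    )
```

Under this sketch a column in which both sequences share a gap contributes exactly as much to the likelihood as a column in which they share a nucleotide, whereas a column pairing a gap with a nucleotide is penalized like a substitution. That is how shared gap patterns come to carry phylogenetic signal in a five-state model.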