

MBE Advance Access originally published online on December 5, 2003

Mol. Biol. Evol. 21(3):468-488. 2004
DOI: 10.1093/molbev/msh039
© 2004 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038

Phylogenetic Estimation of Context-Dependent Substitution Rates by Maximum Likelihood

Adam Siepel* and David Haussler*,†

* Center for Biomolecular Science and Engineering, University of California, Santa Cruz
† Howard Hughes Medical Institute, University of California, Santa Cruz

E-mail: acs@soe.ucsc.edu.

Nucleotide substitution in both coding and noncoding regions is context-dependent, in the sense that substitution rates depend on the identity of neighboring bases. Context-dependent substitution has been modeled in the case of two sequences and an unrooted phylogenetic tree, but it has only been accommodated in limited ways with more general phylogenies. In this article, extensions are presented to standard phylogenetic models that allow for better handling of context-dependent substitution, yet still permit exact inference at reasonable computational cost. The new models improve goodness of fit substantially for both coding and noncoding data. Considering context dependence leads to much larger improvements than does using a richer substitution model or allowing for rate variation across sites, under the assumption of site independence. The observed improvements appear to derive from three separate properties of the models: their explicit characterization of context-dependent substitution within N-tuples of adjacent sites, their ability to accommodate overlapping N-tuples, and their rich parameterization of the substitution process. Parameter estimation is accomplished using an expectation maximization algorithm, with a quasi-Newton algorithm for the maximization step; this approach is shown to be preferable to ordinary Newton methods for parameter-rich models. Overlapping tuples are efficiently handled by assuming Markov dependence of the observed bases at each site on those at the N - 1 preceding sites, and the required conditional probabilities are computed with an extension of Felsenstein's algorithm. Estimated substitution rates based on a data set of about 160,000 noncoding sites in mammalian genomes indicate a pronounced CpG effect, but they also suggest a complex overall pattern of context-dependent substitution, comprising a variety of subtle effects. 
Estimates based on about 3 million sites in coding regions demonstrate that amino acid substitution rates can be learned at the nucleotide level, and suggest that context effects across codon boundaries are significant.
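
The handling of overlapping tuples described above can be illustrated with a small sketch. It is not the authors' implementation: it assumes N = 2 (dinucleotide states), a toy rate matrix with a single CpG-like boost, a uniform distribution at the root, and a made-up three-species tree with arbitrary branch lengths. All function and variable names are hypothetical, and the matrix exponential is a plain truncated Taylor series chosen to keep the example dependency-free. The conditional probability of an alignment column given the preceding column is taken as the ratio of two runs of Felsenstein's pruning algorithm: one on the observed pair of columns, and one in which the second column is treated as missing.

```python
import functools
import itertools

BASES = "ACGT"
PAIRS = ["".join(p) for p in itertools.product(BASES, repeat=2)]  # 16 dinucleotide states
IDX = {p: i for i, p in enumerate(PAIRS)}
N = len(PAIRS)

def rate_matrix(cpg_boost=5.0):
    """Toy rate matrix (an assumption): rate 1 per single-base change, CG->TG and CG->CA boosted."""
    Q = [[0.0] * N for _ in range(N)]
    for a in PAIRS:
        for b in PAIRS:
            if sum(x != y for x, y in zip(a, b)) == 1:
                boosted = a == "CG" and b in ("TG", "CA")
                Q[IDX[a]][IDX[b]] = cpg_boost if boosted else 1.0
        Q[IDX[a]][IDX[a]] = -sum(Q[IDX[a]])  # rows of a rate matrix sum to zero
    return tuple(map(tuple, Q))  # hashable, so expm can be cached

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in A]

@functools.lru_cache(maxsize=None)
def expm(Q, t, squarings=10, terms=12):
    """P(t) = exp(Qt) by a truncated Taylor series with scaling and squaring."""
    M = [[q * t / 2 ** squarings for q in row] for row in Q]
    P = [[float(i == j) for j in range(N)] for i in range(N)]
    term = [row[:] for row in P]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in matmul(term, M)]
        P = [[p + v for p, v in zip(pr, tr)] for pr, tr in zip(P, term)]
    for _ in range(squarings):
        P = matmul(P, P)
    return P

def likelihood(tree, obs, Q):
    """Felsenstein pruning over pair states; '*' in a leaf pattern means 'missing'."""
    def partial(node):
        if isinstance(node, str):  # leaf: indicator of states compatible with the data
            return [float(all(c in ("*", s) for c, s in zip(obs[node], state)))
                    for state in PAIRS]
        L = [1.0] * N
        for child, t in node:
            P, Lc = expm(Q, t), partial(child)
            L = [L[s] * sum(p * l for p, l in zip(P[s], Lc)) for s in range(N)]
        return L
    return sum(partial(tree)) / N  # uniform root distribution (an assumption)

def conditional(tree, prev_col, cur_col, Q):
    """P(current column | previous column) as a ratio of two pruning passes."""
    joint = {leaf: prev_col[leaf] + cur_col[leaf] for leaf in cur_col}
    marginal = {leaf: prev_col[leaf] + "*" for leaf in cur_col}
    return likelihood(tree, joint, Q) / likelihood(tree, marginal, Q)

# Hypothetical tree: an internal node is a list of (child, branch_length) pairs.
TREE = [("human", 0.1), ([("mouse", 0.2), ("rat", 0.15)], 0.3)]

if __name__ == "__main__":
    Q = rate_matrix()
    prev = {"human": "C", "mouse": "C", "rat": "T"}
    cur = {"human": "G", "mouse": "G", "rat": "G"}
    print(conditional(TREE, prev, cur, Q))
```

Under a first-order Markov assumption of this kind, the log-likelihood of an alignment would be the sum of the logarithms of such conditionals over sites. As a sanity check on the sketch, setting `cpg_boost=1.0` makes the toy model factor into two independent sites, so the conditional no longer depends on the previous column.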

Key Words: neighbor-dependent substitution • CpG effect • codon model • expectation maximization • substitution rate matrix




