Skip Navigation


MBE Advance Access originally published online on November 4, 2008
Molecular Biology and Evolution 2009 26(1):217-230; doi:10.1093/molbev/msn244
This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (PDF) Freely available
Right arrow Supplementary Data
Right arrow An erratum has been published
Right arrowOA All Versions of this Article:
26/1/217    most recent
msn244v1
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Google Scholar
Right arrow Articles by Henn, B. M.
Right arrow Articles by Mountain, J. L.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Henn, B. M.
Right arrow Articles by Mountain, J. L.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

© 2008 The Authors
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.


Research Articles

Characterizing the Time Dependency of Human Mitochondrial DNA Mutation Rate Estimates

Brenna M. Henn*, Christopher R. Gignoux{dagger},1, Marcus W. Feldman{ddagger} and Joanna L. Mountain*,{dagger}

* Department of Anthropology, Stanford University
{dagger} 23andMe, Inc., Mountain View, CA
{ddagger} Department of Biological Sciences, Stanford University

E-mail: bmhenn{at}stanford.edu.


    Abstract
 TOP
 Abstract
 Introduction
 Methods
 Results
 Discussion
 Conclusion
 Supplemental Material
 Acknowledgements
 References
 
Previous research has established a discrepancy of nearly an order of magnitude between pedigree-based and phylogeny-based (human vs. chimpanzee) estimates of the mitochondrial DNA (mtDNA) control region mutation rate. We characterize the time dependency of the human mitochondrial hypervariable region one mutation rate by generating 14 new phylogeny-based mutation rate estimates using within-human comparisons and archaeological dates. Rate estimates based on population events between 15,000 and 50,000 years ago are at least 2-fold lower than pedigree-based estimates. These within-human estimates are also higher than estimates generated from phylogeny-based human–chimpanzee comparisons. Our new estimates establish a rapid decay in evolutionary mutation rate between approximately 2,500 and 50,000 years ago and a slow decay from 50,000 to 6 Ma. We then extend this analysis to the mtDNA-coding region. Our within-human coding region mutation rate estimates display a similar, though less rapid, time-dependent decay. We explore the possibility that multiple hits explain the discrepancy between pedigree-based and phylogeny-based mutation rates. We conclude that whereas nucleotide substitution models incorporating multiple hits do provide a possible explanation for the discrepancy between pedigree-based and human–chimpanzee mutation rate estimates, they do not explain the rapid decline of within-human rate estimates. We propose that demographic processes such as serial bottlenecks prior to the Holocene could explain the difference between rates estimated before and after 15,000 years ago. Our findings suggest that human mtDNA estimates of dates of population and phylogenetic events should be adjusted in light of this time dependency of the mutation rate estimates.

Key Words: mutation rate • time dependency • mitochondria • human evolution • calibration


    Introduction
 TOP
 Abstract
 Introduction
 Methods
 Results
 Discussion
 Conclusion
 Supplemental Material
 Acknowledgements
 References
 
"Mutation rates" are used to estimate the timing of speciation, population splits, and the divergence of genetic lineages. Estimates of the average rate at which mitochondrial DNA (mtDNA) mutates vary dramatically depending on the data and method used for estimation. On average, human pedigree–based rates (estimates based on mtDNA sequences derived from close relatives) are almost an order of magnitude higher than estimates obtained by using species phylogenies of humans and chimpanzees (Ward et al. 1991Go; Forster et al. 1996Go; Parsons et al. 1997Go; Howell et al. 2003Go) (table 1). Only two estimates of the within-human (i.e., intraspecific) phylogeny-based mutation rate are routinely cited, but these estimates are 3-fold lower than the pedigree rate (Stoneking et al. 1992Go; Forster et al. 1996Go). Additionally, a similar mutation rate discrepancy is seen in within-human Y chromosome short tandem repeats (STRs) (Heyer et al. 1997Go; Forster et al. 2000Go; Kayser et al. 2000Go; Zhivotovsky et al. 2004Go). Because of the discrepancy, divergence time estimates based on pedigree-based rates are more recent than divergence times based on phylogeny-based mutation rates. Most human mtDNA divergence time estimates are obtained using the phylogenetic rate (Vigilant et al. 1991Go; Watson et al. 1996Go; Richards et al. 2000Go; Roostalu et al. 2007Go) but without theoretical justification for the phylogeny-based rather than the pedigree-based rate.


View this table:
[in this window]
[in a new window]

 
Table 1 Compilation of Published Human mtDNA Mutation Rates by Method Estimated

 
Recently, the time dependency of mtDNA mutation rates has been recognized as a general phenomenon applying to many species (e.g., primates, birds, and fish) (Ho et al. 2005Go; Ho and Larson 2006Go; Burridge et al. 2008Go). Higher mutation rate estimates are obtained from events calibrated with "more" recent dates, such as pedigrees, and lower estimates are from events calibrated with older dates. Ho et al. (2005)Go suggest that for primate and avian mitochondrial data, there is a decline in the effective mutation rate, and this decline reaches an equilibrium phylogenetic rate after about 1–2 Ma (i.e., the "elbow" of the curve is in the 1–2 Ma range). However, the characterization of the beginning of the decline is poor given the small number of estimates based on recent dates, between 2 Ma and the present. Describing the precise rate of time-dependent decay using younger calibration points becomes particularly important when correcting for the mutation rate in order to then estimate divergence times (Ho et al. 2005Go). Mutation rates estimated from relatively young time series of ancient DNA samples tend to fall closer to pedigree-based rate estimates, possibly because ancient DNA exposes genetic diversity that has been lost through time-dependent processes such as natural selection (Lambert et al. 2002Go; Ho et al. 2007Go; Kemp et al. 2007Go). Compiling additional recent mutation rate estimates using phylogeny-based methods is thus necessary in order to understand how quickly the mutation rate decays, the impact of the estimation method, and choice of the appropriate mutation rate estimate for dating population history events. The concept of time dependency has been subject to some debate, particularly concerning human mtDNA mutation rates (Bandelt 2008Go; Howell et al. 2008Go). One goal of this paper is to ask if there is a clear trend in human data using established methods of lineage dating.

Human pedigree–based mutation rates are estimated by comparing parent/offspring pairs or deep-rooted familial lineages (e.g., Heyer et al. 2001Go) at particular loci and counting the number of novel mutations per pair, divided by the number of meioses (i.e., number of pair comparisons). This approach has been used to estimate the spontaneous mutation rate for the 1.1-kb mitochondrial control region. Estimates of control region rates, including hypervariable regions (HVR)I and HVRII, range from 0.0 to 2.91 mutations/bp/My (Howell et al. 2003Go). All mutation rates discussed here will be in the form of divergence rates, the mutation rate along two lineages, unless otherwise noted.

Phylogeny-based rates range from 0.12 to 0.38/bp/My for the mitochondrial control region (table 1). Given the inherent time averaging over multiple generations in a phylogeny and the accumulation of multiple mutations, phylogeny-based rates are less affected by stochastic fluctuations than pedigree studies. Phylogeny-based mutation rates are estimated by first constructing either a gene genealogy or a species phylogeny. In the latter case, one then calibrates the species split with external paleontological evidence. The number of mutations between two groups is subsequently computed, either by averaging the number of differences between species or by taking two sequences with the greatest number of differences. The number of mutations between two species lineages is then divided by the externally derived divergence date. This method has been applied to phylogenetic mutation rates obtained by comparing modern human and chimpanzee mtDNA sequences (Ward et al. 1991Go; Tamura and Nei 1993Go; Horai et al. 1995Go). This method assumes that the gene most recent common ancestor (MRCA) and the population divergence of two species occur at roughly the same time, such that lineage sorting only introduces minimal discrepancy. Estimates of within-species mutation rate from gene genealogies, such as reduced median networks of sequence data, can be made by using estimators such as {rho} (i.e., the average number of mutations from each derived haplotype to the ancestral haplotype) in conjunction with external calibration information for the clade (Forster et al. 1996Go). For the remainder of this paper, we distinguish between the interspecific phylogeny-based mutation rate and the mutation rate based on intraspecific gene genealogies (hereafter the "genealogy-based mutation rate"), consistent with an earlier study (Endicott and Ho 2008Go).

Although research over the past 10 years has clearly established the discrepancy between pedigree-based and phylogeny-based estimates of mutation rates, there is no consensus on the causes of this discrepancy. Heyer et al. (2001)Go suggest that pedigree estimates may be affected by rate heterogeneity among sites, picking up only the "fast" mutations whereas phylogenetic methods capture "slow" mutations. Zhivotovsky et al. (2004)Go similarly explain the discordance between Y-STR mutation rate studies by invoking rate heterogeneity among different STR loci. Over long time periods, homoplasy and back mutation obscure our ability to detect the number of mutations that have occurred along the ancestral lineages when we sample contemporary individuals. Phylogenetic analyses are expected to be affected by homoplasy and back mutation, especially at hot spots, but pedigree studies are not.

Denver et al. (2000)Go suggested an alternative, that natural selection may play a role in the mitochondrial discrepancy by preventing deleterious mutations in the coding region from rising in frequency and thus by background selection, removing variation in the control region as well (see also Kivisild et al. 2006Go). However, an analytic study by Woodhams (2006)Go indicated that very large effective population sizes would be needed to maintain population survival, given the mutational load of deleterious mutations if purifying selection alone were to explain the decay in mutation rate estimates. In his model, maximum likelihood estimates of population sizes in primates and birds, based on data in Ho et al. (2005)Go, suggested that the population sizes would be unrealistically large (Woodhams 2006Go). Other possible explanations for the discrepancy in rates include factors such as sequencing errors, non–germ-line mutations, heteroplasmy, and ancestral polymorphism (Ho et al. 2005Go, 2007Go; Santos et al. 2005Go, 2008Go). For example, Santos et al. (2005)Go examined one deep-rooting pedigree and suggested that correcting for gender (i.e., removing novel mutations occurring in male individuals) decreases the difference between phylogeny-based and pedigree-based mutation rate estimates.

If the discrepancy between pedigree-based and phylogeny-based mutation rates is caused by processes that occur on long timescales, then the use of phylogenetic rates may more accurately estimate divergence time than pedigree rates. Phylogenetic rates are estimated with data sets of contemporary sequences, which may not represent the full diversity of haplotypes throughout the past. Instead, the loss of sequence diversity through homoplasy/back mutation, natural selection, and genetic drift creates a time-dependent, evolutionary mutation rate. Divergence time estimates computed using evolutionary mutation rates can therefore be more accurate than those estimates derived using pedigree-based mutation rates that do not account for population and mutation history. That is, if the number of mutational differences estimated for two sequences, averaged over the number of generations in the combined lineages, is lower than the number of mutations observed in one generation, then the phylogenetic rate more accurately estimates older divergence dates. However, the calibration time for an event (e.g., based on archaeological evidence) and the time of coalescence for a set of lineages may not necessarily be aligned, and substantial error between actual and estimated mutation rates is possible (fig. 1).


Figure 1
View larger version (29K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
FIG. 1.— Schematic illustrating expected accuracy when calibrating mutation rates from archaeological evidence and gene genealogies. Horizontal lines indicate a hypothetical time of initial settlement of a geographic location inferred from archaeological data. When the sample TMRCA and the time of initial settlement coincide, the mutation rate estimate will be accurate for that genealogy.

 
A primary tenet of the neutral theory of molecular evolution is that the neutral mutation rate is equal to the fixation rate of an allele in a species (i.e., the "substitution" rate) (Kimura 1980bGo). At times, "mutation rate" and "substitution rate" have been used interchangeably in the fields of population genetics and molecular evolution. Empirical evidence now suggests that the spontaneous mutation rate in an individual (approximated by the pedigree-based estimates) and the fixation rate of an allele in a species (approximated by phylogeny-based estimates) are not equivalent. In this current context, we distinguish between the spontaneous mutation rate, substitution rate, and the "genealogical" mutation rate; the last refers to the effective mutation rate of intraspecific lineages over multiple generations. We retain the use of the term "substitution" when referring to models of the process of nucleotide substitution (e.g., Kimura 2-parameter model [K2P]).

In this paper, we present a set of new genealogy-based estimates of the mtDNA mutation rate using both hypervariable and coding region data, which are calibrated using within-human divergence events. We also investigate whether the impact of multiple hits can explain the discrepancy between the different methods of mutation rate estimation.


    Methods
 TOP
 Abstract
 Introduction
 Methods
 Results
 Discussion
 Conclusion
 Supplemental Material
 Acknowledgements
 References
 
In order to better characterize the decline in mtDNA genealogical mutation rates over time, we have generated new genealogy-based rates from time periods of 2,500 to 50,000 years ago. We then explore one possible explanation for the cause of the mtDNA mutation rate discrepancy by assessing the impact of homoplasy on dating lineage splits. For our analyses of this discrepancy, we use the average pedigree rate 0.95/bp/My, calculated from a meta-analysis of pedigree studies (Howell et al. 2003Go).

Phlyogeny-Based HVRI Rate Estimates
We have used published mtDNA data to generate 14 genealogy-based point estimates of the human mitochondrial control region mutation rate (see table 2; supplementary table S1, Supplementary Material online; fig. 2).


View this table:
[in this window]
[in a new window]

 
Table 2 Genealogy-Based HVRI Mutation Rate Estimates

 

Figure 2
View larger version (9K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
FIG. 2.— Evidence of rapid decay of genealogy-based HVRI mutation (divergence) rate estimates (from table 2). Black dots indicate {rho}-based estimates. Vertical gray lines indicate the standard error for each estimate; horizontal gray lines indicate the bounds of the archaeological calibration (time of initial settlement). The black diamond indicates the mean pedigree-based estimate, and the vertical gray line at the pedigree-based estimate represents the 95% CI from Howell et al. (2003Go). Open circles: alternative mutation rate estimates from Papua New Guinean, Australian, and Javan haplogroups given the oldest possible archaeological estimate for initial settlement.

 
Published sequence data for the HVRI were compiled from populations in the Canary Islands, Polynesia, Micronesia, North America, Taiwan, Indonesia, and Oceania. Population data sets were chosen based on the availability of data, whether the population was geographically isolated, available archaeological dates for the time of first human arrival, haplotypic data from neighboring regions, and indigenous haplotypes for that region. Estimates were first generated on the longest stretch of sequence data available for each data set (table 2) and then truncated for cross-data set comparison to 190 bp (16,184:16,373) (supplementary table S1, Supplementary Material online). Each data set was analyzed according to the principles of founders’ analysis (Richards et al. 2000Go) with some modification. First, each data set was sorted according to predefined terminal haplogroups, using the published haplogroup designations within each paper. Haplogroups that were shared across populations in other islands or geographic regions were eliminated. However, if the haplogroup was present in neighboring populations where genetic and historical evidence suggests that the directionality of recent gene flow was from the selected region to the neighboring region (e.g., New Britain to Vanuatu), then the haplogroup was included. Potential indigenous haplogroups were then sorted according to age. Where ages were provided in the published literature, the oldest haplogroup was chosen. When ages were not provided, we calculated them de novo on potential haplogroups according to the method described below in order to determine the oldest haplogroups (table 2; supplementary table S2, Supplementary Material online).

For each haplogroup, we generated a median-joining network using Network 4.5 (http://www.fluxus-engineering.com) (Bandelt et al. 1999Go). Sites were not weighted in order to maintain equivalency across haplogroups. Networks did not need preprocessing. Using Network 4.5, we calculated two summary statistics for each haplogroup: the mean pairwise difference, {pi} (i.e., the mean number of mutational differences between pairs of sequences), and the mean number of mutations from each derived haplotype to the ancestral haplotype, {rho} (Forster et al. 1996Go). We chose an ancestral haplotype for each network by selecting the node that generated the minimum {rho} value, often the central haplotype in a star expansion. Then, by dividing the {rho} and {pi} (time to the MRCA) estimates by the oldest available, well-supported archeology-based dates, we calculated genealogy-based estimates of divergence rate for each island-specific mtDNA lineage. A minimum {rho} value produces the lowest possible mutation rate estimate. In contrast, the genealogical mutation rate estimates are expected to be higher (i.e., faster) when the time to the MRCA is older than the lineage arrival on the island, assuming that the archeology-based dates are reasonably accurate (fig. 1). Given that we always chose the oldest haplogroup in a sample set, we think it is more likely that we would overestimate the rate, rather than underestimate the rate (as might happen with a minimum {rho}). We consider rates obtained in this manner as potentially biasing our mutation rate average toward the higher pedigree-based mutation rate.

Archaeological Estimates for the Time of First Arrival
The majority of human mitochondrial phylogeography papers use a hypervariable mutation rate estimate by Forster et al. (1996)Go based on a northern Native American haplogroup A2. In recognition of the overwhelming significance of this single estimate for published research, we reanalyzed their Siberian/Eskimo/NaDene A2 data with updated archaeological information (table 2; supplementary table S3, Supplementary Material online).

Although information for most of the archaeological-based arrival times is either limited or relatively straightforward (supplementary table S3, Supplementary Material online), dates for the first arrival of humans in the Americas and in Australia are more problematic. Due to the large quantity of information and the ongoing debate over the time of arrival for the first American populations, selection of archaeological dates in the Americas again merits explicit discussion. For calibration dates in the Americas, we rely primarily on recent revisions to radiocarbon estimates of Clovis-age archaeological sites that were presented by Waters and Stafford (2007)Go. The oldest, secure calibrated radiocarbon dates from South America average about 13,000 years ago, with one standard deviation ranges of approximately 200 years. The Clovis-age estimates from North America differ only slightly, ranging from 13,050 to 13,150 years ago. However, earlier radiocarbon dates do exist, ranging from 11,750 to 13,000 14C years ago for estimates made in Alaska, Oregon, and Wisconsin. We therefore use a calibrated point estimate of 14,300 years corresponding to the lower range Schaefer, Wisconsin (Joyce 2006Go), and Paisely Cave, Oregon estimates (Gilbert et al. 2008Go), and a maximum date of 15,000 years for the initial entry into North America (supplementary table S3, Supplementary Material online).

Phylogeny-Based Coding Region Rate Estimates
Although the HVR has classically been the most utilized locus for phylogeographic studies of human prehistory, advances in sequencing technology have recently made it feasible to use whole mitochondrial genomes for the same purpose. We therefore extended the analysis to explore the possibility of time dependency in coding region (15,420 bp) mutation rate estimates. We report genealogy-based estimates for 11 haplogroups obtained from Polynesian, Native American, Oceanic, and Japanese populations (table 3, fig. 3). The founders’ analysis step described above in the HVRI data set was used to provide prior information on indigenous haplogroups that had limited gene flow into neighboring regions, and haplogroups were chosen on that basis. Whole mtDNA genome sample size for many of the haplogroups was limited; thus, it was not possible to estimate U6b1, B4a2, B5a2, and M46 time to the most recent common ancestor (TMRCAs) as done for the HVRI data set. B4a1a1 data from Micronesian and Polynesian populations were combined for a single estimate calibrated by the earliest archaeological evidence of the Austronesian expansion in Micronesia.


View this table:
[in this window]
[in a new window]

 
Table 3 Genealogy-Based Coding Region Mutation Rate Estimates

 

Figure 3
View larger version (8K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
FIG. 3.— Coding region indicates mutation (divergence) rate decline similar to that of HVRI (fig. 2). Diamond: pedigree-based estimate (Howell et al. 2003Go). Black dots: phylogeny-based mtDNA-coding region {rho} estimates (table 3). Vertical gray lines: standard error for each estimate. Horizontal gray lines: bounds of the archaeological calibration (time of initial settlement). Open circles: alternative mutation rate estimates from Papua New Guinean and Australian haplogroups given the oldest possible archaeological estimate for initial settlement.

 
Expectation from Substitution Models
To investigate the impact of multiple hits/homoplasy on phylogeny-based mutation rate estimation, we explored substitution models that explicitly incorporate multiple hits. Starting with the pedigree-based rate as the spontaneous mutation rate, we calculated the expected molecular divergence given a model of nucleotide substitution. The expected divergence is the number of "observed" substitutions for a pair of lineages diverging over an interval of time. In our models, the intervals of time were all less than 6.5 My. The expected divergence was then calibrated using the known (i.e., input) time of divergence to obtain a genealogical mutation rate after multiple hits/homoplasy have obscured mutations. We present results from the K2P model (Kimura 1980aGo; Jukes and Cantor [1969]Go; [Hasegawa et al. 1985Go] models were explored, results shown in supplementary fig. S1, Supplementary Material online). The HVR is thought to have substantial rate heterogeneity among sites; therefore, we used the K2P+{gamma} modification to incorporate rate differences between sites (Jin and Nei 1990Go). We used a shape parameter ({alpha}) of 0.4 (for the {gamma} distribution) that was estimated from human HVRI sequence data (Excoffier and Yang 1999Go). To capture the impact of {alpha} on the shape of the K2P curve, we generated curves for a range of different {alpha} values less than 1; values 0.8 and 0.2 are shown in figure 4. A similar analysis for the coding region is presented in supplementary figure S2 (Supplementary Material online).


Figure 4
View larger version (15K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
FIG. 4.— Difference between mtDNA mutation rate estimates and rates expected under the K2P + {gamma} model. Phylogeny-based mutation rate estimates are drawn from tables 1 and 2. Solid black line: effective mutation rate under our general model of K2P + {gamma} assuming published pedigree-based mutation rate (0.95/bp/My) taken from Howell et al. (2003Go), ts/tv = 20/1, {alpha} = 0.4. We explored the sensitivity of the model parameters, examples for each parameter are displayed. Solid gray lines: effective mutation rate given alternative instantaneous "pedigree" mutation rates 1.20 and 0.75/bp/My. Dotted line: effective mutation rate given ts/tv = 100/1. Dashed lines: effective mutation rate given alternative {alpha} = 0.8 and 0.2.

 

    Results
 TOP
 Abstract
 Introduction
 Methods
 Results
 Discussion
 Conclusion
 Supplemental Material
 Acknowledgements
 References
 
Trends in Genealogy-Based HVRI Mutation Rate Estimates
Genealogy-based rate estimates between 2,500 and 5,000 years ago are indistinguishable from pedigree-based mutation rate estimate (tables 1 and 2). This remains true across different haplogroups and populations from vastly different geographic locations, for example, Canary Islands versus Samoa. For all samples, rate estimates based on {rho} are slightly higher than estimates based on {pi} (table 2). Even the {pi}-based estimates less than 5,000 years ago are generally similar to the pedigree-based estimates and within the range of the deep-rooted pedigree-based estimates, none of which extend as far back as 2,500 years ago (tables 1 and 2).

The time period between 5,000 and ~15,000 years ago represents both a break in our data set and a sudden decline in estimated mutation rate (fig. 2). Estimates for the intermediate period roughly 15,000 years ago are about 40% lower than estimates calibrated on arrival dates less than 5,000 years ago. Native American haplogroups (i.e., A–D, X2a) dating to the initial arrival of humans into the American continents show intermediate {rho}-based mutation rate estimates of 0.50–0.57/bp/My during approximately 13,000–14,300 years ago calibration dates (table 2). Specifically, a recalibration of a widely cited Native American estimate made by Forster et al. (1996)Go resulted in an increase from 0.36 to 0.57/bp/My for the combined North American Amerind data set (tables 1 and 2).

Mutation rate estimates made using initial time-of-arrival calibrations earlier than 15,000 years ago are even lower (fig. 2). Over the span of 20,000–50,000 years ago, mtDNA mutation rate estimates range from 0.22 to 0.48/bp/My. Our phylogeny-based mutation rate estimates older than 20,000 years fall outside the 99.5% confidence interval (CI) calculated by Howell et al. (2003)Go. The low rates from Taiwan B5a2a and Papua New Guinea II were checked with alternative old, indigenous lineages from the same population data sets (supplementary table S2, Supplementary Material online). Taiwan B4a2 was virtually identical to B5a2a (0.26/bp/My), and Papua New Guinea I and II displayed similarly low mutation rates (0.23–0.28/bp/My). Our oldest calibrated samples, Australia M42 and Java M46, show a small increase in the mutation rate relative to younger haplogroups (0.48 and 0.38/bp/My, respectively), although both estimates fall well below the pedigree-based mutation rate.

Mutation Rate Heterogeneity across the HVR
Mutation heterogeneity across the HVR can have a dramatic effect on the rate estimates, swamping other factors. For example, adjusting the length of HVRI for Taiwan B5a2a from 392 (16,006:16,397) to 190 bp (16,184:16,373) does not alter the {rho} estimate because the first portion of the sequence is not polymorphic (tables 2 and 3). The resulting mutation rate estimate based on Taiwan B5a2a is therefore either 0.26 or 0.54/bp/My (table 2, supplementary table S1, Supplementary Material online). The use of longer portions of the HVR has become more common in recent analyses, and the application of an appropriate (i.e., slower) mutation rate becomes important in order to prevent underestimation of divergence events. The lowest rate estimate in our data set comes from Japan M7a, but it is partly explained by the inclusion of 200 bp from the less mutable latter half of the HVRI in the data set. The majority of our samples did not extend past base pair 16,375, and a comparison of the Japan M7a rate with and without the extended 200 bp demonstrates that this latter region has very few mutations, increasing from 0.22 to 0.39/bp/My (table 2; supplementary table S2, Supplementary Material online).

When site heterogeneity is controlled for by using identical loci across the sample set, the mutation rate estimates are markedly lower for samples associated with events dated more than 5,000 years ago (supplementary table S1, Supplementary Material online) with the Australian M42 sample as an outlier. The Canary Islands sample showed no polymorphism in the 16,184–16,373 bp stretch. The Native American sample was restricted to the X2a haplogroup as sequence data were not available for the same data set reported for the A–D haplogroups in table 2.

Coding Region Results
We present 11 new phylogeny-based coding region mtDNA mutation rate estimates (fig. 3). The Polynesian-specific B4a1a1 haplogroup was the youngest haplogroup in our data set and gave an estimate of 0.046/bp/My. Estimates from Native American haplogroups A2, B2, and D1 averaged 0.049/bp/My (n = 167), with the D1 rate elevated relative to the other two American haplogroups (fig. 3). There appears to be virtually no change in the coding region rate over the 10,000 year span between the Polynesian and American estimates. Very few estimates of a pedigree-based mtDNA-coding region mutation rate were available in the literature. We cite a coding region divergence rate of 0.06/bp/My, obtained from an average of four nondisease mutations over 462 transmission events (i.e., generations) (Howell et al. 2003Go). In comparison to this tentative pedigree-based estimate, our rates from the youngest haplogroups are slightly lower.

The coding region mtDNA mutation rate estimates appear to decline by almost half between 15 and 35 thousand years ago (fig. 3). Estimates from H3, M7a, and M27 show a dramatic decrease, consistent with expectations from the HVRI estimates during this time period (fig. 3). These estimates are similar to human–chimpanzee-based coding region rates (table 1) (Ingman et al. 2000Go; Tang et al. 2002Go; Mishmar et al. 2003Go). The coding region estimates from Papua New Guinean (P1) and Australian (M42, O) populations appear to experience a slight "uptick" or small increase in rate relative to the preceding younger lineages. These three haplogroups average 0.033/bp/My, assuming a calibration date of 45 thousand years ago. The estimates from these older haplogroups have greater standard errors given the smaller sample sizes for each lineage. However, even the Oceanic-specific haplogroup P (n = 39, including P1) gives an elevated rate of 0.040/bp/My, assuming a colonization time of 45 thousand years ago for Sahul.

Although coding region mtDNA estimates also show a decline at about the Last Glacial Cycle, the curve is less pronounced than in the HVR data. The HVR phylogeny-based estimates decline by about 5-fold, whereas the coding region rates decline only about 2-fold in comparison to pedigree-based estimates.

Accounting for Homoplasy Using Substitution Models
Using the K2P+{gamma} substitution model of mutational evolution, we calculated the expectation for HVRI and coding region mutation rates calibrated for different time intervals over 6.5 My (fig. 4; supplementary figs. S1 and S2, Supplementary Material online). Assuming the Howell et al. (2003)Go pedigree-based rate, the expected mutation rate under a multiple hits model does not match the decay curve of the within-human genealogical rates. Allowing for variation in the pedigree-based HVRI mutation rate by increasing and decreasing the rate by about 25% did not alter the result that multiple hits could not account for the swift decline in genealogy-based rates (fig. 4). Additionally, assuming higher transition/transversion ratios (ts/tv) or increased rate heterogeneity ({alpha}) did move the knee of the expectation curve closer to the genealogy-based estimates but not enough to account for the actual decay curve obtained from table 2 (examples shown in fig. 4). Other substitution models such as Jukes and Cantor (1969Go) or Hasegawa–Kishino–Yano (Hasegawa et al. 1985Go) did not significantly alter our conclusion (supplementary fig. S1, Supplementary Material online). The K2P+{gamma} model does, however, appear to be consistent with both the HVRI and coding region human versus chimpanzee phylogeny-based mutation rate estimates, calibrated at 5–6 Ma (fig. 4, table 1; supplementary fig. S2, Supplementary Material online).


    Discussion
 TOP
 Abstract
 Introduction
 Methods
 Results
 Discussion
 Conclusion
 Supplemental Material
 Acknowledgements
 References
 
Pedigree-Based Estimates of the Mutation Rate
Howell et al. (2003)Go conducted a meta-analysis of pedigree studies that estimated the HVR mutation rate. Germ-line mutations in the ~1,100-bp control region totaled 28 mutations from 2,633 meioses. The average divergence rate over all studies was 0.95/bp/My, with a 99.5% CI of 0.53–1.57/bp/My. Their CI was calculated according to the normal approximation of the binomial distribution. Several variables affect the estimation of the pedigree-based mutation rate. The Howell et al. (2003)Go study included two deep-rooted pedigrees spanning multiple generations (see also table 1). Removing these two deep-rooted pedigree studies raises the pedigree-based rate to 1.19/bp/My. Generation time assumptions can also affect the estimated rate. Most pedigree studies assume a mean female generation time of 20 years. Changing the generation time from 20 to 30 years, in line with empirical estimates (Tremblay and Vezina 2000Go), decreases the pedigree-based rate to 0.80/bp/My. Additionally, reported deep-rooted pedigree estimates differ slightly between Howell et al. (2003)Go and our table 1 due to different criteria for heteroplasmic mutations (e.g., 0.63 vs. 0.48/bp/My for Sigurdardóttir et al. [2000]Go).

Although the meta-analysis of the mitochondrial control region by Howell et al. (2003)Go encompasses a large number of transmission events, the inherently small number of mutations generated over roughly 1,100 bp is governed by a highly stochastic process. Therefore, the pedigree-based estimates of the mitochondrial mutation rate lack precision. Six of the divergence rate estimates from previous pedigree studies fall outside the 99.5% CI calculated from the average number of mutations over 11 pedigree studies (Howell et al. 2003Go). Given the variation in deep-rooted pedigrees, generation time, heteroplasmy, and the relatively small number of detected mutations, the average pedigree-based estimate should be viewed as potentially subject to revision.

It is interesting then, in the context of stochastic uncertainty, that our phylogeny-based mutation rate estimates younger than 5,000 years old are generally similar to the average pedigree-based estimate (tables 2; supplementary table S1, Supplementary Material online). The average of {rho}-based rate estimates for events younger than 5,000 years is 0.895/bp/My, virtually identical to the pedigree-based average and well within the CI calculated by Howell et al. (2003)Go. Although neither the pedigree meta-analysis nor our phylogenetic analysis has accumulated enough data to be very precise, the similarity of the two average mutation rate estimates may be accurately reflecting the effective HVR mutation rate over short evolutionary time spans.

mtDNA-Coding Region Estimates Comparison
Two recent publications estimate coding region mutation rates for the haplogroups Q, P, H, and H3 using Bayesian methods to calculate the TMRCA of each haplogroup (Atkinson et al. 2008Go; Endicott and Ho 2008Go). Haplogroup Q is present only in Papua New Guinea, Australia, and Melanesia (Friedlaender et al. 2005Go). The Q estimate (Atkinson et al. 2008Go) of 0.034/bp/My is very similar to our estimates for other haplogroups indigenous to Papua New Guinea and Australia (tables 1 and 3).

Using haplogroups P, H1, and H3 as within-human (or "internal") calibration points, Ho and Endicott (2008)Go obtain a mean divergence rate of 0.04076/bp/My for the mtDNA-coding region. This rate is intermediate within the spectrum of mutation rates from our table 3. It is comparable to our rate for Oceanic P, but higher than all other estimates calibrated with dates older than 15,000 years. The Ho and Endicott (2008)Go rate was estimated using an HKY+{gamma} substitution model to correct for homoplasy/back mutation. We present estimates based on data that were not corrected for saturation, with the exception that back mutations in the coding region were taken into account if the back mutation had been explicitly discussed in the literature. It is possible that saturation correction may account for the difference between their estimate and ours.

The distribution of haplogroup P extends from Indonesia into Papua New Guinea and Australia (Lum and Cann 2000Go; Hill et al. 2007Go). Because the P distribution encompasses both Sunda and Sahul, we focused our attention on the P1 haplogroup, which fits the geographically restrictive criteria of our founders’ analysis (see Methods). We present a calibrated P coding region rate estimate for comparative purposes (table 3). Our mutation rate estimate based on haplogroup P is substantially higher than other estimates calibrated at the same time period (45 thousand years ago) (see below for further Discussion).

Haplogroups H1 and H3 purportedly originated during the last Ice Age in the southern refugia of Europe (Achilli et al. 2004Go; Loogvali et al. 2004Go; Endicott and Ho 2008Go). The last glacial maximum (LGM) extended from approximately 22 to 15 thousand years ago, followed by a secondary glacial cycle, called the Younger Dryas, at 12 thousand years ago. Haplogroups H1 or H3 could have originated at any time during this 10,000-year period or even before the LGM. Given the imprecision associated with this calibration, we do not feel that they represent ideal haplogroups for mtDNA mutation rate calibration. The H1 haplogroup, in particular, may not have a population history consistent with the LGM origin scenario. Roostalu et al. (2007)Go suggested that the multimodal mismatch distribution for H1 might support a pre-LGM origination for this haplogroup. Furthermore, the frequency distribution for H1 extends well into the Near East and western Eurasia. The difference in H1 genetic diversity between the Near East and Europe appears to be negligible; TMRCA estimates for H1 were 19 and 22 thousand years ago in the two regions, respectively (Roostalu et al. 2007Go). For these reasons, H1 has not been included in our analysis. Although haplogroup H3 suffers from the same LGM calibration imprecision as H1, the geographic distribution appears to be better restricted to Europe, with very little diversity in the Near East (Roostalu et al. 2007Go, results not shown). We therefore included H3 in our mutation rate estimate panel and chose 18,000 years ago as the calibration time in order to be comparable with Endicott and Ho (2008)Go (who used an 18,000 years ago estimate as the mean for their prior distribution of the H3 haplogroup).

Native American Mutation Rate Estimates
Several current papers have debated the possibility that Native American mtDNA supports an early colonization of the Americas, prior to 15,000 years ago (Fagundes, Kanitz, and Bonatto 2008Go; Fagundes, Kanitz, Eckert, et al. 2008Go; Ho and Endicott 2008Go). Fagundes, Kanitz, Eckert, et al. (2008Go) used a phylogeny-based mutation rate for their coding region estimates of the time of expansion of Native American haplogroups, A–D and X2a. Subsequently, Ho and Endicott (2008)Go used their new genealogy-based mutation rate estimate to reanalyze the Native American data of Fagundes, Kanitz, Eckert, et al. (2008Go). The reanalyzed estimates timed the expansion of Native American haplogroups at younger than 15,000 years ago. Although we agree with Ho and Endicott (2008)Go that genealogy-based mutation rates should be used for estimates of population events, it may be premature to use haplogroups with uncertain archaeological calibrations (as discussed above) to date events that have a strong archaeological signature, such as the initial arrival of Native Americans. Archaeological evidence for the initial colonization of the Americas is generally associated with a timing uncertainty of only 1,000–2,000 years, dramatically less than the European LGM or settlement of Sahul, which have timing uncertainties of 10,000–25,000 years. In light of this, we argue that the colonization of the Americas is among the best calibration points currently available to human geneticists. For this reason, we have calculated mutation rate estimates using Native American haplogroups: A2, B1, D1, and X2a (tables 2 and 3). Haplogroup C1 appears to be paraphyletic with respect to American and Asian populations, with the Asian C1a clade joining other basal American C1s in a polytomy (Tamm et al. 2007Go). We therefore did not include C1 in our coding region data set. The Native American mutation rate estimates are, as expected, intermediate between pedigree-based estimates and older genealogical-based estimates (figs. 2 and 3).

Mutation Rate Uptick in Oceanic Samples
The notable exceptions to the described trend of decreasing mutation rates over time involve samples from Oceania including the Papua New Guinean P1 (but not PNG II), Oceanic P, Sundaland M46, and Australian M42 and O (tables 2 and 3). These samples are calibrated during the 45,000–50,000 years ago range and show mutation rate estimates somewhat higher than samples from the 20,000 to 40,000 years ago interval. For example, while still producing a mutation rate estimate lower than the average pedigree-based estimate, the M42 HVRI sample has a rate roughly twice as large as other lineage samples older than 15,000 years (table 2). This mutation rate uptick appears to affect both the HVR and coding region estimates (figs. 2 and 3). Several possible explanations exist:

  1. It is possible that inadequate sampling on Taiwan, Japan, Papua New Guinea, etc., leads to diversity estimates that are too low for islands that have been occupied for such great time depth, that is, over 20,000 years. However, these particular island populations are among the best sampled groups in our data set. Data were gleaned from recent papers with large sample sizes and detailed haplogroup assignment (Tommaseo-Ponzetta et al. 2002Go; Tanaka et al. 2004Go; Trejaut et al. 2005Go).
  2. Lineage-specific deviations from the molecular clock have been demonstrated for some mitochondrial lineages, such as the L2 subclades (Howell et al. 2004Go). The M42 and P haplogroups appear to be subject to lineage acceleration (Gignoux C, in preparation).
  3. It is also possible that populations with high effective sizes initially colonized Australia and therefore retained diversity ancestral to their arrival in the new geographic location. The TMRCA for such samples is expected to be older than the time of arrival based on archaeological estimates. Calculating mutation rates using a TMRCA older than the archaeological-based time of arrival will generate higher/faster mutation rate estimates (fig. 1). In such cases, we would have to assume that the ancestral diversity for a genealogy within each sink sample, which might have been left in the source population, must have subsequently disappeared or been completely carried away with the sink sample. Otherwise, the comparative method of identifying neighboring haplotypes occurring within a designated sink haplogroup would have eliminated the haplogroup for potential mutation rate estimation (see Methods). It is also unlikely that serial founder effects, as envisioned for the initial expansion across Asia into Oceania, would maintain deep diversity within single-copy loci for the resulting end populations (Ramachandran et al. 2005Go). Furthermore, it is unclear why Oceania would be settled by larger populations than the populations that colonized other regions of Asia (e.g., Taiwan or Japan).
  4. It is possible that the archaeological estimates for Oceania are incorrect and underestimate the time of human arrival in each region. Australia, particularly among the locations sampled here, has been the subject of several claims for human arrival earlier than 45,000 years ago (O'Connell and Allen 2004Go). Virtually, no archaeological-based dates exist for the initial dispersal of behaviorally modern humans along the Indian Ocean coastline (from western Asia, across to southeast Asia) (James and Petraglia 2005Go; Petraglia et al. 2007Go). Rising sea levels may have subsequently covered these early coastal locations and prevented archaeologists from identifying and sampling initial sites. It is possible then that the occupation of Sunda and Sahul could have occurred earlier than 45,000–50,000 years ago. In order to generate rate estimates of the same magnitude or lower than Japan and Papua New Guinea, we would need to assume a time of initial arrival of 60,000–70,000 years ago (figs. 2 and 3), almost 15,000–25,000 years ago before there is any archaeological evidence for modern humans outside of Africa or even widespread evidence of modern human behavior within Africa (Klein 2000Go; Henshilwood et al. 2002Go).

In summary, the most likely explanation for the uptick in mutation rates among Oceanic samples is lineage-specific acceleration. Given a hypothesis of lineage-specific acceleration for M42, other Australian indigenous mitochondrial lineages might provide more accurate reflection of the timing of initial arrival. We calculated an alternative HVRI mutation rate estimate for the second oldest phylogenetic clade in Australia, the O/S haplogroup (supplementary table S2, Supplementary Material online). Although the mutation rate does decrease as expected, the relative uptick is not completely eliminated. Additionally, the coding region sample of O shows an increase similar to M42 (table 3).

Archaeological Estimates for the Time of First Arrival
Archaeological-based bounds presented in table 2 are not meant to represent hard bounds for the dates of initial population arrival. Instead, they are estimates based on generally accepted, average archaeological radiocarbon dates. They do not incorporate error in the radiocarbon method or error in the conversion of radiocarbon years to calendar years. Error in the radiocarbon estimate is usually dwarfed by variation in multiple estimates at the same site (i.e., error was usually below 500 years). Error in the conversion of radiocarbon years to calendar years was not available for most estimates. However, radiocarbon calibration curves appear to decrease in accuracy after about 30,000 years ago, and thus, our Oceanic samples may be most affected by this source of error (Fairbanks et al. 2005Go). Depending on which calibration curves archaeologists chose (not known), the error could be on the order of about 5,000 years. Thus, the presented bounds are meant to convey the basic range of uncertainty in the estimates of arrival time (supplementary table S3, Supplementary Material online).

Sensitivity of {rho}
Although a minimizing root was chosen in order to estimate {rho} consistently across different sample populations, an alternative {rho} estimate, based on the presumed ancestral root, does differ from the minimum estimate in some cases. Ancestral root estimates were based on a closely related sister outgroup. In lineages displaying star-like networks, indicating a period of population growth, the minimum root and the ancestral root appeared to coincide. This pattern characterizes population samples with arrival times of less than 5,000 years (for populations, see table 2). Lineages with elongated networks, where nodes consisted of few identical haplotypes on long branches, tended to be associated with initial arrival time dates of greater than 20,000 years. However, both the Australian M42 and North American X2a HVRI samples displayed elongated networks with a single enlarged node of many identical haplotypes (data not shown). In this case, the ancestral node and the most common node did not coincide. The ancestral node mutation rate estimate for North American X2a is 1.00/bp/My as compared with the minimum estimate of 0.47/bp/My. Locating the root haplotype several mutational steps from a node containing a significant proportion of the individuals in the sample thus dramatically increases the {rho} estimate.

Historical information and other mitochondrial genetic lineages provide potential explanations for an enlarged node in an otherwise elongated network. For the North American X2a sample, Algonquian speakers represent the majority of the X2a individuals available for analysis. Algonquian speakers have undergone a recent population and range expansion beginning about 4,000 years ago, which is reflected in data from other mitochondrial haplogroups (Malhi et al. 2001Go, 2002Go).

Sensitivity of {pi}
Sample size has a substantial effect on the mutation rate estimates based on the mean pairwise difference ({pi}) estimator. For several populations, we were able to obtain multiple samples from two different data sets and subsequently apply the same estimation method as described in Methods. Lineage samples with the greater sample size were included in table 2, whereas alternative estimates with smaller sample sizes are reported in supplementary table 2 (Supplementary Material online). When the estimates are compared, a clear relationship emerges between the increase in sample size and a decrease in 2µ{pi}. Papua New Guinea, Australia, and Java (compared with Sundaland) mutation rate estimates with larger sample sizes were 71–80% of the estimates with smaller sample sizes. A decrease in {pi} with an increase in sample size is theoretically expected (Rosenberg and Hirsh 2003Go).

Very small sample sizes could in theory underestimate the TMRCA for a sample from a population growing at a constant rate because not all deep divergences will be observed. However, the probability of capturing both sides of the basal split in a sample of n lineages is (n – 1)/(n + 1) (Saunders et al. 1984Go). This would mean that there is an 87% chance of observing the deepest divergence from just 14 samples, for example, our smallest sample from Palau. There is a 94% chance of observing the deepest branching point given our median sample size of 31 (median of samples listed in table 2). This model assumes random mating among the population from which lineages are drawn and that the lineages are randomly sampled from the study population.

The Impact of Homoplasy Over Time
Mutational hot spots have been documented in the human mitochondrial genome (Howell et al. 2003Go; Kivisild et al. 2006Go). Median-joining networks and the {rho} estimation method can account for sites with hot spot mutations. No differential site weighting scheme was applied to our data during median-joining network construction. Multiple mutation events within a sample are inferred during the network construction process, and these multiple mutations are either incorporated into multiple branches directly or represent reticulation within the network (Bandelt et al. 1999Go) (reticulation in our samples was very low [data not shown]). In this way, the network has revealed homoplasy among different sublineages within a sample that would have otherwise been "hidden" in uncorrected summary statistics, such as the mean pairwise difference. Thus, "extra" mutational events occurring at hot spot sites are incorporated into the {rho} parameter because it measures the number of mutational events from derived haplotypes to an ancestral one along the constructed network branches. However, back mutations along a single branch will not be inferred unless contemporary haplotypes along the branch or derived branches differ from the ancestral state. So although hot spot mutations within samples were generally accounted for in our {rho} estimates, we also asked if substitution saturation models could explain the difference between samples. Although the curve expected under the K2P+{gamma} model does not fit the genealogy-based estimates, it does come close to the human versus chimpanzee phylogeny-based rates (fig. 4; supplementary fig. S2, Supplementary Material online). We expect our homoplasy/back mutation model-based calculations to align with rate estimates that were not corrected for saturation; uncorrected mutation rates should be higher than corrected ones because saturation correct increases the amount of divergence between two lineages. Interestingly, published phylogeny-based estimates that had been corrected for saturation were not always higher than uncorrected estimates (e.g., Tang et al. 2002Go vs. Kivisild et al. 2006Go). Variation among the published human versus chimpanzee phylogeny-based rates is due to the optional use of substitution models to correct for multiple hits (by some but not all the authors), the different substitution models employed, and different sample sets (Hasegawa and Horai 1991Go; Vigilant et al. 1991Go; Horai et al. 1995Go).

There is often little difference (generally 2-fold or less) between our genealogy-based mutation rates older than 20,000 years ago and the phylogeny-based estimates (tables 1 and 3). Once all genealogical lineages in a sample have coalesced to single ancestor, homoplasy/back mutation is the primary mutable process affecting the loss of diversity along lineage. We suggest that sequence saturation is maintaining this lower "equilibrium rate" after all lineages have coalesced into a single ancestral sequence.


    Conclusion
 TOP
 Abstract
 Introduction
 Methods
 Results
 Discussion
 Conclusion
 Supplemental Material
 Acknowledgements
 References
 
We have characterized, for the first time, the shape of the time-dependent rate curve of human mtDNA mutation rate estimates for both the hypervariable and coding regions. We have shown that multiple hits can effectively account for the difference between pedigree-based and human versus chimpanzee phylogeny-based mutation rate estimates. However, the multiple hits model is unlikely to explain the time-dependent decay for the genealogy-based estimates, where rate estimates associated with more recent events are high and then decay over 50,000 years going back in time.

Purifying selection and population bottlenecks may explain the time dependency of within-species mutation rate estimates. Specifically, the genetic diversity estimators {pi} and {rho} are proportional to effective population size (Forster et al. 1996Go). Thus, our human diversity estimates are lower if a population history was characterized by serial bottlenecks as compared with a history of constant population size (Nei et al. 1975Go). Because phylogeny-based mutation rates are calibrated with "arrival time" archaeological information that is blind to later population size fluctuations, the mutation rate estimates based on {pi} and {rho} are also lower than expected. Using simulation, Zhivotovsky et al. (2006)Go showed that microsatellite mutation rates estimated from small populations (haplogroups) undergoing serial bottlenecks are indeed reduced compared with pedigree-based rates. Alternatively, purifying selection acts to eliminate newly arising deleterious alleles and linked diversity throughout both the coding and HVRs (Kivisild et al. 2006Go). This also decreases genetic diversity estimates (Charlesworth et al. 1995Go) and thus the mutation rate estimates.

Our mutation rate estimates from all but the most recent samples are lower than the pedigree-based rate estimate, which is not affected by the above processes of bottlenecks and selection (except for purifying selection on lethal alleles). Purifying selection and population size changes are not mutually exclusive. Both processes could underlie the exponentially decaying mutation rate curves (figs. 2 and 3). However, reduction in population size is expected to reduce the efficacy of natural selection relative to genetic drift.

The striking difference between our mutation rate estimates from before and after 20,000 years ago suggests that demographic history may play an important role. Prior to the LGM lasting between 22,000 and 15,000 years ago, human populations were characterized by small, mobile hunter-gatherer groups that may have been frequently subject to fluctuations in population size. Following the LGM, humans experienced far fewer climatic swings (Mithen 2004Go). Particularly, after the Younger Dryas cycle, agriculture facilitated dramatic population growth and serial bottlenecks were unlikely to substantially reduce the diversity of such large populations. A dramatic increase in population size during the Neolithic period is supported by mtDNA genomes from African, southeastern Asian, and European populations (Gignoux C, Henn B, unpublished data). We propose that a population history consisting of serial bottlenecks followed by recent population growth currently provides the most compelling explanation for the time dependency of human hypervariable and coding region mtDNA mutation rate estimates. It is possible that this type of demographic explanation applies to other species (e.g., birds and fish) (Ho et al. 2005Go; Burridge et al. 2008Go), but this depends on whether the species experiences serial bottlenecks, whether they have large effective population sizes, and if the species is significantly substructured. In this context, it is important that each species be evaluated separately for population history events before a more general conclusion about time dependency can be reached.


    Supplemental Material
 TOP
 Abstract
 Introduction
 Methods
 Results
 Discussion
 Conclusion
 Supplemental Material
 Acknowledgements
 References
 
Supplementary tables S1–S3 and figures S1 and S2 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).


    Acknowledgements
 TOP
 Abstract
 Introduction
 Methods
 Results
 Discussion
 Conclusion
 Supplemental Material
 Acknowledgements
 References
 
We thank Charles Roseman and Tim Weaver for stimulating discussions in the early stages of research. We would like to thank Barbara Engelhardt for assistance with the coding region calculations. We also wish to thank David DeGusta for a close reading of the manuscript. Two anonymous reviewers contributed significantly to the clarity of our manuscript. This work was funded by the Department of Anthropology, Stanford University (to B.M.H.).


    Footnotes
 
1 Present address: Program in Pharmaceutical Sciences and Pharmacogenomics, University of California, San Francisco. Back

Barbara Holland, Associate Editor


    References
 TOP
 Abstract
 Introduction
 Methods
 Results
 Discussion
 Conclusion
 Supplemental Material
 Acknowledgements
 References
 

    Achilli A, Rengo C, Magri C, et al, (21 co-authors). The molecular dissection of mtDNA haplogroup H confirms that the Franco-Cantabrian glacial refuge was a major source for the European gene pool. Am J Hum Genet (2004) 75:910–918.[CrossRef][Web of Science][Medline]

    Atkinson QD, Gray RD, Drummond AJ. MtDNA variation predicts population size in humans and reveals a major southern Asian chapter in human prehistory. Mol Biol Evol (2008) 25:468–474.[Abstract/Free Full Text]

    Bandelt HJ. Time dependency of molecular rate estimates: tempest in a teacup. Heredity (2008) 100:1–2.[CrossRef][Web of Science][Medline]

    Bandelt HJ, Forster P, Röhl A. Median-joining networks for inferring intraspecific phylogenies. Mol Biol Evol (1999) 16:37–48.[Abstract]

    Brown MD, Hosseini SH, Torroni A, Bandelt HJ, Allen JC, Schurr TG, Scozzari R, Cruciani F, Wallace DC. mtDNA haplogroup X: an ancient link between Europe/Western Asia and North America? Am J Hum Genet (1998) 63:1852–1861.[CrossRef][Web of Science][Medline]

    Burridge CP, Craw D, Fletcher D, Waters JM. Geological dates and molecular rates: fish DNA sheds light on time dependency. Mol Biol Evol (2008) 25:624–633.[Abstract/Free Full Text]

    Charlesworth D, Charlesworth B, Morgan MT. The pattern of neutral molecular variation under the background selection model. Genetics (1995) 141:1619–1632.[Abstract]

    Denver DR, Morris K, Lynch M, Vassilieva LL, Thomas WK. High direct estimate of the mutation rate in the mitochondrial genome of Caenorhabditis elegans. Science (2000) 289:2342–2344.[Abstract/Free Full Text]

    Endicott P, Ho SYW. A Bayesian evaluation of human mitochondrial substitution rates. Am J Hum Genet (2008) 82:895–902.[CrossRef][Web of Science][Medline]

    Excoffier L, Yang ZH. Substitution rate variation among sites in mitochondrial hypervariable region I of humans and chimpanzees. Mol Biol Evol (1999) 16:1357–1368.[Abstract]

    Fagundes NJR, Kanitz R, Bonatto SL. The crucial role of calibration in molecular date estimates for the peopling of the Americas—reply to Ho and Endicott. Am J Hum Genet (2008) 83:146–147.[CrossRef][Web of Science]

    Fagundes NJR, Kanitz R, Eckert R, et al, (13 co-authors). Mitochondrial population genomics supports a single pre-Clovis origin with a coastal route for the peopling of the Americas. Am J Hum Genet (2008) 82:583–592.[CrossRef][Web of Science][Medline]

    Fairbanks RG, Mortlock RA, Chiu TC, Cao L, Kaplan A, Guilderson TP, Fairbanks TW, Bloom AL, Grootes PM, Nadeau MJ. Radiocarbon calibration curve spanning 0 to 50,000 years BP based on paired Th-230/U-234/U-238 and C-14 dates on pristine corals. Quater Sci Rev (2005) 24:1781–1796.[CrossRef]

    Forster P, Harding R, Torroni A, Bandelt HJ. Origin and evolution of native American mtDNA variation: a reappraisal. Am J Hum Genet (1996) 59:935–945.[Web of Science][Medline]

    Forster P, Rohl A, Lunnemann P, Brinkmann C, Zerjal T, Tyler-Smith C, Brinkmann B. A short tandem repeat-based phylogeny for the human Y chromosome. Am J Hum Genet (2000) 67:182–196.[CrossRef][Web of Science][Medline]

    Friedlaender J, Schurr T, Gentz F, et al, (13 co-authors). Expanding Southwest Pacific mitochondrial haplogroups P and Q. Mol Biol Evol (2005) 22:1506–1517.[Abstract/Free Full Text]

    Gilbert MTP, Jenkins DL, Gotherstrom A, et al, (13 co-authors). DNA from pre-Clovis human coprolites in Oregon, North America. Science (2008) 320:786–789.[Abstract/Free Full Text]

    Groube L, Chappell J, Muke J, Price D. A 40,000-year-old human occupation site at Huon Peninsula, Papua New Guinea. Nature (1986) 324:453–455.[CrossRef]

    Haile-Selassie Y, Suwa G, White TD. Late Miocene teeth from Middle Awash, Ethiopia, and early hominid dental evolution. Science (2004) 303:1503–1505.[Abstract/Free Full Text]

    Hasegawa M, Horai S. Time of the deepest root for polymorphism in human mitochondrial DNA. J Mol Evol (1991) 32:37–42.[CrossRef][Web of Science][Medline]

    Hasegawa M, Kishino H, Yano TA. Dating of the human ape splitting by a molecular clock of mitochondrial-DNA. J Mol Evol (1985) 22:160–174.[CrossRef][Web of Science][Medline]

    Henshilwood CS, d'Errico F, Yates R, et al, (11 co-authors). Emergence of modern human behavior: Middle Stone Age engravings from South Africa. Science (2002) 295:1278–1280.[Abstract/Free Full Text]

    Heyer E, Puymirat J, Dieltjes P, Bakker E, de Knijff P. Estimating Y chromosome specific microsatellite mutation frequencies using deep rooting pedigrees. Hum Mol Genet (1997) 6:799–803.[Abstract/Free Full Text]

    Heyer E, Zietkiewicz E, Rochowski A, Yotova V, Puymirat J, Labuda D. Phylogenetic and familial estimates of mitochondrial substitution rates: study of control region mutations in deep-rooting pedigrees. Am J Hum Genet (2001) 69:1113–1126.[CrossRef][Web of Science][Medline]

    Hill C, Soares P, Mormina M, et al, (11 co-authors). A mitochondrial stratigraphy for island southeast Asia. Am J Hum Genet (2007) 80:29–43.[CrossRef][Web of Science][Medline]

    Ho SY, Phillips MJ, Cooper A, Drummond AJ. Time dependency of molecular rate estimates and systematic overestimation of recent divergence times. Mol Biol Evol (2005) 22:1561–1568.[Abstract/Free Full Text]

    Ho SYW, Endicott P. The crucial role of calibration in molecular date estimates for the peopling of the Americas. Am J Hum Genet (2008) 83:142–146.[CrossRef][Web of Science][Medline]

    Ho SYW, Larson G. Molecular clocks: when times are a-changin’. Trends Genet (2006) 22:79–83.[CrossRef][Web of Science][Medline]

    Ho SYW, Shapiro B, Phillips MJ, Cooper A, Drummond AJ. Evidence for time dependency of molecular rate estimates. Syst Biol (2007) 56:515–522.[Free Full Text]

    Horai S, Hayasaka K, Kondo R, Tsugane K, Takahata N. Recent African origin of modern humans revealed by complete sequences of hominoid mitochondrial DNAs. Proc Natl Acad Sci USA (1995) 92:532–536.[Abstract/Free Full Text]

    Howell N, Elson JL, Turnbull DM, Herrnstadt C. African haplogroup L mtDNA sequences show violations of clock-like evolution. Mol Biol Evol (2004) 21:1843–1854.[Abstract/Free Full Text]

    Howell N, Howell C, Elson JL. Time dependency of molecular rate estimates for mtDNA: this is not the time for wishful thinking. Heredity (2008) 101:107–108.[CrossRef][Web of Science][Medline]

    Howell N, Smejkal CB, Mackey DA, Chinnery PF, Turnbull DM, Herrnstadt C. The pedigree rate of sequence divergence in the human mitochondrial genome: there is a difference between phylogenetic and pedigree rates. Am J Hum Genet (2003) 72:659–670.[CrossRef][Web of Science][Medline]

    Ingman M, Kaessmann H, Pääbo S, Gyllensten U. Mitochondrial genome variation and the origin of modern humans. Nature (2000) 408:708–713.[CrossRef][Medline]

    James HVA, Petraglia MD. Modern human origins and the evolution of behavior in the Later Pleistocene record of South Asia. Curr Anthropol (2005) 46:S3–S27.[CrossRef]

    Jin L, Nei M. Limitations of the evolutionary parsimony method of phylogenetic analysis. Mol Biol Evol (1990) 7:82–102.[Abstract]

    Joyce DJ. Chronology and new research on the Schaefer mammoth (Mammuthus primigenius) site, Kenosha County, Wisconsin, USA. Quat Int (2006) 142:44–57.[CrossRef]

    Jukes TH, Cantor CR. Evolution of protein molecules. Mammal Prot Metab (1969) 3:21–132.

    Kayser M, Roewer L, Hedman M, et al, (14 co-authors). Characteristics and frequency of germline mutations at microsatellite loci from the human Y chromosome, as revealed by direct observation in father/son pairs. Am J Hum Genet (2000) 66:1580–1588.[CrossRef][Web of Science][Medline]

    Kemp BM, Malhi RS, McDonough J, et al, (14 co-authors). Genetic analysis of early Holocene skeletal remains from Alaska and its implications for the settlement of the Americas. Am J Phys Anthropol (2007) 132:605–621.[CrossRef][Web of Science][Medline]

    Kimura M. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol (1980a) 16:111–120.[CrossRef][Web of Science][Medline]

    Kimura M. Average time until fixation of a mutant allele in a finite population under continued mutation pressure—studies by analytical, numerical, and pseudo-sampling methods. Proc Natl Acad Sci USA Biol Sci (1980b) 77:522–526.[CrossRef]

    Kivisild T, Shen P, Wall DP, et al, (17 co-authors). The role of selection in the evolution of human mitochondrial genomes. Genetics (2006) 172:373–387.[Abstract/Free Full Text]

    Klein RG. Archeology and the evolution of human behavior. Evol Anthropol (2000) 9:17–36.[CrossRef][Web of Science]

    Lambert DM, Ritchie PA, Millar CD, Holland B, Drummond AJ, Baroni C. Rates of evolution in ancient DNA from Adelie penguins. Science (2002) 295:2270–2273.[Abstract/Free Full Text]

    Loogvali EL, Roostalu U, Malyarchuk BA, et al, (35 co-authors). Disuniting uniformity: a pied cladistic canvas of mtDNA haplogroup H in Eurasia. Mol Biol Evol (2004) 21:2012–2021.[Abstract/Free Full Text]

    Lum JK, Cann RL. mtDNA lineage analyses: origins and migrations of Micronesians and Polynesians. Am J Phys Anthropol (2000) 113:151–168.[CrossRef][Web of Science][Medline]

    Malhi RS, Eshleman JA, Greenberg JA, Weiss DA, Schultz Shook BA, Kaestle FA, Lorenz JG, Kemp BM, Johnson JR, Smith DG. The structure of diversity within New World mitochondrial DNA haplogroups: implications for the prehistory of North America. Am J Hum Genet (2002) 70:905–919.[CrossRef][Web of Science][Medline]

    Malhi RS, Schultz BA, Smith DG. Distribution of mitochondrial DNA lineages among Native American tribes of Northeastern North America. Hum Biol (2001) 73:17–55.[Web of Science][Medline]

    Mishmar D, Ruiz-Pesini E, Golik P, et al, (13 co-authors). Natural selection shaped regional mtDNA variation in humans. Proc Natl Acad Sci USA (2003) 100:171–176.[Abstract/Free Full Text]

    Mithen S. After the ice: a global human history, 20,000–5000 BC (2004) Cambridge (MA): Harvard University Press.

    Nei M, Maruyama T, Chakraborty R. Bottleneck effect and genetic-variability in populations. Evolution (1975) 29:1–10.[CrossRef][Web of Science]

    O'Connell JF, Allen J. Dating the colonization of Sahul (Pleistocene Australia-New Guinea): a review of recent research. J Archaeol Sci (2004) 31:835–853.[CrossRef][Web of Science]

    Parsons TJ, Muniec DS, Sullivan K, et al, (11 co-authors). A high observed substitution rate in the human mitochondrial DNA control region. Nat Genet (1997) 15:363–368.[CrossRef][Web of Science][Medline]

    Petraglia M, Korisettar R, Boivin N, et al, (14 co-authors). Middle paleolithic assemblages from the Indian subcontinent before and after the Toba super-eruption. Science (2007) 317:114–116.[Abstract/Free Full Text]

    Pierson MJ, Martinez-Arias R, Holland BR, Gemmell NJ, Hurles ME, Penny D. Deciphering past human population movements in Oceania: provably optimal trees of 127 mtDNA genomes. Mol Biol Evol (2006) 23:1966–1975.[Abstract/Free Full Text]

    Ramachandran S, Deshpande O, Roseman CC, Rosenberg NA, Feldman MW, Cavalli-Sforza LL. Support from the relationship of genetic and geographic distance in human populations for a serial founder effect originating in Africa. Proc Natl Acad Sci USA (2005) 102:15942–15947.[Abstract/Free Full Text]

    Rando JC, Cabrera VM, Larruga JM, Hernandez M, Gonzalez AM, Pinto F, Bandelt HJ. Phylogeographic patterns of mtDNA reflecting the colonization of the Canary Islands. Ann Hum Genet (1999) 63:413–428.[CrossRef][Web of Science][Medline]

    Richards M, Macaulay V, Hickey E, et al, (37 co-authors). Tracing European founder lineages in the near eastern mtDNA pool. Am J Hum Genet (2000) 67:1251–1276.[Web of Science][Medline]

    Roostalu U, Kutuev I, Loogvali EL, Metspalu E, Tambets K, Reidla M, Khusnutdinova EK, Usanga E, Kivisild T, Villems R. Origin and expansion of haplogroup H, the dominant human mitochondrial DNA lineage in West Eurasia: the near eastern and Caucasian perspective. Mol Biol Evol (2007) 24:436–448.[Abstract/Free Full Text]

    Rosenberg NA, Hirsh AE. On the use of star-shaped genealogies in inference of coalescence times. Genetics (2003) 164:1677–1682.[Abstract/Free Full Text]

    Saillard J, Forster P, Lynnerup N, Bandelt HJ, Norby S. mtDNA variation among Greenland Eskimos: the edge of the Beringian expansion. Am J Hum Genet (2000) 67:718–726.[CrossRef][Web of Science][Medline]

    Santos C, Montiel R, Arruda A, Alvarez L, Aluja MP, Lima M. Mutation patterns of mtDNA: empirical inferences for the coding region. BMC Evol Biol (2008) 8:167.[CrossRef][Medline]

    Santos C, Montiel R, Sierra B, Bettencourt C, Fernandez E, Alvarez L, Lima M, Abade A, Aluja MP. Understanding differences between phylogenetic and pedigree-derived mtDNA mutation rate: a model using families from the Azores Islands (Portugal). Mol Biol Evol (2005) 22:1490–1505.[Abstract/Free Full Text]

    Saunders IW, Tavare S, Watterson GA. On the genealogy of nested subsamples from a haploid population. Adv Appl Prob (1984) 16:471–491.[CrossRef]

    Sigurdardóttir S, Helgason A, Gulcher JR, Stefansson K, Donnelly P. The mutation rate in the human mtDNA control region. Am J Hum Genet (2000) 66:1599–1609.[CrossRef][Web of Science][Medline]

    Smith DG, Malhi RS, Eshleman J, Lorenz JG, Kaestle FA. Distribution of mtDNA haplogroup X among Native North Americans. Am J Phys Anthropol (1999) 110:271–284.[CrossRef][Web of Science][Medline]

    Stoneking M, Sherry ST, Redd AJ, Vigilant L. New approaches to dating suggest a recent age for the human MtDNA ancestor. Philos Trans R Soc Lond B Biol Sci (1992) 337:167–175.[Abstract/Free Full Text]

    Tamm E, Kivisild T, Reidla M, et al, (21 co-authors). Beringian standstill and spread of Native American founders. PLoS ONE (2007) 2:e829.[CrossRef]

    Tamura K, Nei M. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol Biol Evol (1993) 10:512–526.[Abstract]

    Tanaka M, Cabrera VM, Gonzalez AM, et al, (28 co-authors). Mitochondrial genome variation in Eastern Asia and the peopling of Japan. Genome Res (2004) 14:1832–1850.[Abstract/Free Full Text]

    Tang H, Siegmund DO, Shen P, Oefner PJ, Feldman MW. Frequentist estimation of coalescence times from nucleotide sequence data using a tree-based partition. Genetics (2002) 161:447–459.[Abstract/Free Full Text]

    Tommaseo-Ponzetta M, Attimonelli M, De Robertis M, Tanzariello F, Saccone C. Mitochondrial DNA variability of West New Guinea populations. Am J Phys Anthropol (2002) 117:49–67.[CrossRef][Web of Science][Medline]

    Trejaut JA, Kivisild T, Loo JH, Lee CL, He CL, Hsu CJ, Li ZY, Lin M. Traces of archaic mitochondrial lineages persist in Austronesian-speaking Formosan populations. PLoS Biol (2005) 3:1362–1372.[Web of Science]

    Tremblay M, Vezina H. New estimates of intergenerational time intervals for the calculation of age and origins of mutations. Am J Hum Genet (2000) 66:651–658.[CrossRef][Web of Science][Medline]

    Van Holst Pellekaan SM, Ingman M, Roberts-Thomson J, Harding RM. Mitochondrial genomics identifies major haplogroups in Aboriginal Australians. Am J Phys Anthropol (2006) 131:282–294.[CrossRef][Web of Science][Medline]

    Vigilant L, Stoneking M, Harpending H, Hawkes K, Wilson AC. African populations and the evolution of human mitochondrial DNA. Science (1991) 253:1503–1507.[Abstract/Free Full Text]

    Vignaud P, Duringer P, Mackaye HT, et al, (21 co-authors). Geology and palaeontology of the Upper Miocene Toros-Menalla hominid locality, Chad. Nature (2002) 418:152–155.[CrossRef]

    Ward RH, Frazier BL, Dewjager K, Paabo S. Extensive Mitochondrial diversity within a single Amerindian tribe. Proc Natl Acad Sci USA (1991) 88:8720–8724.[Abstract/Free Full Text]

    Waters MR, Stafford TW. Redefining the age of Clovis: implications for the peopling of the Americas. Science (2007) 315:1122–1126.[Abstract/Free Full Text]

    Watson E, Bauer K, Aman R, Weiss G, vonHaeseler A, Paabo S. mtDNA sequence diversity in Africa. Am J Hum Genet (1996) 59:437–444.[Web of Science][Medline]

    Woodhams M. Can deleterious mutations explain the time dependency of molecular rate estimates? Mol Biol Evol (2006) 23:2271–2273.[Abstract/Free Full Text]

    Zhivotovsky LA, Underhill PA, Cinnioglu C, et al, (17 co-authors). The effective mutation rate at Y chromosome short tandem repeats, with application to human population-divergence time. Am J Hum Genet (2004) 74:50–61.[CrossRef][Web of Science][Medline]

    Zhivotovsky LA, Underhill PA, Feldman MW. Difference between evolutionarily effective and germ line mutation rate due to stochastically varying haplogroup size. Mol Biol Evol (2006) 23:2268–2270.[Abstract/Free Full Text]

Accepted for publication October 21, 2008.


Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?



This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (PDF) Freely available
Right arrow Supplementary Data
Right arrow An erratum has been published
Right arrowOA All Versions of this Article:
26/1/217    most recent
msn244v1
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Google Scholar
Right arrow Articles by Henn, B. M.
Right arrow Articles by Mountain, J. L.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Henn, B. M.
Right arrow Articles by Mountain, J. L.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?