MBE Advance Access originally published online on September 6, 2006
Molecular Biology and Evolution 2006 23(12):2268-2270; doi:10.1093/molbev/msl105
Letters
Difference between Evolutionarily Effective and Germ line Mutation Rate Due to Stochastically Varying Haplogroup Size


* N.I.Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
Department of Genetics, Stanford University Medical School
Department of Biological Sciences, Stanford University
E-mail: levazh{at}gmail.com.
Abstract
Within a Y-chromosome haplogroup defined by unique event mutations, variation in microsatellites can accumulate due to their rapid mutation. Pedigree-based estimates of the Y-chromosome microsatellite mutation rate are 3 or more times greater than estimates derived from evolutionary considerations. We show by simulation that the haplogroups that survive the stochastic processes of drift and extinction accumulate microsatellite variation at a lower rate than predicted from corresponding pedigree estimates; in particular, under constant total population size, the accumulated variance is on average 3 to 4 times smaller.
Key Words: microsatellite, Y chromosome, mutation rate, haplogroup
Microsatellite variation within a haploid Y-chromosome lineage, defined by a set of unique substitutions at nucleotide sites or deletions/insertions, that is, a haplogroup by the terminology of de Knijff (2000), accumulates over generational time. It then becomes possible to infer the genetic history of populations by comparing microsatellite diversity within this haplogroup in geographically or ethnically different populations. The accumulated microsatellite variation depends directly on the mutation rate, as well as on the time since the haplogroup was initially founded by a single male. What Y-chromosomal microsatellite mutation rate should be used for evolutionary studies?
Mutation rates observed in pedigree and family studies (Heyer et al. 1997; Bianchi et al. 1998; Kayser et al. 2000; Weale et al. 2001) are, on average, 2.1 to 2.6 × 10⁻³ per generation. By studying short tandem repeat (STR, or microsatellite) variation within haplogroups C2 and H1 in populations with known short-term foundation events (Bulgarian Gypsy and New Zealand Maori) and comparing autosomal and Y-chromosomal microsatellite variation, Zhivotovsky et al. (2004) suggested that an effective mutation rate of 0.69 × 10⁻³ per 25 years could serve as the rate of evolution at Y-chromosomal STR loci (see further discussion in Di Giacomo et al. 2004; Zhivotovsky and Underhill 2005). A mechanism that might explain this 3- to 4-fold discrepancy is that a large part of the STR variation derived within a haplogroup is effectively removed by genetic drift caused by multiple bottlenecks during random fluctuations in haplogroup abundance. An analogous discrepancy has been observed for mtDNA (Heyer et al. 2001) and between short-term and long-term estimates of evolutionary divergence times (Ho et al. 2005).
To evaluate how the removal of STR variation from a haplogroup affects the rate of increase of microsatellite variance within the haplogroup, we constructed a simple model that combines haplogroup population dynamics with the evolution of its microsatellite variation. In the model, a haplogroup arises as a unique mutation in one haploid male individual in a population and is then subject to random fluctuation in size via a Poisson branching process in which each individual produces k male progeny (k = 0, 1, 2, 3, ...) with an average of m male progeny per father. We assume that m characterizes the demography of the entire population in which the haplogroup has arisen. In particular, if the population has constant size and the frequency of the haplogroup is not high, m is considered to be 1. If the frequency is high, the haplogroup size may approach the size of the population, Npop, which is of course the upper limit of a haplogroup's abundance. Formally, we consider 2 Poisson processes, one for the new haplogroup and the other for the rest of the population, with the same parameter m; if at a certain generation their sum exceeds Npop, we decrease the sizes proportionally so that the total is Npop. Generations do not overlap. Within the haplogroup, a microsatellite locus evolves under genetic drift caused by the finite haplogroup size and under a germ line stepwise mutation rate µ per generation. In general, the mutation scheme may include multistep changes, in which case µ is the effective mutation rate, namely, the product of the mutation rate and the mutational variance in the number of repeats (Zhivotovsky et al. 2001). In our simulations, we consider a mutation rate of µ = 0.001; all the results will be qualitatively the same for other values of µ. At the initial generation, microsatellite variation is zero because there is only one individual carrying that Y chromosome. Then, variation at this locus progressively accumulates due to mutation and can be measured as the usual variance in the repeat number. The expected value of the variance in generation t, Vt, satisfies the recursion
$$V_{t+1} = \left(1 - \frac{1}{N_t}\right) V_t + \mu,$$
where Nt is the effective size of the haplogroup at generation t. For humans, the effective size of a population is on average around a third of its actual size, the one-third rule of Cavalli-Sforza et al. (1994, p. 13). We implement it by setting the effective size to the haplogroup size times 1/3. For convenience, the effective haplogroup size is assumed to be one if its actual size is less than 3.
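The model described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' program: it assumes the recursion has the standard form V(t+1) = (1 - 1/Nt) Vt + µ, tracks the expected variance through that recursion instead of simulating individual repeat counts, and omits the Npop cap. The function names (`poisson`, `simulate_haplogroup`, `mean_surviving_variance`) and the normal approximation used for large Poisson means are assumptions introduced here.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson(lam) variate using only the standard library."""
    if lam <= 0:
        return 0
    if lam > 30.0:
        # Shortcut (assumption): normal approximation for large means.
        return max(0, round(rng.gauss(lam, math.sqrt(lam))))
    # Knuth's multiplication method for small means.
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_haplogroup(generations, m=1.0, mu=0.001, one_third_rule=False, rng=None):
    """Return the expected variance of one haplogroup, or None if it went extinct."""
    rng = rng or random.Random()
    size, variance = 1, 0.0  # one founder, no variation
    for _ in range(generations):
        # Effective size: actual size, or one third of it (at least 1).
        n_eff = max(1.0, size / 3.0) if one_third_rule else float(size)
        # Assumed recursion: drift removes a fraction 1/N_t, mutation adds mu.
        variance = (1.0 - 1.0 / n_eff) * variance + mu
        # The sum of `size` independent Poisson(m) families is Poisson(m * size).
        size = poisson(m * size, rng)
        if size == 0:
            return None
    return variance


def mean_surviving_variance(replicates, generations, seed=1, **kwargs):
    """Average the variance over surviving haplogroups; also return their count."""
    rng = random.Random(seed)
    runs = (simulate_haplogroup(generations, rng=rng, **kwargs) for _ in range(replicates))
    survivors = [v for v in runs if v is not None]
    mean = sum(survivors) / len(survivors) if survivors else float("nan")
    return mean, len(survivors)
```

Calling `mean_surviving_variance(50000, 4000)` mirrors the scale of the simulations reported in the text; dividing the returned mean by the number of generations gives a slope that can be compared with µ. Passing `one_third_rule=True` applies the effective-size correction.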
If the haplogroup size has been sufficiently large throughout the entire process, then the variance would increase linearly with time with slope µ, Vt = µ × t, that is, by 0.001 per generation in our simulations. In all, 50,000 haplogroups were generated and allowed to evolve for 4,000 generations, corresponding roughly to 100,000 years for humans. In simulations of a neutral process with average rate of increase m = 1, the number of surviving haplogroups rapidly decreased with time and corresponded well with the theory of mutant survival (Li 1955, p. 242), and the average size of the surviving haplogroups increased each generation by a value rapidly approaching 0.5 (data not shown), which agrees with the asymptotic fraction of 2/t of haplotypes that survive at generation t (Athreya and Ney 1972, p. 19). The accumulated variance increased almost linearly (fig. 1), at a rate of about 0.00028 per generation; that is, the actual rate of accumulation of microsatellite variation was about 3.6 times less than that predicted from the germ line mutation rate. This corresponds well to the 3- to 4-fold difference observed between the germ line and evolutionarily effective mutation rates.
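The 2/t survival fraction can be checked without simulation by iterating the probability generating function of the Poisson offspring distribution. The sketch below is an illustration under that standard branching-process result; the function name `survival_probability` is introduced here and is not from the paper.

```python
import math


def survival_probability(generations, m=1.0):
    """Probability that a lineage founded by one male is still present
    after `generations` generations of Poisson(m) reproduction."""
    s = 1.0  # the founder is present at generation 0
    for _ in range(generations):
        # Extinction by generation t+1 requires every son's line to die out:
        # q_{t+1} = exp(m * (q_t - 1)), hence s_{t+1} = 1 - exp(-m * s_t).
        s = -math.expm1(-m * s)
    return s


if __name__ == "__main__":
    t = 4000
    s = survival_probability(t)
    print(f"survival after {t} generations: {s:.6f}; 2/t = {2 / t:.6f}")
```

Multiplying the returned probability by the number of haplogroups started gives the expected number of survivors at that generation, which shows how few of the simulated lineages contribute to the averages at late times.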
The discrepancy is due to the loss of a large part of the accumulated within-haplogroup variation through strong genetic drift caused by random fluctuations of the haplogroup size, which may produce many bottlenecks (data not shown). Thus, the 3- to 4-fold difference simply reflects the recent bottlenecks in the demographic history of haplogroups and is a consequence of the random survival and death of haplogroups. We should emphasize that the number 3.6 is the average across all surviving haplogroups; each individual haplogroup has its own unique demographic history, not discernible from current information, and its own factor scattered around that average. There will always be uncertainty about the exact factor by which an evolutionarily effective mutation rate differs from the germ line mutation rate for a particular haplogroup in a particular population, because that factor depends on a particular demographic history and thus introduces a "demographic error" that is largely unknown.
Other demographic parameters could also affect the difference between the germ line and evolutionarily effective mutation rates. Indeed, if the effective population size is less than the actual size, the difference becomes even larger. For instance, under the "one-third rule," the rate of increase in microsatellite variation is about 0.00012 per generation, more than 8 times less than expected (fig. 2, diamonds). If population growth is restricted, the accumulated microsatellite variance saturates (fig. 2, open circles). In a growing population, the rate of increase in microsatellite variation becomes greater because haplogroup size is rapidly increasing; in the simulation (fig. 2, squares), the accumulated variance is smaller than predicted by factors of 2.9, 2.4, 1.8, and 1.3 at 80, 200, 400, and 1,000 generations, respectively; but the population size exceeds 1 million by generation 1,000, which is not realistic for many local tribes. Only if a carrier of a rare Y-chromosome haplogroup or its founder had many sons, the sons had many sons, and so on, which formally means a sudden "infinite" jump in abundance of the haplogroup, would the variance accumulate in this haplogroup at the germ line mutation rate (fig. 1, asterisks). Otherwise, the variance accumulates much more slowly. Under a more realistic scenario for the total world human population growth, with the one-third rule, the accumulated variance becomes smaller (fig. 2, triangles), the actual haplogroup size remains large, and the assumption of limited population size would lead to a greater decrease in the rate of accumulation of microsatellite variation.
Thus, genetic variation accumulating in a haplogroup is strongly influenced by stochastic fluctuation in its size, especially multiple bottlenecks. This greatly reduces the rate of increase of microsatellite variation and explains why the germ line mutation rate observed in pedigree and family studies greatly exceeds that computed from lineages in a population. Germ line mutation rates can be useful for forensic and disease studies but bias evolutionary age estimates for haplogroups toward much younger ages. Therefore, the evolutionarily effective mutation rate is more appropriate for evolutionary studies. Although our analysis concerns the rate of accumulation of microsatellite variance, the latter is a consequence of the accumulation of new mutations, and thus these points of view might be applied to any kind of within-haplogroup genetic variation, for example, to mitochondrial DNA and autosomal and X-chromosomal haplotypic blocks, because bottlenecks and purifying selection can purge mutations, which translates to a small effective haplogroup size (see Slatkin and Rannala 2000).
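The direction of the age bias can be illustrated with a short calculation. The sketch below assumes the linear relation between accumulated variance and time (age in generations = variance / rate) and a 25-year generation; the observed variance of 0.2 is a made-up value for illustration, and the function name `haplogroup_age_years` is introduced here. The two rates are the effective rate and the lower end of the pedigree range quoted in the text.

```python
def haplogroup_age_years(variance, rate_per_generation, years_per_generation=25.0):
    """Age estimate from accumulated repeat-number variance, assuming
    variance = rate * (number of generations)."""
    if rate_per_generation <= 0:
        raise ValueError("rate_per_generation must be positive")
    return variance / rate_per_generation * years_per_generation


EFFECTIVE_RATE = 0.69e-3  # evolutionarily effective rate per 25 years
PEDIGREE_RATE = 2.1e-3    # lower end of the pedigree-based range per generation

observed_variance = 0.2   # hypothetical value, not from the paper

age_effective = haplogroup_age_years(observed_variance, EFFECTIVE_RATE)
age_pedigree = haplogroup_age_years(observed_variance, PEDIGREE_RATE)

if __name__ == "__main__":
    print(f"effective rate: {age_effective:.0f} years")
    print(f"pedigree rate:  {age_pedigree:.0f} years")
    print(f"ratio: {age_effective / age_pedigree:.2f}")
```

Whatever the observed variance, the ratio of the two estimates equals the ratio of the two rates, so the pedigree rate always yields the younger age.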
Acknowledgements
Research supported in part by National Institutes of Health grant GM 28016, the RAS Program "Human molecular polymorphism," and the Russian Foundation for Basic Research (grant 04-04-48639). We are indebted to Michael N. Humphrey for valuable discussions and to Dr. Marcy Uyenoyama and 2 anonymous reviewers for helpful comments.
Footnotes
Marcy Uyenoyama, Associate Editor
References
Athreya KB and Ney PE. (1972) Branching processes. (Springer-Verlag, Berlin (Germany)).
Bianchi NO, Catanesi CI, Bailliet G, Martinez-Marignac VL, Bravi CM, Vidal-Rioja LB, Herrera RJ, López-Camelo JS. (1998) Characterization of ancestral and derived Y-chromosome haplotypes of New World native populations. Am J Hum Genet 63:1862-1871.
Cavalli-Sforza LL, Menozzi P, Piazza A. (1994) The history and geography of human genes. (Princeton University Press, Princeton (NJ)).
de Knijff P. (2000) Messages through bottlenecks: on the combined use of slow and fast evolving polymorphic markers on the human Y chromosome. Am J Hum Genet 67:1055-1061.
Di Giacomo F, Luca F, Popa LO, et al. (2004) Y chromosomal haplogroup J as a signature of the post-neolithic colonization of Europe. Hum Genet 115:357-371 (27 co-authors).
Heyer E, Puymirat J, Dietjes P, Bakker E, de Knijff P. (1997) Estimating Y chromosome specific microsatellite mutation frequencies using deep rooting pedigrees. Hum Mol Genet 6:799-803.
Heyer E, Zietkiewicz E, Rochowski A, Yotova V, Puymirat J, Labuda D. (2001) Phylogenetic and familial estimates of mitochondrial substitution rates: study of control region mutations in deep-rooting pedigrees. Am J Hum Genet 69:1113-1126.
Ho SYW, Phillips MJ, Cooper A, Drummond AJ. (2005) Time dependency of molecular rate estimates and systematic overestimation of recent divergence times. Mol Biol Evol 22:1561-1568.
Kayser M, Roewer L, Hedman M, et al. (2000) Characteristics and frequency of germline mutations at microsatellite loci from the human Y chromosome, as revealed by direct observation in father/son pairs. Am J Hum Genet 66:1580-1588 (13 co-authors).
Li CC. (1955) Population genetics. (The University of Chicago Press, New York).
Slatkin M and Rannala B. (2000) Estimating allele age. Annu Rev Genomics Hum Genet 1:225-249.
Weale ME, Yepiskoposian L, Jager RF, Hovhannisyan N, Khudoyan A, Burbage-Hall O, Bradman N, Thomas MG. (2001) Armenian Y chromosome haplotypes reveal strong regional structure within a single ethno-national group. Hum Genet 109:659-674.
Zhivotovsky LA, Goldstein DB, Feldman MW. (2001) Genetic sampling error of distance (δµ)² and variation in mutation rate among microsatellite loci. Mol Biol Evol 18:2141-2145.
Zhivotovsky LA and Underhill PA. (2005) On the evolutionary mutation rate at Y-chromosome STRs: comments on paper by Di Giacomo et al. 2004. Hum Genet 116:529-532.
Zhivotovsky LA, Underhill PA, Cinnioğlu C, et al. (2004) The effective mutation rate at Y chromosome short tandem repeats, with application to human population-divergence time. Am J Hum Genet 74:50-61 (17 co-authors).
Fig. 2 legend.
Diamonds: variance within haplogroups in a population with effective size under the "one-third rule" [m = 1, Npop is large];
Open circles: variance within haplogroups in a population with maximal population size of 250 [m = 1, Npop = 250];
Squares: variance within haplogroups in a large population with constant growth rate [m = 1.01];
Triangles: variance within haplogroups in a large population with changing growth rates and under the "one-third rule" for the effective size [m = 1.002 before 400 ga (generations ago), m = 1.012 from 400 to 14 ga, m = 1.12 from 14 to 8 ga, and m = 1.25 from 8 ga to the current time].

