MBE Advance Access originally published online on June 6, 2006
Molecular Biology and Evolution 2006 23(9):1643-1647; doi:10.1093/molbev/msl031
Letter
Positive Selection Versus Demography: Evolutionary Inferences Based on an Unusual Haplotype Structure in Drosophila simulans




* Departamento de Bioquímica, Genética e Inmunología, Facultad de Ciencias, Universidad de Vigo, Vigo, Spain; and
Departament de Genètica, Facultat de Biologia, Universitat de Barcelona, Barcelona, Spain
E-mail: hquesada{at}uvigo.es.
| Abstract |
|---|
Coalescent simulations were used to investigate the possible role of population subdivision and history in shaping nucleotide variation in a recombining 88-kb genomic fragment of Drosophila simulans displaying an unusual large-scale haplotype structure. The multilocus analysis, based on summary statistics using specific demographic null models under recombination, indicates that the observed levels of linkage disequilibrium differed significantly from the values expected under different bottleneck and population admixture scenarios. These results indicate that demography alone may not account for the observed pattern of variation and support the previous claim that the data are better described by a model in which an adaptive mutation has not yet gone to fixation.
Key Words: Drosophila; selective sweep; coalescent simulations; bottleneck; population admixture
Uncertainty about the demographic history of populations can hinder genome-wide scans for selection (Harr et al. 2002; Glinka et al. 2003; Orengo and Aguadé 2004; Akey et al. 2004; Haddrill et al. 2005; Ometto et al. 2005; Schmid et al. 2005). Although demographic events apply to the whole genome and selective events are locus specific, the large variance of nucleotide diversity in stationary panmictic populations may generate large differences among loci just by chance. Indeed, different regions on a recombining chromosome have different genealogies whose sizes can differ considerably by genetic drift. These local fluctuations of variation may be amplified by the superimposed effect of demographic events and recombination, which, in the absence of selection, may mimic the pattern expected under natural selection. The theory of coalescence provides the framework to develop robust statistical tests and therefore to obtain the probability of empirical data under different evolutionary scenarios (Hudson 1990).
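The point about chance differences among genealogies can be illustrated with a minimal sketch of the standard neutral coalescent for a single nonrecombining region. It assumes a constant-size panmictic population and is not the simulation program used in this study; the function names and the sample size in the demo are hypothetical.

```python
import random
import statistics


def total_tree_length(n, rng):
    """Total branch length of one neutral coalescent genealogy for n sequences.

    Time is measured in units of 4N generations, so that while k lineages
    remain the waiting time to the next coalescence is exponential with
    rate k(k - 1).
    """
    length = 0.0
    for k in range(n, 1, -1):
        waiting_time = rng.expovariate(k * (k - 1))
        length += k * waiting_time  # k branches extend during this interval
    return length


def expected_total_length(n):
    """Expected total length in 4N units: the harmonic sum 1 + 1/2 + ... + 1/(n-1)."""
    return sum(1.0 / i for i in range(1, n))


if __name__ == "__main__":
    rng = random.Random(2006)  # arbitrary seed, for reproducibility only
    sample_size = 10           # hypothetical sample size
    lengths = [total_tree_length(sample_size, rng) for _ in range(5000)]
    print("expected mean:", round(expected_total_length(sample_size), 3))
    print("simulated mean:", round(statistics.mean(lengths), 3))
    print("smallest and largest genealogy:",
          round(min(lengths), 3), round(max(lengths), 3))
```

Because the expected number of segregating sites is proportional to the total length of the genealogy, the spread between the smallest and largest simulated trees gives a feel for how much variation can differ among unlinked regions by drift alone.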
Drosophila simulans has historically served as an important model system in molecular and evolutionary genetics, although its demographic history is still far from being well understood (Lachaise et al. 1988; Powell 1997). In a recent study, Quesada et al. (2003) detected an unusual haplotype structure in a recombining 88-kb genomic fragment in an African population of this species. They found a core region of up to 38 kb with a major haplotype at intermediate frequency. This unusual haplotype structure gradually vanished with distance from the core region until disappearing, thus supporting a recent (6500 years ago [yra]) and incomplete selective sweep (Rozas et al. 2001; Quesada et al. 2003). The observed pattern is incompatible with a panmictic population in mutation-drift equilibrium. Here we analyze these data (Quesada et al. 2003) under 2 different nonequilibrium demographic scenarios in an effort to ascertain whether nonselective processes can explain the presence of 2 subsets of sequences (Parsch et al. 2001; Rozas et al. 2001). The first model considers population subdivision and subsequent admixture, which appears to be the most relevant scenario for African populations (Hamblin and Veuille 1999). The second model envisions a population of constant size that experiences a recent bottleneck with a few lineages surviving.
Demography and selection can both affect patterns of linkage disequilibrium (LD) in the genome (Przeworski 2002; Haddrill et al. 2005). To investigate the effects caused by population admixture and bottlenecks on nucleotide variation, we determined whether the overall LD between polymorphic sites, as measured by the ZnS statistic (Kelly 1997), departs from that expected under each explicit demographic scenario. The multilocus summary statistic (Quesada et al. 2003), where Oi is the observed ZnS value in region i, Ei is the ZnS value expected under the corresponding null hypothesis, and n is the number of regions (n = 11 in our case; sequence data from Quesada et al. 2003), was used for hypothesis testing. Positive and negative values of this statistic indicate, respectively, an excess or a deficit of observed LD with respect to the value expected under the corresponding null demographic model. Expected ZnS values for each region and the empirical distribution of the multilocus statistic were obtained by neutral coalescent simulations (1,000 replicates) with recombination. The estimate of the population recombination parameter (CM = 0.0368) was obtained from the comparison of physical and genetic maps, assuming N = 2 × 10⁶ and c = 0.92 × 10⁻⁸ (Rozas et al. 2001; C = 2Nc in Drosophila given that males do not recombine). Because summary statistics may be sensitive to assumptions about recombination rates under certain demographic models (Thornton 2005), 2 additional C values were considered in the simulations. The first, CL, which constitutes the lower bound of CM, is based on the minimum number of recombination events (RM) in the sample (Hudson 1987). It is defined as the lowest value of C for which the right tail (5%) of the RM distribution (obtained by coalescent simulations) contains values equal to or higher than the observed value of RM (CL = 0.0284; Quesada et al. 2003). The second, CH, represents the highest estimate of C in the 3R chromosome of Drosophila melanogaster (CH = 0.0650; see Hey and Kliman 2002). Simulations were conditioned on the number of segregating sites (S). Random data sets of DNA fragments as long as the surveyed fragment (88 kb) were generated in the simulations. The simulation program is available from the authors upon request. The P-value of the test (2-tailed test) was obtained as the proportion of computer replicates with values of the multilocus statistic more extreme than the observed value.
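The quantities in this paragraph can be sketched in a few lines of Python. The sketch below is an illustration, not the authors' program: it computes ZnS as the mean squared allelic correlation (r²) over all pairs of segregating sites (Kelly 1997), checks the arithmetic C = 2Nc with N = 2 × 10⁶ and c = 0.92 × 10⁻⁸, and derives an empirical 2-tailed P-value from simulated replicates. All function names are hypothetical, and the reading of "more extreme" as "at least as far from the mean of the simulated values" is an assumption, because the text does not spell out the convention.

```python
from itertools import combinations


def zns(haplotypes):
    """Mean r^2 over all pairs of segregating sites (Kelly's ZnS).

    `haplotypes` is a list of equal-length strings over '0' and '1',
    one string per sampled chromosome (a hypothetical input format).
    """
    n = len(haplotypes)
    columns = []
    for j in range(len(haplotypes[0])):
        col = [int(h[j]) for h in haplotypes]
        if 0 < sum(col) < n:  # keep only sites polymorphic in the sample
            columns.append(col)
    if len(columns) < 2:
        raise ValueError("ZnS needs at least two segregating sites")
    total, pairs = 0.0, 0
    for a, b in combinations(columns, 2):
        pa, pb = sum(a) / n, sum(b) / n
        pab = sum(x & y for x, y in zip(a, b)) / n
        d = pab - pa * pb  # linkage disequilibrium coefficient D
        total += d * d / (pa * (1 - pa) * pb * (1 - pb))
        pairs += 1
    return total / pairs


def population_recombination(n_e, c):
    """C = 2Nc, the form used for Drosophila because males do not recombine."""
    return 2.0 * n_e * c


def two_tailed_p(observed, simulated):
    """Proportion of replicates at least as far from the simulated mean
    as the observed value (assumed definition of 'more extreme')."""
    mean = sum(simulated) / len(simulated)
    deviation = abs(observed - mean)
    extreme = sum(1 for x in simulated if abs(x - mean) >= deviation)
    return extreme / len(simulated)


if __name__ == "__main__":
    print(zns(["00", "00", "11", "11"]))        # two sites in complete LD
    print(population_recombination(2e6, 0.92e-8))
    print(two_tailed_p(2, [-2, -1, 0, 1, 2]))   # toy replicates
```

In practice the `simulated` list would hold the multilocus statistic computed on each coalescent replicate generated under the demographic null model, and `observed` the value computed on the real data.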
To test for admixture, we considered a simple model where an ancestral population split into 2 subpopulations at time t1, and admixture occurred at time t2 (fig. 1). Times t1 and t2 (measured in 4N generations) were obtained from extant nucleotide variation. Under an admixture model, most nucleotide differences between the 2 subsets of sequences would have accumulated after the population split. Thus, the split time (t1 = 1.04) was estimated from the average silent nucleotide divergence between the 2 subsets of sequences and the estimated silent mutation rate per base pair and year (1.4 × 10⁻⁸; Rozas et al. 2001). Although t1 might be overestimated by our method, computer simulations using t1 = 1.04 yielded an average silent nucleotide divergence between the 2 most divergent subsets of sequences close to that observed (results not shown).
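The conversion from divergence to a split time in coalescent units can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' calculation: it treats all divergence between the 2 subsets as having accumulated after the split at a rate of 2μ per year (which ignores ancestral polymorphism and is one reason t1 may be overestimated), and it needs a number of generations per year that the text does not report. The function name and the numerical inputs in the demo are hypothetical.

```python
def split_time_4n(divergence, mu_per_year, gens_per_year, n_e):
    """Split time in units of 4N generations.

    divergence    : average silent divergence per site between the 2 subsets
    mu_per_year   : silent mutation rate per base pair and year
    gens_per_year : generations per year (assumption, not given in the text)
    n_e           : effective population size N
    """
    years = divergence / (2.0 * mu_per_year)  # both lineages accumulate changes
    generations = years * gens_per_year
    return generations / (4.0 * n_e)


if __name__ == "__main__":
    # Hypothetical inputs: 2% silent divergence and 10 generations per year,
    # combined with the mutation rate and N quoted in the text.
    print(split_time_4n(0.02, 1.4e-8, 10, 2e6))
```

The result scales linearly with the assumed divergence and number of generations per year, so any uncertainty in those inputs carries over directly to t1.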
Similarly, recombination events between both subsets of haplotypes would have occurred after admixture. In this case, the time since admixture (t2 = 0.003) was estimated as the average coalescent time required to account for the number of recombination events observed between the 2 sequence subsets (Quesada et al. 2003
statistic to compare the observed and simulated number of haplotypes normalized by the number of segregating sites [Przeworski 2002
A recent population bottleneck, occurring at different times after the last glacial maximum (20,000 yra), cannot account for the observed data either. As the power of ZnS decreases rapidly with the age of the bottleneck (Depaulis et al. 2003
0.01) when assuming the upper bound of C (fig. 4). These nonsignificant values can be attributed to the fact that ZnS has a lower statistical power than under a more severe bottleneck scenario (Depaulis et al. 2003
statistic). Similar or more extreme simulation results were obtained when the assumption of equal sizes for the ancestral and derived populations was relaxed and ratios of derived to ancestral population size equal to 0.5 and 0.25 were considered (results not shown).
Because the power of coalescent-based tests is not independent of the mutation population parameter (θ) and, therefore, the statistical power might vary as a function of the true value of θ for a given sample size and S (Markovtsova et al. 2001
instead of on S did not have a substantial effect on the results. Indeed, similar or more extreme P-values were obtained when simulations were conditioned on θ. This observation is consistent with computer simulations indicating that the difference in the rejection probability due to using S instead of θ in coalescent-based tests is substantially reduced in regions with moderate or high rates of recombination, such as those studied here (SE Ramos-Onsins, unpublished data).
Tests using specific demographic null models under recombination are emerging as an alternative to null stationary panmictic models (Glinka et al. 2003; Orengo and Aguadé 2004; Bauer DuMont and Aquadro 2005; Haddrill et al. 2005; Ometto et al. 2005; Wright and Gaut 2005; Beisswanger et al. 2006; Pool et al. 2006). The models used in this study, as in other studies, are simple but are likely the most relevant for African D. simulans populations. The present analysis, which uses a relatively large range of parameter values, allows us to conclude that demography alone is unlikely to account for the observed haplotype structure. By contrast, previous experimental studies of this genomic region reveal a pattern that corresponds very well with that predicted under an incomplete selective sweep: the strong haplotype structure dissipates with distance from the core region, polymorphism decreases in the structured domain, and there is an increased frequency of derived variants (Parsch et al. 2001; Rozas et al. 2001; Quesada et al. 2003; Meiklejohn et al. 2004). The new evidence provided here indicates that a major role of demographic effects may be disregarded, thus supporting the previous claim that the data are better described by a model in which directional selection has acted recently on this region (Quesada et al. 2003; Meiklejohn et al. 2004).
| Acknowledgements |
|---|
This research has been partially performed using the Centre de Supercomputació de Catalunya facilities. This work was supported by grants BMC2001-2909 and BFU2004-02253 from Comisión Interdepartamental de Ciencia y Tecnología, Spain; 2001SGR-00101 from Comissió Interdepartamental de Recerca i Innovació Tecnològica, Catalonia, Spain; and by special support (Distinció per la Promoció de la Recerca Universitària) from Generalitat de Catalunya to M.A.
| Footnotes |
|---|
Diethard Tautz, Associate Editor
| References |
|---|
Akey JM, Eberle MA, Rieder MJ, Carlson CS, Shriver MD, Nickerson DA, Kruglyak L. 2004. Population history and natural selection shape patterns of genetic variation in 132 genes. PLoS Biol 2:1591-9.
Bauer DuMont V, Aquadro CF. 2005. Multiple signatures of positive selection downtream of Notch on the X chromosome in Drosophila melanogaster. Genetics 171:639-53.
Beisswanger S, Stephan W, De Lorenzo D. 2006. Evidence for a selective sweep in the wapl region of Drosophila melanogaster. Genetics 172:265-74.
Depaulis F, Mousset S, Veuille M. 2003. Power of neutrality tests to detect bottlenecks and hitchhiking. J Mol Evol 57:S190-200.
Fay JC, Wu CI. 1999. A human population bottleneck can account for the discordance between patterns of mitochondrial versus nuclear DNA variation. Mol Biol Evol 16:1003-5.
Glinka S, Ometto L, Mousset S, Stephan W, De Lorenzo D. 2003. Demography and natural selection have shaped genetic variation in Drosophila melanogaster: a multi-locus approach. Genetics 165:1269-78.
Haddrill PR, Thornton KR, Charlesworth B, Andolfatto P. 2005. Multilocus patterns of nucleotide variability and the demographic and selection history of Drosophila melanogaster populations. Genome Res 15:790-9.
Hamblin MT, Veuille M. 1999. Population structure among African and derived populations of Drosophila simulans: evidence for ancient subdivision and recent admixture. Genetics 153:305-17.
Harr B, Kauer M, Schlötterer C. 2002. Hitchhiking mapping: a population-based fine-mapping strategy for adaptive mutations in Drosophila melanogaster. Proc Natl Acad Sci USA 99:12949-54.
Hey J, Kliman RM. 2002. Interactions between natural selection, recombination and gene density in the genes of Drosophila. Genetics 160:595-608.
Hudson RR. 1987. Estimating recombination parameter of a finite population model without selection. Genet Res 50:245-50.
Hudson RR. 1990. Gene genealogies and the coalescent process. In: Harvey PL, Partridge L, editors. Oxford surveys in evolutionary biology. New York: Oxford University Press. p 1-44.
Kelly JK. 1997. A test of neutrality based on interlocus associations. Genetics 146:1197-206.
Lachaise D, Cariou ML, David JR, Lemeunier F, Tsacas L, Ashburner M. 1988. Historical biogeography of the Drosophila melanogaster species subgroup. Evol Biol 22:159-225.
Markovtsova L, Marjoram P, Tavaré S. 2001. On a test of Depaulis and Veuille. Mol Biol Evol 18:1132-3.
Meiklejohn CD, Kim Y, Hartl DL, Parsch J. 2004. Identification of a locus under complex positive selection in Drosophila simulans by haplotype mapping and composite-likelihood estimation. Genetics 168:265-79.
Ometto L, Glinka S, De Lorenzo D, Stephan W. 2005. Inferring the effects of demography and selection on Drosophila melanogaster populations from a chromosome-wide scan of DNA variation. Mol Biol Evol 22:2119-2130.
Orengo DJ, Aguadé M. 2004. Detecting the footprint of positive selection in a European population of Drosophila melanogaster: multilocus pattern of variation and distance to coding regions. Genetics 167:1759-66.
Parsch J, Meiklejohn CD, Hartl DL. 2001. Patterns of DNA sequence variation suggest the recent action of positive selection in the janus-ocnus region of Drosophila simulans. Genetics 159:647-57.
Pool JE, Bauer DuMont V, Mueller JL, Aquadro CF. 2006. A scan of molecular variation leads to the narrow localization of a selective sweep affecting both afrotropical and cosmopolitan populations of Drosophila melanogaster. Genetics 172:1093-105.
Powell JR. 1997. Progress and prospects in evolutionary biology: the Drosophila model. New York: Oxford University Press.
Przeworski M. 2002. The signature of positive selection at randomly chosen loci. Genetics 160:1179-89.
Quesada H, Ramirez UEM, Rozas J, Aguadé M. 2003. Large-scale adaptive hitchhiking upon high recombination in Drosophila simulans. Genetics 165:895-900.
Rozas J, Gullaud M, Blandin G, Aguadé M. 2001. DNA variation at the rp49 gene region of Drosophila simulans: evolutionary inferences from an unusual haplotype structure. Genetics 158:1147-55.
Schmid KJ, Ramos-Onsins S, Ringys-Beckstein H, Weisshaar B, Mitchell-Olds T. 2005. A multilocus sequence survey in Arabidopsis thaliana reveals a genome-wide departure from a neutral model of DNA sequence polymorphism. Genetics 169:1601-15.
Thornton K. 2005. Recombination and the properties of Tajima's D in the context of approximate likelihood calculation. Genetics 171:2143-2148.
Wright SI, Gaut BS. 2005. Molecular population genetics and the search for adaptive evolution in plants. Mol Biol Evol 22:506-19.