

MBE Advance Access originally published online on June 6, 2006
Molecular Biology and Evolution 2006 23(9):1643-1647; doi:10.1093/molbev/msl031

© The Author 2006. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org

Letter

Positive Selection Versus Demography: Evolutionary Inferences Based on an Unusual Haplotype Structure in Drosophila simulans

Humberto Quesada*,†, Sebastián E. Ramos-Onsins†, Julio Rozas† and Montserrat Aguadé†

* Departamento de Bioquímica, Genética e Inmunología, Facultad de Ciencias, Universidad de Vigo, Vigo, Spain; and † Departament de Genètica, Facultat de Biologia, Universitat de Barcelona, Barcelona, Spain

E-mail: hquesada@uvigo.es.


    Abstract
Coalescent simulations were used to investigate the possible role of population subdivision and history in shaping nucleotide variation in a recombining 88-kb genomic fragment of Drosophila simulans displaying an unusual large-scale haplotype structure. The multilocus analysis, based on summary statistics using specific demographic null models under recombination, indicates that the observed levels of linkage disequilibrium differed significantly from the values expected under different bottleneck and population admixture scenarios. These results indicate that demography alone may not account for the observed pattern of variation and support the previous claim that the data are better described by a model in which an adaptive mutation has not yet gone to fixation.

Key Words: Drosophila • selective sweep • coalescent simulations • bottleneck • population admixture

Uncertainty about the demographic history of populations can hinder genome-wide scans for selection (Harr et al. 2002; Glinka et al. 2003; Orengo and Aguadé 2004; Akey et al. 2004; Haddrill et al. 2005; Ometto et al. 2005; Schmid et al. 2005). Although demographic events apply to the whole genome and selective events are locus specific, the large variance of nucleotide diversity in stationary panmictic populations may generate large differences among loci just by chance. Indeed, different regions on a recombining chromosome have different genealogies whose sizes can differ considerably by genetic drift. These local fluctuations of variation may be amplified by the superimposed effect of demographic events and recombination, which, in the absence of selection, may mimic the pattern expected under natural selection. The theory of coalescence provides the framework to develop robust statistical tests and therefore to obtain the probability of empirical data under different evolutionary scenarios (Hudson 1990).

Drosophila simulans has historically served as an important model system in molecular and evolutionary genetics, although its demographic history is still far from being well understood (Lachaise et al. 1988; Powell 1997). In a recent study, Quesada et al. (2003) detected an unusual haplotype structure in a recombining 88-kb genomic fragment in an African population of this species. They found a core region of up to 38 kb with a major haplotype at intermediate frequency. This unusual haplotype structure gradually vanished with distance from the core region until disappearing, thus supporting a recent (~6500 years ago [yra]) and incomplete selective sweep (Rozas et al. 2001; Quesada et al. 2003). The observed pattern is incompatible with a panmictic population in mutation-drift equilibrium. Here we analyze these data (Quesada et al. 2003) under 2 different nonequilibrium demographic scenarios in an effort to ascertain whether nonselective processes can explain the presence of 2 subsets of sequences (Parsch et al. 2001; Rozas et al. 2001). The first model considers population subdivision and subsequent admixture, which appears to be the most relevant scenario to African populations (Hamblin and Veuille 1999). The second model envisions a population of constant size that experiences a recent bottleneck with a few lineages surviving.

Demography and selection can both affect patterns of linkage disequilibrium (LD) in the genome (Przeworski 2002; Haddrill et al. 2005). To investigate the effects caused by population admixture and bottlenecks on nucleotide variation, we determined whether the overall LD between polymorphic sites, as measured by the ZnS statistic (Kelly 1997), departs from that expected under each explicit demographic scenario. The Ψ multilocus summary statistic (defined in Quesada et al. 2003), computed from Oi, the observed ZnS value in region i, Ei, the ZnS value expected under the corresponding null hypothesis, and n, the number of regions (n = 11 in our case; sequence data from Quesada et al. 2003), was used for hypothesis testing. A positive and a negative Ψ value indicate, respectively, an excess or a deficit of observed LD with respect to the value expected under the corresponding null demographic model. Expected ZnS values for each region and the empirical distribution of Ψ were obtained by neutral coalescent simulations (1,000 replicates) with recombination. The estimate of the population recombination parameter (CM = 0.0368) was obtained from the comparison of physical and genetic maps and assuming N = 2 × 10⁶ and c = 0.92 × 10⁻⁸ (Rozas et al. 2001; C = 2Nc in Drosophila given that males do not recombine). Because summary statistics may be sensitive to assumptions about recombination rates under certain demographic models (Thornton 2005), 2 additional C values were considered in the simulations. The first, CL, which constitutes the lower bound of CM, is based on the minimum number of recombination events (RM) in the sample (Hudson 1987). It is defined as the lowest value of C for which the right tail (5%) of the RM distribution (obtained by coalescent simulations) contains values equal to or higher than the observed value of RM (CL = 0.0284; Quesada et al. 2003).
The second, CH, represents the highest estimate of C in the 3R chromosome of Drosophila melanogaster (CH = 0.0650; see Hey and Kliman 2002). Simulations were conditioned on the number of segregating sites (S). Random data sets of DNA fragments as long as the surveyed fragment (88 kb) were generated in the simulations. The simulation program is available from the authors upon request. The P-value of the test (2-tailed test) was obtained as the proportion of computer replicates with Ψ values more extreme than the observed value.
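The quantities in this paragraph can be illustrated with a minimal Python sketch. It is not the authors' simulation program. The ZnS function follows the usual description of Kelly's statistic as the mean squared correlation (r²) over all pairs of segregating sites. The form of Ψ used here, the mean of Oi − Ei over regions, is an assumption chosen only because it has the sign behaviour described above; the exact definition is given in Quesada et al. (2003). The 2-tailed rule (twice the smaller tail proportion) is likewise one possible reading of "more extreme", and all function names are hypothetical.

```python
from itertools import combinations


def recombination_parameter(n_eff, c_per_bp):
    """C = 2Nc; the factor is 2 because Drosophila males do not recombine."""
    return 2 * n_eff * c_per_bp


def zns(haplotypes):
    """Mean r^2 over all pairs of segregating sites (Kelly's ZnS).

    haplotypes: list of equal-length strings of '0' and '1'.
    """
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    columns = [[int(h[j]) for h in haplotypes] for j in range(n_sites)]
    # Keep only sites that are polymorphic in the sample.
    seg = [col for col in columns if 0 < sum(col) < n]
    if len(seg) < 2:
        return 0.0
    total = 0.0
    n_pairs = 0
    for a, b in combinations(seg, 2):
        pa = sum(a) / n
        pb = sum(b) / n
        pab = sum(x & y for x, y in zip(a, b)) / n
        d = pab - pa * pb
        total += d * d / (pa * (1 - pa) * pb * (1 - pb))
        n_pairs += 1
    return total / n_pairs


def psi(observed, expected):
    """ASSUMED form: mean of (O_i - E_i) over regions (see lead-in)."""
    return sum(o - e for o, e in zip(observed, expected)) / len(observed)


def two_tailed_p(psi_obs, psi_sims):
    """ASSUMED rule: twice the smaller tail proportion, capped at 1."""
    lower = sum(s <= psi_obs for s in psi_sims) / len(psi_sims)
    upper = sum(s >= psi_obs for s in psi_sims) / len(psi_sims)
    return min(1.0, 2 * min(lower, upper))


if __name__ == "__main__":
    # Values quoted in the text: N = 2 x 10^6, c = 0.92 x 10^-8.
    print(recombination_parameter(2e6, 0.92e-8))
```

With the values quoted in the text, recombination_parameter reproduces CM = 0.0368 up to floating-point rounding. In an actual analysis, psi_sims would hold one Ψ value per coalescent replicate, each computed against the per-region simulation means.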

To test for admixture, we considered a simple model where an ancestral population split into 2 subpopulations at time t1, and admixture occurred at time t2 (fig. 1). Times t1 and t2 (measured in 4N generations) were obtained from extant nucleotide variation. Under an admixture model, most nucleotide differences between the 2 subsets of sequences would have accumulated after the population split. Thus, the split time (t1 = 1.04) was estimated from the average silent nucleotide divergence between the 2 subsets of sequences and the estimated silent mutation rate per base pair and year (1.4 × 10⁻⁸; Rozas et al. 2001). Although t1 might be overestimated by our method, computer simulations using t1 = 1.04 yielded an average silent nucleotide divergence between the 2 most divergent subsets of sequences close to that observed (results not shown).


FIG. 1.— Demographic models. Left panel: Population subdivision. Right panel: Population bottleneck. See text for symbols and details.

 
Similarly, recombination events between both subsets of haplotypes would have occurred after admixture. In this case, the time since admixture (t2 = 0.003) was estimated as the average coalescent time required to account for the number of recombination events observed between the 2 sequence subsets (Quesada et al. 2003). An elevated level of LD is a typical signature of recent population admixture. However, as can be seen in figure 2, observed levels of LD were consistently and significantly much lower than expected under admixture. This conclusion is robust even to an error of at least one order of magnitude in the estimated time since admixture and also after considering very unequal effective sizes of both subpopulations (up to 1:99). Incorporating some migration between subpopulations also resulted in highly significant departures from the admixture hypothesis, even though the homogenizing effect of migration leads to a less extreme subdivision and, thus, to less power to reject the null admixture model (figs. 2 and 3). Similarly, the correlation in the genealogies of nearby segregating sites (and LD) decreases as C, or the time since admixture, increases, and thus, there will be less useful information for making demographic inferences. However, varying the recombination rate did not have any substantial effect on the results. Only when considering a very ancient admixture event (t2 two orders of magnitude higher than estimated) and an unrealistically high recombination rate for this region was the admixture model not rejected (fig. 2). Moreover, using t1 values lower than 1.04 led to a decrease in LD and therefore made the rejection of the admixture model less likely. However, in neither of these more unlikely scenarios did simulated data show any remarkable haplotype structure (P < 0.05, using the Ψ statistic to compare the observed and simulated number of haplotypes normalized by the number of segregating sites [Przeworski 2002]).


FIG. 2.— Admixture test results (I). The multilocus Ψ statistic and its empirical distribution were obtained as indicated in the text, assuming different recombination rates (CL, CM, CH) and population admixture scenarios, with or without migration allowed. Equal (1:1) and very unequal (1:99) subpopulation sizes were considered. The P-values (2-tailed tests) were obtained as the proportion of computer replicates with Ψ values more extreme than observed. Admixture time is in 4N generations. Open symbols: P < 0.05. Closed symbols: P > 0.05.

 

FIG. 3.— Admixture test results (II). Observed and expected ZnS values for each region surveyed. Observed and expected ZnS values are depicted as closed and open circles, respectively. Expected ZnS values for each region were obtained by neutral coalescent simulations (1,000 replicates) with recombination (CM) under a simple admixture model with some (lower panel) or no migration (upper panel) allowed and considering unequal subpopulation sizes. t2: admixture time in 4N generations.

 
A recent population bottleneck, occurring at different times after the last glacial maximum (~20,000 yra), cannot account for the observed data either. As the power of ZnS decreases rapidly with the age of the bottleneck (Depaulis et al. 2003), we restricted our simulations to the time range with the highest power (0.00125–0.025 in 4N generations; fig. 3 in Depaulis et al. 2003). We simulated a population of initial effective size Ni, crashing to size Nb at time tb, and growing to the current effective population size N0 at time t0 (fig. 1). The severity of the bottleneck Sb is determined by the reduction in population size and its duration (Fay and Wu 1999). Furthermore, the distribution of summary statistics is not affected by the specific values of these 2 variables for bottlenecks of the same severity (Fay and Wu 1999; Orengo and Aguadé 2004). Severities ranging from weak (Sb = 0.005) to intermediate (Sb = 0.1) were considered to allow some lineages to survive the bottleneck, with Nb/Ni values varying from 1 × 10⁻¹ to 5 × 10⁻³ and times measured in 4N generations. Simulation results reveal that the observed level of LD is in all cases much lower than that expected under a bottleneck scenario (fig. 4), a pattern similar to that observed under the admixture model. LD values compatible with a bottleneck model were only observed after relatively weak bottlenecks (Sb ≤ 0.01) when assuming the upper bound of C (fig. 4). These nonsignificant values can be attributed to the fact that ZnS has a lower statistical power than under a more severe bottleneck scenario (Depaulis et al. 2003). Indeed, none of these simulated samples had a major haplotype class compatible with the observed haplotype structure (P < 0.05, comparing the observed and simulated number of haplotypes, normalized by the number of segregating sites [Przeworski 2002], using the Ψ statistic).
Similar or more extreme simulation results were obtained when the assumption of equal sizes for the ancestral and derived populations was relaxed and ratios of derived to ancestral population size equal to 0.5 and 0.25 were considered (results not shown).
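The severity values quoted above can be checked with a short sketch. The text does not spell out the formula for Sb; the relationship used below, bottleneck duration (in 4N generations) divided by the relative size Nb/Ni, is an assumption inferred from the fact that it reproduces the quoted range when combined with the fixed duration of 0.0005 given in the caption of figure 4. The function and constant names are hypothetical.

```python
def bottleneck_severity(duration_4n, size_ratio):
    """ASSUMED relation: Sb = duration (in 4N generations) / (Nb / Ni)."""
    if size_ratio <= 0:
        raise ValueError("size_ratio (Nb/Ni) must be positive")
    return duration_4n / size_ratio


# Duration fixed in the figure 4 caption; Nb/Ni bounds quoted in the text.
DURATION_4N = 0.0005
SIZE_RATIOS = (1e-1, 5e-3)

severities = [bottleneck_severity(DURATION_4N, r) for r in SIZE_RATIOS]

if __name__ == "__main__":
    for ratio, sb in zip(SIZE_RATIOS, severities):
        print(f"Nb/Ni = {ratio:g} -> Sb = {sb:g}")
```

Under this assumption, Nb/Ni = 0.1 corresponds to the weak end of the range (Sb = 0.005) and Nb/Ni = 0.005 to the intermediate end (Sb = 0.1), which also illustrates why a shorter but deeper bottleneck can have the same severity as a longer, shallower one.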


FIG. 4.— Bottleneck test results. The multilocus Ψ statistic and its empirical distribution were obtained as indicated in the text, assuming different recombination rates (CL, CM, CH) and bottleneck severities (Sb). The duration of the bottleneck in 4N generations is fixed to 0.0005. The conservative criterion of equal sizes for the ancestral and derived populations is used here. The P-values were obtained as in the admixture model. Times are calculated assuming N = 2 × 10⁶ and 10 generations per year. Open symbols: P < 0.05. Closed symbols: P > 0.05.
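The time scaling stated in the caption can be sketched as a simple unit conversion from coalescent time (units of 4N generations) to years. This is only an illustration of the arithmetic; the function and constant names are hypothetical, and the example inputs are the bounds of the simulated bottleneck age range (0.00125–0.025) quoted in the text.

```python
N_EFF = 2e6                # effective population size assumed in the caption
GENERATIONS_PER_YEAR = 10  # generations per year assumed in the caption


def coalescent_to_years(t_4n, n_eff=N_EFF, gens_per_year=GENERATIONS_PER_YEAR):
    """Convert a time in units of 4N generations to years."""
    generations = t_4n * 4 * n_eff
    return generations / gens_per_year


if __name__ == "__main__":
    for t in (0.00125, 0.025):
        print(f"t = {t} (4N generations) -> {coalescent_to_years(t):.0f} years")
```

With these assumptions the simulated age range corresponds to roughly 1,000 to 20,000 years, that is, to bottlenecks occurring after the last glacial maximum.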

 
Because the power of coalescent-based tests is not independent of the population mutation parameter (θ) and, therefore, the statistical power might vary as a function of the true value of θ for a given sample size and S (Markovtsova et al. 2001), we performed a prospective analysis to test this effect on our data. However, conditioning coalescent simulations on the estimated value of θ instead of on S did not have a substantial effect on the results. Indeed, similar or more extreme P-values were obtained when simulations were conditioned on θ. This observation is consistent with computer simulations indicating that the difference in the rejection probability due to using S instead of θ in coalescent-based tests is substantially reduced in regions with moderate or high rates of recombination, such as those studied here (SE Ramos-Onsins, unpublished data).

Tests using specific demographic null models under recombination are emerging as an alternative to null stationary panmictic models (Glinka et al. 2003; Orengo and Aguadé 2004; Bauer DuMont and Aquadro 2005; Haddrill et al. 2005; Ometto et al. 2005; Wright and Gaut 2005; Beisswanger et al. 2006; Pool et al. 2006). The models used in this study, as in other studies, are simple but are likely the most relevant for African D. simulans populations. The present analysis using a relatively large range of parameter values allows us to conclude that demography alone is unlikely to account for the observed haplotype structure. By contrast, previous experimental studies for this genomic region reveal a pattern that corresponds very well with the outline predicted under an incomplete selective sweep: the strong haplotype structure dissipates with distance to the core region, polymorphism decreases in the structured domain, and there is an increased frequency of derived variants (Parsch et al. 2001; Rozas et al. 2001; Quesada et al. 2003; Meiklejohn et al. 2004). The new evidence provided here indicates that a major role of demographic effects may be disregarded, thus supporting the previous claim that the data are better described by a model in which directional selection has acted recently on this region (Quesada et al. 2003; Meiklejohn et al. 2004).


    Acknowledgements
This research has been partially performed using the Centre de Supercomputació de Catalunya facilities. This work was supported by grants BMC2001-2909 and BFU2004-02253 from Comisión Interdepartamental de Ciencia y Tecnología, Spain; 2001SGR-00101 from Comissió Interdepartamental de Recerca i Innovació Tecnològica, Catalonia, Spain; and by special support (Distinció per la Promoció de la Recerca Universitària) from Generalitat de Catalunya to M.A.


    Footnotes
 
Diethard Tautz, Associate Editor


    References

    Akey JM, Eberle MA, Rieder MJ, Carlson CS, Shriver MD, Nickerson DA, Kruglyak L. 2004. Population history and natural selection shape patterns of genetic variation in 132 genes. PLoS Biol 2:1591–9.

    Bauer DuMont V, Aquadro CF. 2005. Multiple signatures of positive selection downtream of Notch on the X chromosome in Drosophila melanogaster. Genetics 171:639–53.

    Beisswanger S, Stephan W, De Lorenzo D. 2006. Evidence for a selective sweep in the wapl region of Drosophila melanogaster. Genetics 172:265–74.

    Depaulis F, Mousset S, Veuille M. 2003. Power of neutrality tests to detect bottlenecks and hitchhiking. J Mol Evol 57:S190–200.

    Fay JC, Wu CI. 1999. A human population bottleneck can account for the discordance between patterns of mitochondrial versus nuclear DNA variation. Mol Biol Evol 16:1003–5.

    Glinka S, Ometto L, Mousset S, Stephan W, De Lorenzo D. 2003. Demography and natural selection have shaped genetic variation in Drosophila melanogaster: a multi-locus approach. Genetics 165:1269–78.

    Haddrill PR, Thornton KR, Charlesworth B, Andolfatto P. 2005. Multilocus patterns of nucleotide variability and the demographic and selection history of Drosophila melanogaster populations. Genome Res 15:790–9.

    Hamblin MT, Veuille M. 1999. Population structure among African and derived populations of Drosophila simulans: evidence for ancient subdivision and recent admixture. Genetics 153:305–17.

    Harr B, Kauer M, Schlötterer C. 2002. Hitchhiking mapping: a population-based fine-mapping strategy for adaptive mutations in Drosophila melanogaster. Proc Natl Acad Sci USA 99:12949–54.

    Hey J, Kliman RM. 2002. Interactions between natural selection, recombination and gene density in the genes of Drosophila. Genetics 160:595–608.

    Hudson RR. 1987. Estimating recombination parameter of a finite population model without selection. Genet Res 50:245–50.

    Hudson RR. 1990. Gene genealogies and the coalescent process. In: Harvey PL, Partridge L, editors. Oxford surveys in evolutionary biology. New York: Oxford University Press. p 1–44.

    Kelly JK. 1997. A test of neutrality based on interlocus associations. Genetics 146:1197–206.

    Lachaise D, Cariou ML, David JR, Lemeunier F, Tsacas L, Ashburner M. 1988. Historical biogeography of the Drosophila melanogaster species subgroup. Evol Biol 22:159–225.

    Markovtsova L, Marjoram P, Tavaré S. 2001. On a test of Depaulis and Veuille. Mol Biol Evol 18:1132–3.

    Meiklejohn CD, Kim Y, Hartl DL, Parsch J. 2004. Identification of a locus under complex positive selection in Drosophila simulans by haplotype mapping and composite-likelihood estimation. Genetics 168:265–79.

    Ometto L, Glinka S, De Lorenzo D, Stephan W. 2005. Inferring the effects of demography and selection on Drosophila melanogaster populations from a chromosome-wide scan of DNA variation. Mol Biol Evol 22:2119–30.

    Orengo DJ, Aguadé M. 2004. Detecting the footprint of positive selection in a European population of Drosophila melanogaster: multilocus pattern of variation and distance to coding regions. Genetics 167:1759–66.

    Parsch J, Meiklejohn CD, Hartl DL. 2001. Patterns of DNA sequence variation suggest the recent action of positive selection in the janus-ocnus region of Drosophila simulans. Genetics 159:647–57.

    Pool JE, Bauer DuMont V, Mueller JL, Aquadro CF. 2006. A scan of molecular variation leads to the narrow localization of a selective sweep affecting both afrotropical and cosmopolitan populations of Drosophila melanogaster. Genetics 172:1093–105.

    Powell JR. 1997. Progress and prospects in evolutionary biology: the Drosophila model. New York: Oxford University Press.

    Przeworski M. 2002. The signature of positive selection at randomly chosen loci. Genetics 160:1179–89.

    Quesada H, Ramirez UEM, Rozas J, Aguadé M. 2003. Large-scale adaptive hitchhiking upon high recombination in Drosophila simulans. Genetics 165:895–900.

    Rozas J, Gullaud M, Blandin G, Aguadé M. 2001. DNA variation at the rp49 gene region of Drosophila simulans: evolutionary inferences from an unusual haplotype structure. Genetics 158:1147–55.

    Schmid KJ, Ramos-Onsins S, Ringys-Beckstein H, Weisshaar B, Mitchell-Olds T. 2005. A multilocus sequence survey in Arabidopsis thaliana reveals a genome-wide departure from a neutral model of DNA sequence polymorphism. Genetics 169:1601–15.

    Thornton K. 2005. Recombination and the properties of Tajima's D in the context of approximate likelihood calculation. Genetics 171:2143–8.

    Wright SI, Gaut BS. 2005. Molecular population genetics and the search for adaptive evolution in plants. Mol Biol Evol 22:506–19.

Accepted for publication June 1, 2006.




This article has been cited by other articles:


A. Ramirez-Soriano, S. E. Ramos-Onsins, J. Rozas, F. Calafell, and A. Navarro
Statistical Power Analysis of Neutrality Tests Under Demographic Expansions, Contractions and Bottlenecks With Recombination
Genetics, May 1, 2008; 179(1): 555 - 567.


T. Pyhajarvi, M. R. Garcia-Gil, T. Knurr, M. Mikkonen, W. Wachowiak, and O. Savolainen
Demographic History Has Influenced Nucleotide Diversity in European Pinus sylvestris Populations
Genetics, November 1, 2007; 177(3): 1713 - 1724.

