MBE Advance Access originally published online on July 25, 2006
Molecular Biology and Evolution 2006 23(10):1869-1878; doi:10.1093/molbev/msl069
© The Author 2006. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org

Research Article

Evidence of Gene Conversion Associated with a Selective Sweep in Drosophila melanogaster

Sascha Glinka, David De Lorenzo and Wolfgang Stephan

Section of Evolutionary Biology, Department of Biology II, Ludwig-Maximilians University, Planegg-Martinsried, Germany

E-mail: glinka@zi.biologie.uni-muenchen.de.


    Abstract
Since Drosophila melanogaster colonized Europe from tropical Africa 10 to 15 thousand years ago, it is expected that adaptation has played a major role in this species in recent times. A previously conducted multilocus scan of noncoding DNA sequences on the X chromosome in an ancestral and a derived population of D. melanogaster revealed that some loci have been affected by directional selection in the European population. We investigated if the pattern of DNA sequence polymorphism in a region surrounding one of these loci can be explained by a hitchhiking event. We found strong evidence that the studied region around the gene unc-119 was shaped by a recent selective sweep, including a valley of reduced heterozygosity of 83.4 kb, a skew in the frequency spectrum, and significant linkage disequilibrium on one side of the valley. This region, however, was interrupted by gene conversion events leading to a strong haplotype structure in the center of the valley of reduced variation.

Key Words: Drosophila melanogaster • nucleotide diversity • gene conversion • selective sweep

Distinguishing between demography (e.g., bottlenecks) and selection has received much recent attention in population genetics (e.g., Glinka et al. 2003; Orengo and Aguadé 2004; Storz et al. 2004; Ometto et al. 2005; Stajich and Hahn 2005) because both forces can lead to a reduction in diversity (Galtier et al. 2000). Demographic events will affect the whole genome, whereas selective events (e.g., directional selection) will affect only specific loci (Andolfatto 2001).

Genetic hitchhiking of neutral loci linked to rapidly fixed beneficial mutations (Maynard Smith and Haigh 1974) is expected to reduce heterozygosity locally, and the size of the affected region depends on the selection coefficient and the recombination rate (Kaplan et al. 1989; Stephan et al. 1992). The reduction is greatest at the site of the beneficial mutation but decreases with increasing distance from the selected site due to recombination. This results in a valley of reduced nucleotide diversity (Kim and Stephan 2002). In the absence of recombination, variation at linked neutral sites is completely removed but recovers slowly due to newly arising mutations. This leads to an excess of low-frequency variants and a star-shaped genealogy (Braverman et al. 1995). In the presence of recombination, hitchhiking may be incomplete such that the frequencies of neutral loci depend on whether they belong to the same lineage as the beneficial mutation or not. As a result, neutral variation may form a bipartite frequency spectrum. With the knowledge of the ancestral and derived states (using an outgroup), one can distinguish between low- and high-frequency variants (Fay and Wu 2000; Przeworski 2002). The resulting genealogy of surrounding neutral loci is also star-shaped but with long branches between the recombined and the swept lineages (Fay and Wu 2000; Przeworski 2002; Meiklejohn et al. 2004). This topology creates a strong association among alleles due to the long branches in the genealogy. Therefore, the resulting haplotype structure leads to linkage disequilibrium (LD) between polymorphisms at neutral loci (on the side of the selected site), which decreases with increasing distance from the target of selection (Przeworski 2002; Kim and Nielsen 2004; Stephan et al. 2006). These features are unique to genetic hitchhiking (Kim and Stephan 2002) and can therefore be used to distinguish it from background selection, the selection against recurrent deleterious mutations (Charlesworth et al. 1993).

A combination of these features has recently been observed in various studies of Drosophila. Evidence for directional selection has been reported for Drosophila simulans (Parsch et al. 2001; Quesada et al. 2003; Schlenke and Begun 2004) and Drosophila melanogaster (Depaulis et al. 1999; Nurminsky et al. 2001; Mousset et al. 2003; Bauer DuMont and Aquadro 2005; Beisswanger et al. 2006). Both species have extended their range from tropical Africa (south of the Sahara) to the Eurasian continent after the last glaciation 10 to 15 thousand years ago (kya) (David and Capy 1988). Due to these colonization events, the genetic composition of these species is likely to be affected by both demographic and selective processes.

A recent multilocus scan of noncoding DNA sequences on the X chromosome of a putatively ancestral population from Africa (Lake Kariba, Zimbabwe) and a derived population from Europe (Leiden, The Netherlands) of D. melanogaster revealed a large number of loci with no variation in the derived population (Glinka et al. 2003; Ometto et al. 2005). Although demography may explain most of the chromosome-wide lack of variation, there is evidence that the reduced polymorphism of some loci cannot be explained by bottlenecks alone (Ometto et al. 2005). One locus with zero polymorphism, a fragment within an intron of gene unc-119 (fragment 125; Glinka et al. 2003), is located in a region of intermediate recombination rate, with an estimated 1.926 × 10⁻⁸ recombination events per basepair per generation (rec/bp/gen; Comeron et al. 1999). This locus is about 7 Mb away from the telomere on the X chromosome (see also fig. 1; Glinka et al. 2003). Because a local reduction of variation on a recombining chromosome may be observed by chance (Kim and Stephan 2002), we further investigated if the region surrounding fragment 125 shows a similar pattern, which would support the idea of directional selection. Thus, we screened 17 loci around fragment 125, delimiting the region of reduced variation in the European population of D. melanogaster. In addition, we analyzed the same fragments in a putatively ancestral population of D. melanogaster (see above) because evidence is mounting that selective sweeps may have originated in the ancestral range of D. melanogaster (e.g., Beisswanger et al. 2006).


FIG. 1.— Map of the studied region, 7A2–7A5, around fragment 125 on the X chromosome, oriented from the telomere, T, to the centromere, C. The arrow indicates the direction of transcription of each gene.

 

    Materials and Methods
Population Samples, Polymerase Chain Reaction Amplification, and DNA Sequencing
For the following analyses, we used 12 highly inbred lines of both a European (Leiden, The Netherlands; kindly provided by A. J. Davis) and an African (Lake Kariba, Zimbabwe; Begun and Aquadro 1993; kindly provided by C. F. Aquadro) D. melanogaster population and a single strain of D. simulans (Davis, CA; kindly provided by H. A. Orr), as described in Glinka et al. (2003). Following their procedure, we amplified by polymerase chain reaction and sequenced (on both strands) 16 more noncoding loci proximal and distal to fragment 125 (European Molecular Biology Laboratory [EMBL] database, http://www.ebi.ac.uk, accession numbers AJ571381–405; Glinka et al. 2003) on the X chromosome (see fig. 1). This was done based on the available DNA sequence of the D. melanogaster genome (Flybase 2004, Release 3.2.0, http://www.flybase.org). In addition, we sequenced the coding regions of 3 genes (CG1677, CG2059, and unc-119) and their 5' flanking regions (fig. 1). The 5' region of unc-119 begins 5.7 kb away from the start codon and contains a binding site for the transcription factor Dorsal (Markstein et al. 2002). We aligned only high-quality sequences with the application Seqman of the DNAstar (Madison, WI) package, as described in Glinka et al. (2003). All sequences were deposited in the EMBL database with the accession numbers AM284420–AM284965. The alignments used for the following analyses are available at http://www.zi.biologie.uni-muenchen.de/evol/Downloads.

Sequence Analyses
Standard population genetic analyses were performed using a program kindly provided by H. Li. To investigate the extent of gene conversion in our data set, we used the gene-conversion presence (GCP) test (Song et al. 2006), which is implemented in the program SHRUB-GC and was kindly provided by Y. Song (http://www.csif.cs.ucdavis.edu/~gusfield/). The program of H. Li was also used to conduct coalescent simulations for determining the probabilities of the statistical significance of Tajima's D (Tajima 1989), Fay and Wu's H (Fay and Wu 2000), and Fu and Li's D (Fu and Li 1993), making the assumption of no recombination. The homologous sequences of D. simulans were used to determine the derived state of a given site for Fay and Wu's H and Fu and Li's D and to perform the multilocus Hudson-Kreitman-Aguadé test (Hudson et al. 1987; Kliman et al. 2000). The latter approach is implemented in the program HKA, which was kindly provided by J. Hey (http://www.lifesci.rutgers.edu/~heylab). We assumed no intragenic but free recombination between fragments for the HKA test, and we have not corrected for multiple tests. For those sites where we could not identify a base in the D. simulans sequence, we used the corresponding position of the published Drosophila yakuba genome (http://flybase.net/blast/). We estimated interspecific divergence and the LD measure Zns (Kelly 1997) for each locus using the program VariScan (Vilella et al. 2005). The probability associated with LD measure Zns was calculated using DnaSP 3.99 (Rozas et al. 2003).
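
As a minimal illustration of the frequency-spectrum statistics named above, the following Python sketch computes the number of segregating sites, Watterson's θW for a fragment, and Tajima's D (Tajima 1989) from a small set of equal-length haplotype strings. It is not the program of H. Li used in this study; the function names and the toy inputs are assumptions made purely for illustration, and significance would still have to be assessed by coalescent simulation.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(haplotypes):
    """Count alignment columns that carry more than one state."""
    return sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)


def watterson_theta(num_segregating, sample_size):
    """Watterson's estimator for the whole fragment: S / a1."""
    a1 = sum(1.0 / i for i in range(1, sample_size))
    return num_segregating / a1


def mean_pairwise_differences(haplotypes):
    """Average number of differences over all pairs of sequences (pi)."""
    pairs = list(combinations(haplotypes, 2))
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / len(pairs)


def tajimas_d(haplotypes):
    """Tajima's D; returns None if the statistic is undefined.

    D is undefined for a monomorphic sample and for fewer than
    4 sequences (the variance term is zero for n = 2 and n = 3).
    """
    n = len(haplotypes)
    s = segregating_sites(haplotypes)
    if s == 0 or n < 4:
        return None
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    pi = mean_pairwise_differences(haplotypes)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy data (hypothetical): every polymorphism is a singleton,
# so D is negative, the direction expected after a sweep.
singletons = ["1000", "0100", "0010", "0001"]
print(tajimas_d(singletons))
```

A sample in which two haplotypes occur at equal frequency (e.g., "00", "00", "11", "11") gives a positive D under the same function, which is the opposite skew.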

Estimation of the Selective Sweep Parameters
To examine the significance of the observed local reduction of genetic variation, we applied a composite likelihood ratio (CLR) test (Kim and Stephan 2002). This test requires independent estimates of the mutational parameter θ and the scaled recombination rate Rn. Because it is difficult to estimate θ by 3Neµ, where Ne is the effective population size and µ the mutation rate, we used the mean (standard error [SE]) of the Watterson estimator (Watterson 1975), θW, of 0.0044 and 0.0127 estimated from 105 loci of the European and the African population, respectively (Glinka et al. 2003). Because the value of the European population size is about one-third of that of the African one (which we assumed as 10⁶; Przeworski et al. 2001), Ne = 300,000 is used for the European population. Due to the absence of recombination in male D. melanogaster, Rn was estimated by 2Ner (Przeworski et al. 2001), where Ne is 300,000 or 10⁶ (see above) and r is the per-site recombination rate of 1.926 × 10⁻⁸ rec/bp/gen (Comeron et al. 1999). The probability of the initiation per nucleotide, Gn, of a gene conversion event is estimated by 2Rn (Andolfatto and Wall 2003). For this test, we used a mean tract length of 352 bp (see Hilliker et al. 1994). The input files used for the CLR test are available at http://www.zi.biologie.uni-muenchen.de/evol/Downloads.
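
The scaling relations stated in this paragraph can be written out as a short Python sketch. The function and variable names are hypothetical; only the formulas (Rn = 2Ner, Gn = 2Rn) and the input values are taken from the text above.

```python
def scaled_recombination_rate(effective_size, rec_per_bp):
    """Rn = 2 * Ne * r, the scaling stated in the text."""
    return 2.0 * effective_size * rec_per_bp


def gene_conversion_initiation_rate(scaled_rec_rate):
    """Gn = 2 * Rn, as stated in the text."""
    return 2.0 * scaled_rec_rate


R_PER_BP = 1.926e-8  # rec/bp/gen (Comeron et al. 1999)
EFFECTIVE_SIZES = {"Europe": 300_000, "Africa": 1_000_000}

for population, ne in EFFECTIVE_SIZES.items():
    rn = scaled_recombination_rate(ne, R_PER_BP)
    gn = gene_conversion_initiation_rate(rn)
    print(f"{population}: Rn = {rn:.6f}, Gn = {gn:.6f} per bp")
```

With these inputs, Rn is about 0.0116 per bp for the European and about 0.0385 per bp for the African population, and Gn is twice each value.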

The original approach by Kim and Stephan (2002) incorporates only the spatial distribution of polymorphic sites and the frequency spectrum. We therefore applied the extended version of the maximum likelihood method that uses information of LD as well (Kim and Nielsen 2004). Both methods allow us to evaluate the maximum likelihood estimates for the position of the selected site X and the population selection parameter α. We used 1-kb intervals between initial steps for X over the entire range of the studied region and calculated the selection coefficient s by α/1.5Ne (e.g., Kaplan et al. 1989; Braverman et al. 1995). In the case where the neutral model is rejected in favor of the hitchhiking model by the CLR test (see above), we evaluated the significance of the selective hypothesis by a goodness-of-fit (GOF) test (Jensen et al. 2005). For this test, we compared the GOF values of our data with a distribution generated from 1,000 simulated data sets under a selective scenario. These simulated genealogies were also used to estimate the confidence intervals (CIs) of X and s (J Jensen, personal communication).
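
The conversion between the population selection parameter α and the selection coefficient s can be sketched as follows. The function names are hypothetical, and the α value in the usage line is not an estimate from this study; it is chosen only to show the arithmetic for Ne = 300,000.

```python
def selection_coefficient(alpha, effective_size):
    """s = alpha / (1.5 * Ne), the conversion stated in the text."""
    return alpha / (1.5 * effective_size)


def population_selection_parameter(s, effective_size):
    """Inverse conversion: alpha = 1.5 * Ne * s."""
    return 1.5 * effective_size * s


# Hypothetical alpha, used only to illustrate the scaling.
print(selection_coefficient(1710, 300_000))
```

Because s scales inversely with Ne, the same α corresponds to a smaller s in the larger African population than in the European one.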

Demographic Modeling of the European Population
To examine if the observed pattern of nucleotide diversity in the European population could also be explained by a bottleneck, we used an extended version (Beisswanger et al. 2006) of a maximum likelihood approach (Ometto et al. 2005) implemented in a coalescent-based program (Ramos-Onsins et al. 2004). Following the model proposed by Galtier et al. (2000), a bottleneck is characterized by its time Tb and strength Sb and the population mutation rate θ. As input parameters, we used different combinations of Tb (i.e., 0.0100–0.0500) and Sb (i.e., 0.340 and 0.400) and the average African θW observed in the region of reduced heterozygosity (see also Beisswanger et al. 2006). Then, the probability of our data (i.e., the valley of reduced variation) under the bottleneck scenario was calculated as the fraction of those simulated genealogies (i.e., 100,000) with at most the observed segregating sites in the entire region (provided that fragment 125 was monomorphic). These simulations were conducted with (r = 1.926 × 10⁻⁸ rec/bp/gen; see above) and without recombination between fragments.
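
The final step of this procedure, turning simulated data sets into a probability, amounts to an empirical tail fraction. The sketch below shows only that step; it does not reproduce the coalescent program used here, it omits the conditioning on fragment 125 being monomorphic, and the function name and the toy counts are hypothetical.

```python
def fraction_at_most(simulated_counts, observed_count):
    """Fraction of simulated data sets with at most the observed
    number of segregating sites (an empirical one-sided probability)."""
    if not simulated_counts:
        raise ValueError("need at least one simulated data set")
    hits = sum(1 for count in simulated_counts if count <= observed_count)
    return hits / len(simulated_counts)


# Toy input (hypothetical): segregating-site counts from 8 simulations.
toy_counts = [12, 9, 7, 15, 6, 11, 8, 10]
print(fraction_at_most(toy_counts, 7))
```

In practice the list would hold one count per simulated genealogy (100,000 in the analysis described above), restricted to those genealogies that satisfy the conditioning.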


    Results
Region of Reduced Level of Nucleotide Diversity
To investigate levels of nucleotide diversity surrounding fragment 125, we surveyed a total of 17 loci with an average distance between loci of 4.5 kb in both the European and the African D. melanogaster samples (fig. 1, tables 1 and 2). The mean size (SE) of the DNA fragments analyzed (excluding insertions and deletions) varied slightly between the 2 population samples (368 [19] bp and 363 [17] bp for the European and African sample, respectively; see tables 1 and 2), and the entire region in which these 17 fragments are located spans 83.4 kb (tables 1 and 2).


Table 1 Summary of Sequence Data of Each Locus in the Studied Region of the European Sample

 

Table 2 Summary of Sequence Data of Each Locus in the Studied Region of the African Sample

 
The observed level of nucleotide diversity varies along the studied region within the European and the African sample (tables 1 and 2) and between both population samples (fig. 2). Whereas the nucleotide diversity levels of the flanking loci (fragments 553, 596, 861, and 862) and a central locus (fragment 593) of the European sample (see table 1) are similar to the reported mean (SE) values of 0.0046 (0.0005) and 0.0044 (0.0004) for π and θW, respectively (see Glinka et al. 2003), the remaining 12 loci show either very low or zero polymorphism (table 1). In contrast, levels of nucleotide diversity are much higher in the African sample (see table 2) and similar to those reported by Glinka et al. (2003) (0.0112 [0.0007] and 0.0127 [0.0007] for π and θW, respectively), with the exception of 4 low-variation loci (fragments 605, 570, 609, and 596; fig. 2). As a result, this leaves a small valley of low variation at the centromere-proximal end of the studied region in the African sample, which differs from that of the European sample (fig. 2).


FIG. 2.— Nucleotide diversity (π; Tajima 1983) of the European and African sample and divergence (K) against the relative position of each fragment (in bp; see tables 1 and 2). Solid black lines and diamonds correspond to the European, solid black lines and open triangles to the African sample, and dashed black lines and squares to divergence.

 
Taking the interspecific divergence between D. melanogaster and D. simulans into consideration, the observed low variation for fragments 605 and 570 in both population samples could be explained by a low mutation rate or selective constraints (fig. 2). The opposite might be true for fragment 593 (fig. 2). A higher mutation rate could have led to the observed high peak in nucleotide diversity in the European sample in this fragment. However, the observed level of polymorphism in the European sample results from a distinct haplotype structure, that is, 6 sites are segregating in 3 haplotypes present at different frequencies (fig. 3A). In addition, all sites segregating in the European sample are also segregating in the African sample (fig. 3A and B). Because the genealogy of fragment 593 (i.e., star-shaped, but with long inner branches) differs from those observed in the surrounding fragments (fragments 592 and 594), we investigated if the observed haplotype pattern in the European sample is caused by a gene conversion event. To be able to partition sequences into derived and ancestral, we used, in addition to fragment 593, its adjacent fragments 592 and 594 of both population samples. The GCP test (Song et al. 2006) on this data set revealed 2 conversion events in this region of tract length less than 100 bp. The haplotype observed in the European D. melanogaster line 12 can be explained by a gene conversion event between sites 33,990 and 34,069, where its tract was donated by the African D. melanogaster line 377 and the flanking part by an ancestor of all other European lines (see fig. 3A and B). A second gene conversion event between the European D. melanogaster lines 12, 16, and 18 (fig. 3A) and the African D. melanogaster line 84 (fig. 3B) was detected by the GCP test between sites 34,029 and 34,036. The flanking part of the observed haplotype in lines 16 and 18 came from line 12, whereas line 84 donated the tract sequences (see Discussion).


FIG. 3.— Alignment of polymorphic sites observed in 12 lines of the European (A) and the African (B) Drosophila melanogaster sample for each fragment. The relative position (see table 1) and the derived state inferred from D. melanogaster/Drosophila simulans or Drosophila yakuba (see Materials and Methods) comparisons are given for each polymorphic site. Dashes indicate 1-bp deletion.

 
Departure from Standard Neutral Model
To evaluate the significance of the observed reduction in nucleotide diversity in both populations, we used the multilocus HKA test (Kliman et al. 2000). This test takes the observed differences in mutation rates in our data sets into account by comparing intraspecific diversity and interspecific divergence (Hudson et al. 1987). Although we detected a significant deviation from the neutral expectation for the studied region in the European sample (χ² = 62.015, P = 0.0001), no departure from the neutral equilibrium model was found for the African sample (χ² = 10.679, P = 0.8289). Further evidence for a departure from neutrality can be obtained by examining the frequency spectrum. A skew toward low-frequency variation can be measured by Tajima's D statistic (Tajima 1989). Under the standard neutral model, Tajima's D is expected to be zero. Out of 10 fragments with some variation in the European sample, we observed 4 fragments (i.e., 555, 590, 594, and 607; see table 1) with D values significantly less than zero (P < 0.05), indicating an excess of singletons (table 1). In comparison, we found only 1 out of 17 loci (i.e., fragment 609; see table 2) with a significantly negative D value in the African sample. If this skew in the frequency spectrum is due to new mutations, then these singletons should represent derived variants. This can be examined by Fu and Li's D statistic (Fu and Li 1993), which uses an outgroup to identify the state of a mutation. In this statistic, the number of mutations observed in internal and external branches is compared with the expectations under neutrality (Fu and Li 1993). The same fragments that showed a departure from neutrality by Tajima's D statistic also deviated from neutrality for Fu and Li's D statistic in both population samples (see tables 1 and 2). Support for a hitchhiking event can also be obtained from Fay and Wu's H statistic (Fay and Wu 2000). This statistic measures the skew toward high-frequency derived variants. However, we observed a deviation from neutrality in the H statistic neither in the European nor in the African sample (tables 1 and 2). Given the strong haplotype structure in fragment 593 in the European sample, we would expect to find LD among the alleles as well. Using the assumption of no recombination, the Zns value of 0.76 is significantly higher than expected under neutrality (P = 0.049; table 1). In addition, whereas the Zns values of 3 of the 4 loci flanking the region of reduced variation (see above; table 1) are comparable with the chromosome-wide average (Ometto et al. 2005), one is significantly higher (P < 0.05). In contrast, we observed no LD in any fragment of the African sample (table 2). The latter result agrees with those observed for the entire X chromosome (see Ometto et al. 2005).
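
For readers unfamiliar with Zns (Kelly 1997), the statistic is the average squared allelic correlation r² over all pairs of segregating sites in a fragment. The following Python sketch illustrates that definition for biallelic sites coded 0/1; it is not VariScan or DnaSP, and the function names and toy data are assumptions made for illustration.

```python
from itertools import combinations


def r_squared(site_a, site_b):
    """Squared allelic correlation between two biallelic 0/1 sites."""
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    p_ab = sum(a * b for a, b in zip(site_a, site_b)) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def zns(haplotypes):
    """Mean r^2 over all pairs of segregating sites.

    haplotypes: list of equal-length sequences of 0/1 values.
    Returns None when fewer than 2 sites are segregating.
    """
    columns = [col for col in zip(*haplotypes) if 0 < sum(col) < len(col)]
    if len(columns) < 2:
        return None
    pairs = list(combinations(columns, 2))
    return sum(r_squared(a, b) for a, b in pairs) / len(pairs)


# Toy data (hypothetical): two sites in complete association.
print(zns([[0, 0], [0, 0], [1, 1], [1, 1]]))
```

Values near 1 indicate that most segregating sites fall into a few strongly associated haplotypes, as reported here for fragment 593; values near 0 indicate little association.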

Estimation of Selective Sweep Parameters
The observed valley of variation and the skew in the frequency spectrum provide strong evidence for the recent occurrence of a selective sweep in the European population (see table 1 and fig. 2). In addition, we observed similar signs of selection in the African population, however, to a different degree (see table 2 and fig. 2). Because we have independent estimates of the effective population size, the mutational parameter θ, and the recombination rate for both populations (see Materials and Methods), we can apply a composite maximum likelihood approach (Kim and Stephan 2002; Kim and Nielsen 2004) to simultaneously test for a hitchhiking event and to estimate the location of the beneficial mutation and the strength of selection using all loci together. Under the assumption of a randomly mating population of constant size and given the estimates of parameters used for the simulations, our data of the European population fit a hitchhiking model significantly better than a neutral model using the CLR test proposed by Kim and Stephan (2002; P < 0.0001). Furthermore, the estimated strength of selection, s, is 0.0038, and the estimated position of the selected site, X, is 22,147 (i.e., near fragment 590; fig. 3A). The test proposed by Kim and Nielsen (2004), which includes information about LD, however, did not reject neutrality in favor of a hitchhiking model for the European population (P = 0.1250). This can be explained by the one-sided LD structure in our region, which is different from the one outlined in Kim and Nielsen (2004; see Discussion). When we applied the CLR test (Kim and Stephan 2002) to the data of the African population, we also observed a significantly better fit to a hitchhiking than to a neutral model (P = 0.0300). The strength of selection, s, is 0.0016, and the position of the selected site, X, is 56,808 (i.e., corresponding to fragment 609). We did not apply the Kim and Nielsen test (2004) to this data set because we did not observe LD in the African sample (see table 2). To investigate if the observed polymorphism pattern of both populations is caused by a selective sweep alone or by demographic events (i.e., population structure or a recent bottleneck), we applied the GOF test (Jensen et al. 2005) to both data sets.

Neither the polymorphism pattern of the European nor that of the African population can solely be explained by demographic events (P = 0.171 and P = 0.326 for the European and the African population sample, respectively). The 95% CIs for X of the European and African population are (2,650, 73,151) and (17,901, 76,183), and for s (0.0004, 0.0110) and (0.0001, 0.0028), respectively (see Materials and Methods). The large CIs for X and s may be due to partial sequencing of the region around unc-119 used in this part of the analysis (J Jensen, personal communication). Finally, we note that the observed P value of the GOF test for the European population is rather low, which is consistent with either a selective sweep or a recent, severe bottleneck (see Jensen et al. 2005).

Demographic Modeling of the European Population
Because the observed P value of the GOF test for the European population is rather low, we investigated if the reduced variation in the studied region could be the result of a population bottleneck. To do this, we applied an extended version of a maximum likelihood approach (Ometto et al. 2005; Beisswanger et al. 2006) implemented in a coalescent-based program (Ramos-Onsins et al. 2004). We simulated various bottleneck scenarios by changing the time (i.e., Tb = 0.0100–0.0500) and strength (i.e., Sb = 0.340 and 0.400 for each Tb value) (see table 3). These values were chosen from the parameter range used in Ometto et al. (2005). They reflect bottlenecks starting between 3,000 and 15,000 years ago (assuming 10 generations per year and Tb measured in 3Ne generations). In addition, we investigated the effect of recombination between loci (i.e., Method I vs. Method II; see below). Because this method considers only crossing-over but not gene conversion, we excluded fragment 593 from the analysis. In the case of no recombination between loci (i.e., Method I), the probability of observing at most 7 segregating sites between fragments 603–609 (located in the valley of reduced variation) in the European population is low for all examined bottleneck scenarios (i.e., conditioned on the observation of at most 37 segregating sites in the entire region where fragment 125 was monomorphic; table 3). When we assumed some recombination (i.e., 1.926 × 10⁻⁸ rec/bp/gen) between loci (i.e., Method II) and conditioned on the same assumptions as in Method I, the polymorphism data can be explained by young bottlenecks (Tb < 0.02; see table 3). However, such young bottlenecks seem to be unrealistic (Ometto et al. 2005).
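
The conversion of Tb into calendar years can be checked with a short Python sketch. The scaling (Tb in units of 3Ne generations) and the 10 generations per year are taken from the text above; the value Ne = 10⁶ is an assumption on our part, chosen because it reproduces the quoted range of 3,000 to 15,000 years, and the function name is hypothetical.

```python
def bottleneck_age_years(tb, effective_size, generations_per_year=10.0,
                         time_unit_factor=3.0):
    """Convert Tb (in units of 3 * Ne generations) into years."""
    generations = tb * time_unit_factor * effective_size
    return generations / generations_per_year


ASSUMED_NE = 1_000_000  # assumption: the African value used in Methods

for tb in (0.01, 0.02, 0.05):
    print(tb, round(bottleneck_age_years(tb, ASSUMED_NE)))
```

Under the same assumptions, the threshold Tb < 0.02 mentioned above corresponds to bottlenecks younger than about 6,000 years.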


Table 3 Summary of Bottleneck Simulations

 
Localization of Potential Beneficial Mutation
The predicted site of the beneficial mutation of the European population is located between gene CG1958 and a cluster of 3 genes, CG1677, CG2059, and unc-119, which are located –7.5 kb and 6.7, 12.3, and 14.6 kb away from the predicted site (fig. 1). For the African population, however, the predicted site is located –13.4 kb from the 5'-region of gene unc-119 (see Materials and Methods) and within gene brk (fig. 1). Because the potential target site of selection is likely to be found in a regulatory or coding region and because the mutation (divergence) rate in the fragments near the 2 genes CG1958 and brk (i.e., 605 and 609 respectively; fig. 2) is low, we focused our investigation on the gene cluster and its 5'-regions. This may partly alleviate the problems arising from the large confidence interval of X reported above. We sequenced the 5' flanking and the coding regions of all 3 genes in the European and African D. melanogaster samples and in the D. simulans strain. In the 5' region of the genes CG1677 and unc-119 (which are 514- and 401-bp long, respectively), we found neither length differences nor substantial sequence divergence between the European and the African samples (supplemental figs. 1 and 3, Supplementary Material online). However, in the 5' region of gene CG2059 (with a length of 504 bp), we observed substantial divergence at 3 sites and a similar haplotype structure as in fragment 593 in the European sample, which extends until (relative) position 34,237 (indicating the centromere-proximal end of the first gene conversion event, where the African D. melanogaster line 377 donated its tract to European line 12; see supplemental fig. 2, Supplementary Material online). Further visual inspection of the sequences revealed 1 fixed replacement site in a derived state in CG1677 and 2 in CG2059 and 1 fixed replacement site in the ancestral state in CG1677 and unc-119 in the European sample (see supplemental figs. 1, 2, and 3, Supplementary Material online).


    Discussion
Our study provides evidence that beneficial mutations were recently fixed in the unc-119 region of a European and an African D. melanogaster population, causing the observed valley of reduced variation. In addition, we detected gene conversion events leading to an unusual haplotype pattern in the center of this valley in the European population.

Evidence for a Selective Sweep
Our results suggest that the observed reduction in nucleotide diversity was caused by recent selective sweeps in the European and African D. melanogaster populations (similar to the observations of Beisswanger et al. 2006). The origin of the sweep in the European population, however, remains unknown and may not be related to the sweep observed in the African population. The CIs of the predicted sites of selection of both populations are overlapping, which may suggest an African origin of the European sweep. However, they are too large to delimit the genomic positions of the beneficial mutation(s) to sufficiently small regions (see Results).

A candidate for the selected mutation may be found in the 5' region of gene CG2059 and within the CG1677 and the CG2059 genes. Although we observed 3 fixed substitutions in the 5' region of gene CG2059, it is unlikely that these sites altered the cis-regulation of this gene because their distance to each other (i.e., 31 and 38 bp; see supplemental fig. 2, Supplementary Material online) exceeds the value of 14 bases estimated for the mean conservation length of such elements (Richards et al. 2005). Therefore, if the European sweep has not originated in Africa, we propose that the candidates for the selective target are the replacement substitutions occurring in the CG1677 and the CG2059 genes about 6.8 kb and 12.3 kb away from the predicted sweep center. Because these mutations are also present in the African D. melanogaster sample (see supplemental figs. 1, 2, and 3, Supplementary Material online), it is most likely that the sweep occurred from the standing variation in the ancestral population when D. melanogaster colonized Europe 10 to 15 kya (i.e., soft sweep).

Our analyses are based on a model of a sweep associated with a single new mutant. However, this does not affect the analyses because both modes of selection cause the same evolutionary trajectory as long as the frequency of the favored allele is very low at the beginning of the sweep (Innan and Kim 2004; Hermisson and Pennings 2005; Przeworski et al. 2005).

Gene Conversion Associated with the Selective Sweep
Our analysis suggests that 2 gene conversion events associated with the selective sweep are responsible for the strong haplotype structure observed in fragment 593. Given the observed valley of nucleotide diversity, the following hypothetical scenario can explain the haplotype structure observed in fragment 593 and in the 5' region of gene CG2059. Consider neutral variants linked to a selected mutation going to fixation. In the later phase of the sweep, the first gene conversion event led to the haplotype structure observed in the European lineage 12. Given the frequency of the haplotype observed in the European lineages 16 and 18, the second gene conversion event must have happened before the favored allele was fixed in the European population. Similarly, Meiklejohn et al. (2004) observed a potential gene conversion tract in which a stretch of ancestral variants was present in an otherwise derived haplotype associated with a selective sweep in the janus region of D. simulans. However, in this case only a single chromosome showed evidence for gene conversion, suggesting that the conversion event occurred relatively late in the sweep.

A similar pattern of nucleotide diversity has been reported from a natural population of D. melanogaster due to a break point of the common cosmopolitan inversion In(2L)t (Andolfatto et al. 1999). Although this inversion is probably recent (Andolfatto et al. 1999) and has reached high frequency in a population from the Ivory Coast (Bénassi et al. 1993), a sweep of the Suppressor of Hairless gene, Su(H), occurred independently of the inversion in that population (Depaulis et al. 1999; Mousset et al. 2003). However, no chromosomal rearrangement on the X chromosome has been observed in any of the European lines used in this study (Ometto et al. 2005). This reflects the rarity of inversions on the X chromosome in D. melanogaster, possibly due to their potential deleterious effect in hemizygous males (Coyne et al. 1991). Only 2 studies reported inversion polymorphism on the X chromosome in natural populations of D. melanogaster (Das and Singh 1991; Aulard et al. 2002).

If a crossing-over event had caused the strong haplotype structure observed in fragment 593 and the fixation of the beneficial mutation had occurred very quickly, one would expect to find high LD on both sides of the beneficial mutation due to the mutations on the long inner branches (see fig. 7 in Kim and Nielsen 2004). However, LD is expected to decrease quickly due to the increase of recombination break points on both sides of the beneficial mutation, leading eventually to genealogies as expected under neutrality (Kim and Nielsen 2004). When we consider only gene conversion, however, the expected LD pattern is different. Assuming that the gene conversion event happened only on one side of the beneficial mutation A (fig. 4), a genealogy with long inner branches responsible for the high observed LD (segment 2) is surrounded by star-like genealogies (segments 1 and 3). This is due to the relatively short tract length of a gene conversion (with a mean of 352 bp; Hilliker et al. 1994). However, if, in addition, a crossing-over event happened at some distance to either side of the beneficial mutation during the selective sweep, genealogies will be found as described by Kim and Nielsen (2004; see above). The predicted spatial pattern of LD, which was not present in our study, was detected by Kim and Nielsen (2004) in the sequencing data of a Californian D. simulans population (Schlenke and Begun 2004).
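To illustrate why a short conversion tract confines the high-LD segment to a narrow window, the following minimal sketch computes how quickly the probability of a long tract falls off. It assumes, purely for illustration, that tract lengths follow a geometric distribution with the mean of 352 bp cited above; the function name `prob_tract_at_least` and the example lengths are hypothetical and are not taken from the study.

```python
# Illustrative sketch only: tract lengths are ASSUMED to be geometric
# on {1, 2, ...} with the mean of 352 bp cited from Hilliker et al. (1994).

MEAN_TRACT_BP = 352  # mean conversion tract length in base pairs


def prob_tract_at_least(length_bp, mean_bp=MEAN_TRACT_BP):
    """Return P(tract length >= length_bp) under the assumed geometric model."""
    if length_bp < 1:
        return 1.0
    p_stop = 1.0 / mean_bp  # per-base probability that the tract ends
    return (1.0 - p_stop) ** (length_bp - 1)


if __name__ == "__main__":
    # Hypothetical lengths chosen to span short tracts up to several kb.
    for length in (100, 352, 1000, 5000):
        print(f"P(tract >= {length} bp) = {prob_tract_at_least(length):.3g}")
```

Under this assumption, tracts of several kilobases are very unlikely, which is consistent with a high-LD segment that is short relative to the valley of reduced variation.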


FIG. 4.— An example of DNA sequences (horizontal lines) and the genealogical structure resulting from a recent selective sweep with gene conversion (after fig. 7 in Kim and Nielsen 2004). Solid lines represent sequences originally linked to the beneficial mutation A. Dashed lines represent "recombinant" sequences originally linked to the unfavored allele a but recombined via gene conversion with A during the selective phase. Break points of gene conversion are labeled as a and b. Segments between break points are defined as segments 1, 2, and 3, and the coalescent tree is given below for each segment.

 
The results of our study indicate that the signature of a selective sweep may be obscured by gene conversion events occurring during the course of the sweep. Previous statistical methods that consider only LD caused by reciprocal recombination (Kim and Nielsen 2004) may thus overlook potential sweep regions. A more detailed analysis of the location and length of stretches of high LD may lead to better detection of sweep regions and more accurate mapping of beneficial nucleotide substitutions.
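As a rough illustration of what locating stretches of high LD could look like, the sketch below computes r² between adjacent biallelic sites and reports the start and end positions of runs in which every adjacent pair exceeds a threshold. This is not the method used in the study: the function names, the threshold of 0.8, and the toy haplotype data are all hypothetical choices made for the example.

```python
# Illustrative sketch only: the threshold and toy data are assumptions.


def r_squared(site_a, site_b):
    """r^2 between two biallelic sites coded 0/1 over the same haplotypes."""
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # monomorphic site: r^2 is undefined, treated as 0 here
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium D
    return d * d / denom


def high_ld_stretches(positions, sites, threshold=0.8):
    """Return (start_bp, end_bp) for runs of adjacent sites with r^2 >= threshold."""
    stretches = []
    run_start = None
    for i in range(len(sites) - 1):
        if r_squared(sites[i], sites[i + 1]) >= threshold:
            if run_start is None:
                run_start = i  # open a new run at the first site of the pair
        elif run_start is not None:
            stretches.append((positions[run_start], positions[i]))
            run_start = None
    if run_start is not None:
        stretches.append((positions[run_start], positions[-1]))
    return stretches


# Hypothetical data: 5 segregating sites (rows) scored in 6 haplotypes.
positions = [100, 250, 400, 900, 1500]
sites = [
    [0, 1, 0, 1, 0, 1],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0, 1],
]

if __name__ == "__main__":
    print(high_ld_stretches(positions, sites))
```

Restricting the scan to adjacent sites keeps the example short; a fuller analysis would presumably consider all pairwise comparisons and assess significance, as in the tests applied in the Results.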


    Supplementary Material
Supplementary figures 1, 2, and 3 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).


    Acknowledgements
We thank Y. Kim for advice on his test, L. Ometto for the use of his bottleneck program, Y. Song for comments and advice on his gene conversion test, J. Jensen for the interpretation of our results on CIs, A. Wilken for excellent technical assistance, and J. Parsch, P. S. Pennings, and a reviewer for comments on the manuscript. This work was funded by the Deutsche Forschungsgemeinschaft (STE 325/6) and the Volkswagenstiftung (I/78815).


    Footnotes
 
Jody Hey, Associate Editor


    References

    Andolfatto P. 2001. Adaptive hitchhiking effects on genome variability. Curr Opin Genet Dev 11:635–41.

    Andolfatto P, Wall JD. 2003. Linkage disequilibrium patterns across a recombination gradient in African Drosophila melanogaster. Genetics 165:1289–305.

    Andolfatto P, Wall JD, Kreitman M. 1999. Unusual haplotype structure at the proximal breakpoint of In(2L)t in a natural population of Drosophila melanogaster. Genetics 153:1297–311.

    Aulard S, David JR, Lemeunier F. 2002. Chromosomal inversion polymorphism in Afrotropical populations of Drosophila melanogaster. Genet Res 79:49–63.

    Bauer DuMont V, Aquadro CF. 2005. Multiple signatures of positive selection downstream of notch on the X chromosome in Drosophila melanogaster. Genetics 171:639–53.

    Begun DJ, Aquadro CF. 1993. African and North American populations of Drosophila melanogaster are very different at the DNA level. Nature 365:548–50.

    Beisswanger S, Stephan W, De Lorenzo D. 2006. Evidence for a selective sweep in the wapl region of Drosophila melanogaster. Genetics 172:265–74.

    Bénassi V, Aulard S, Mazeau S, Veuille M. 1993. Molecular variation of Adh and P6 genes in an African population of Drosophila melanogaster and its relation to chromosomal inversions. Genetics 134:789–99.

    Braverman JM, Hudson RR, Kaplan NL, Langley CH, Stephan W. 1995. The hitchhiking effect on the site frequency spectrum of DNA polymorphisms. Genetics 140:783–96.

    Charlesworth B, Morgan MT, Charlesworth D. 1993. The effect of deleterious mutations on neutral molecular variation. Genetics 134:1289–303.

    Comeron JM, Kreitman M, Aguadé M. 1999. Natural selection on synonymous sites is correlated with gene length and recombination in Drosophila. Genetics 151:239–49.

    Coyne JA, Aulard S, Berry A. 1991. Lack of underdominance in a naturally occurring pericentric inversion in Drosophila melanogaster and its implications for chromosome evolution. Genetics 129:791–802.

    Das A, Singh BN. 1991. Genetic differentiation and inversion clines in Indian natural populations of Drosophila melanogaster. Genome 34:618–25.

    David JR, Capy P. 1988. Genetic variation of Drosophila melanogaster natural populations. Trends Genet 4:106–11.

    Depaulis F, Brazier L, Veuille M. 1999. Selective sweep at the Drosophila melanogaster Suppressor of Hairless locus and its association with the In(2L)t inversion polymorphism. Genetics 152:1017–24.

    Fay JC, Wu C-I. 2000. Hitchhiking under positive Darwinian selection. Genetics 155:1405–13.

    Fu Y-X, Li W-H. 1993. Statistical tests of neutrality of mutations. Genetics 133:693–709.

    Galtier N, Depaulis F, Barton NH. 2000. Detecting bottlenecks and selective sweeps from DNA sequence polymorphism. Genetics 155:981–7.

    Glinka S, Ometto L, Mousset S, Stephan W, De Lorenzo D. 2003. Demography and natural selection have shaped genetic variation in Drosophila melanogaster: a multi-locus approach. Genetics 165:1269–78.

    Hermisson J, Pennings PS. 2005. Soft sweep: molecular population genetics of adaptation from the standing variation. Genetics 169:2335–52.

    Hilliker AJ, Harauz G, Reaume AG, Gray M, Clark SH, Chovnick A. 1994. Meiotic gene conversion tract length distribution within the rosy locus of Drosophila melanogaster. Genetics 137:1019–26.

    Hudson RR, Kreitman M, Aguadé M. 1987. A test of neutral molecular evolution based on nucleotide data. Genetics 116:153–9.

    Innan H, Kim Y. 2004. Pattern of polymorphism after strong artificial selection in a domestication event. Proc Natl Acad Sci USA 101:10667–72.

    Jensen JD, Kim Y, Bauer DuMont V, Aquadro CF, Bustamante CD. 2005. Distinguishing between selective sweeps and demography using DNA polymorphism data. Genetics 170:1401–10.

    Kaplan NL, Hudson RR, Langley CH. 1989. The "hitchhiking effect" revisited. Genetics 123:887–99.

    Kelly JK. 1997. A test of neutrality based on interlocus associations. Genetics 146:1197–206.

    Kim Y, Nielsen R. 2004. Linkage disequilibrium as a signature of selective sweeps. Genetics 167:1513–24.

    Kim Y, Stephan W. 2002. Detecting the local signature of genetic hitchhiking along a recombining chromosome. Genetics 160:765–77.

    Kliman RM, Andolfatto P, Coyne JA, Depaulis F, Kreitman M, Berry AJ, McCarter J, Wakeley J, Hey J. 2000. The population genetics of the origin and divergence of the Drosophila simulans complex species. Genetics 156:1913–31.

    Markstein M, Markstein P, Markstein V, Levine MS. 2002. Genome-wide analysis of clustered Dorsal binding sites identifies putative target genes in the Drosophila embryo. Proc Natl Acad Sci USA 99:763–8.

    Maynard Smith J, Haigh J. 1974. The hitch-hiking effect of a favourable gene. Genet Res 23:23–35.

    McDonald JH, Kreitman M. 1991. Adaptive protein evolution at the Adh locus in Drosophila. Nature 351:652–4.

    Meiklejohn CD, Kim Y, Hartl DL, Parsch J. 2004. Identification of a locus under complex positive selection in Drosophila simulans by haplotype mapping and composite-likelihood estimation. Genetics 168:265–79.

    Mousset S, Brazier L, Cariou M-L, Chartois F, Depaulis F, Veuille M. 2003. Evidence of a high rate of selective sweeps in African Drosophila melanogaster. Genetics 163:599–609.

    Nurminsky D, De Aguiar D, Bustamante D, Hartl DL. 2001. Chromosomal effects of rapid gene evolution in Drosophila melanogaster. Science 291:128–30.

    Ometto L, Glinka S, De Lorenzo D, Stephan W. 2005. Inferring the effects of demography and selection on Drosophila melanogaster from a chromosome-wide DNA polymorphism study. Mol Biol Evol 22:2119–30.

    Orengo DJ, Aguadé M. 2004. Detecting the footprint of positive selection in a European population of Drosophila melanogaster: multilocus pattern of variation and distance to coding regions. Genetics 167:1759–66.

    Parsch J, Meiklejohn CD, Hartl DL. 2001. Patterns of DNA sequence variation suggest the recent action of positive selection in the janus-ocnus region of Drosophila simulans. Genetics 159:647–57.

    Przeworski M. 2002. The signature of positive selection at randomly chosen loci. Genetics 160:1179–89.

    Przeworski M, Coop G, Wall JD. 2005. The signature of positive selection on standing genetic variation. Evolution 59:2311–23.

    Przeworski M, Wall JD, Andolfatto P. 2001. Recombination and the frequency spectrum in Drosophila melanogaster and Drosophila simulans. Mol Biol Evol 18:291–8.

    Quesada H, Ramirez UE, Rozas J, Aguadé M. 2003. Large-scale adaptive hitchhiking upon high recombination in Drosophila simulans. Genetics 165:895–900.

    Ramos-Onsins SE, Stranger BE, Mitchell-Olds T, Aguadé M. 2004. Multilocus analysis of variation and speciation in the closely related species Arabidopsis halleri and A. lyrata. Genetics 166:373–88.

    Richards S, Liu Y, Bettencourt BR, Hradecky P, Letovsky S, et al. (52 co-authors). 2005. Comparative genome sequencing of Drosophila pseudoobscura: chromosomal, gene, and cis-element evolution. Genome Res 15:1–18.

    Rozas J, Sánchez-Del Barrio JC, Messeguer X, Rozas R. 2003. DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19:2496–7.

    Schlenke TA, Begun DJ. 2004. Strong selective sweep associated with a transposon insertion in Drosophila simulans. Proc Natl Acad Sci USA 101:1626–31.

    Song YS, Ding Z, Gusfield D, Langley CH, Wu Y. 2006. Algorithms to distinguish the role of gene-conversion from single-crossover recombination in the derivation of SNP sequences in population. Proceedings of Recomb2006. Available from: http://recomb06.dei.unipd.it/. Accessed 2006 Apr 4.

    Stajich JE, Hahn MW. 2005. Disentangling the effects of demography and selection in human. Mol Biol Evol 22:63–73.

    Stephan W, Song YS, Langley CH. 2006. The hitchhiking effect on linkage disequilibrium between linked neutral loci. Genetics 172:2647–63.

    Stephan W, Wiehe T, Lenz MW. 1992. The effect of strongly selected substitutions on neutral polymorphism: analytical results based on diffusion theory. Theor Popul Biol 41:237–54.

    Storz JF, Payseur BA, Nachman MW. 2004. Genome scans of DNA variability in humans reveal evidence for selective sweeps outside of Africa. Mol Biol Evol 21:1800–11.

    Tajima F. 1983. Evolutionary relationship of DNA sequences in finite populations. Genetics 105:437–60.

    Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–95.

    Vilella AJ, Blanco-Garcia A, Hutter S, Rozas J. 2005. VariScan: analysis of evolutionary patterns from large-scale DNA sequence polymorphism data. Bioinformatics 21:2791–3.

    Watterson GA. 1975. On the number of segregating sites in genetical models without recombination. Theor Popul Biol 7:256–76.

Accepted for publication June 23, 2006.

