Skip Navigation


MBE Advance Access originally published online on January 16, 2008
Molecular Biology and Evolution 2008 25(6):1025-1042; doi:10.1093/molbev/msn007
This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (PDF) Freely available
Right arrow All Versions of this Article:
25/6/1025    most recent
msn007v1
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Macpherson, J. M.
Right arrow Articles by Petrov, D. A.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Macpherson, J. M.
Right arrow Articles by Petrov, D. A.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

© The Author 2008. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org

Research Articles

Nonadaptive Explanations for Signatures of Partial Selective Sweeps in Drosophila

J. Michael Macpherson*, Josefa González*, Daniela M. Witten{dagger}, Jerel C. Davis*, Noah A. Rosenberg{ddagger}, Aaron E. Hirsh§ and Dmitri A. Petrov*

* Department of Biological Sciences, Stanford University
{dagger} Department of Statistics, Stanford University
{ddagger} Department of Human Genetics, University of Michigan, Ann Arbor
§ Department of Ecology & Evolutionary Biology, University of Colorado at Boulder

E-mail: macpher{at}stanford.edu


    Abstract
 TOP
 Abstract
 Introduction
 Materials and Methods
 Results
 Discussion
 Appendix
 Acknowledgements
 References
 
A beneficial mutation that has nearly but not yet fixed in a population produces a characteristic haplotype configuration, called a partial selective sweep. Whether nonadaptive processes might generate similar haplotype configurations has not been extensively explored. Here, we consider 5 population genetic data sets taken from regions flanking high-frequency transposable elements in North American strains of Drosophila melanogaster, each of which appears to be consistent with the expectations of a partial selective sweep. We use coalescent simulations to explore whether incorporation of the species' demographic history, purifying selection against the element, or suppression of recombination caused by the element could generate putatively adaptive haplotype configurations. Whereas most of the data sets would be rejected as nonneutral under the standard neutral null model, only the data set for which there is strong external evidence in support of an adaptive transposition appears to be nonneutral under the more complex null model and in particular when demography is taken into account. High-frequency, derived mutations from a recently bottlenecked population, such as we study here, are of great interest to evolutionary genetics in the context of scans for adaptive events; we discuss the broader implications of our findings in this context.

Key Words: bottleneck • transposable element • coalescent simulation • partial selective sweep


    Introduction
 TOP
 Abstract
 Introduction
 Materials and Methods
 Results
 Discussion
 Appendix
 Acknowledgements
 References
 
Judging whether a mutation is adaptive is a question of central importance to evolutionary genetics. When an adaptive, or positively selected, mutation sweeps through a population, it leaves behind several traces, including reduced linked neutral variation (Maynard Smith and Haigh 1974Go; Kaplan et al. 1988Go, 1989Go), excessive low- and high-frequency polymorphisms (Tajima 1989aGo; Fay and Wu (2000)Go, and unusually long haplotypes associated with the beneficial mutant (Sabeti et al. 2002Go; Kim and Nielsen 2004Go; Voight et al. 2006Go); each of these signatures has been used to identify probable instances of positive selection. However, the concurrent operation of nonadaptive evolutionary forces, perhaps most potently the force of demographic change, can mimic one or more of the signatures of adaptation. This greatly complicates the task of inferring the action of positive selection (Kreitman 2000Go; Andolfatto and Przeworski 2001Go; Teshima et al. 2006Go; Thornton et al. 2007Go). In view of this difficulty, it is essential to specify the null hypothesis accurately to avoid spurious inference of positive selection.

Transposable elements are increasingly being recognized as agents of adaptive change (Brosius 2003Go; Brookfield 2004Go, 2005Go). Although characteristically under negative, or purifying, selection (Montgomery et al. 1987Go; Nuzhdin 1999Go; Petrov et al. 2003Go), several instances of likely adaptive transposition in which transposons confer insecticide resistance to their respective hosts have been discovered in recent years (Daborn et al. 2002Go; McCollum et al. 2002Go; Catania et al. 2004Go; Schlenke and Begun 2004Go; Aminetzach et al. 2005Go; Chung et al. 2007Go). Furthermore, the intriguing possibility is emerging that transposable elements might play a vital role in the evolution of genomic regulatory systems, providing the raw material for the action of regulatory control sequences (Davidson and Britten 1973Go; Brosius 2003Go; Bejerano et al. 2006Go). Because population genetic analyses of long haplotypes are important tools in identifying and characterizing adaptive transposition, it is important to form a suitable null hypothesis for this form of variation.

An accurate null hypothesis for a putatively adaptive transposable element clearly must account for demographic history. In addition, transposon-specific forces might be expected to affect the pattern of polymorphism near an insertion. First, there is considerable support for the notion that transposition into a functional genomic region can have deleterious effects and that chromosomal rearrangements caused by ectopic recombination due to nearby transposons may also be deleterious (Montgomery et al. 1987Go; Nuzhdin 1999Go; Petrov et al. 2003Go). If transposons are often selected against, then using a null model that assumes that they segregate neutrally may be inappropriate. Second, there is evidence to suggest that transposons disrupt nearby recombination in heterozygous individuals (Clark et al. 1986Go, 1988Go). Because the signatures of a selective sweep depend critically on recombination events (Maynard Smith and Haigh 1974Go; Hudson and Kaplan 1988Go; Przeworski 2002Go; Kim and Nielsen 2004Go), recombination suppression might also alter the expected pattern of polymorphism near a transposon.

Here, we employ coalescent simulations to explore how each of these 3 possibilities, namely demography, purifying selection, and recombination suppression, affects our interpretation of polymorphism data from 5 loci that flank transposable elements segregating in Drosophila melanogaster. All 5 data sets exhibit polymorphism patterns resembling the pattern expected under a partial selective sweep: few haplotypes and reduced nucleotide diversity linked to the element. One of these elements, a non-LTR retrotransposon from the Doc family called doc1420, was studied previously by our group (Aminetzach et al. 2005Go). This element was found to confer resistance to organophosphate pesticides by inserting into a coding exon of the gene CHKov1 and truncating its protein product. The other 4 loci are also non-LTR retrotransposons and are all from the BS family. They were selected for sequencing because, like doc1420, they are found at intermediate or high frequency in the non-African subpopulation of D. melanogaster. In contrast to doc1420, there is no evidence to suggest that any of the 4 BS elements are positively selected. None of the BS elements insert into a region known to be functional; although some possibility exists that one or more of the BS elements has experienced positive selection, we regard doc1420 as a positive control and the BS elements as negative controls. Our simulation results support this hypothesis, suggesting that, for elements ascertained at high frequency in a bottlenecked population, a polymorphism pattern consistent with a partial selective sweep is not unusual. When demography is incorporated into the null hypothesis, the doc1420 element retains a significant signature of positive selection, whereas the BS elements largely do not; purifying selection and recombination suppression both tend to strengthen this result. These simulations apply beyond transposable elements because high frequency–derived mutations are of particular interest in evolutionary studies; we consider the implications of our results in the broader context of genomic scans for adaptive variation.


    Materials and Methods
 TOP
 Abstract
 Introduction
 Materials and Methods
 Results
 Discussion
 Appendix
 Acknowledgements
 References
 
Drosophila Strains
All strains studied in all 5 data sets are from non-African populations of D. melanogaster. Strains a1, a3, a6, a8, a18, and a20 are from Ann Arbor, MI (courtesy of Greg Gibson). Strains wi1, wi1.5, wi3, wi4, wi9, wi15, wi18, wi31, wi35, wi41, wi45, wi68, wi69, wi77, wi83, wi98, wi137, and wi146 are from Wolfskill Orchard, Davis, CA (courtesy of Sergey Nuzhdin). Strains we4, we7, we10, we11, we13, we17, we21, we25, we28, we33, we37, we44, we 47, we50, we57, we60, we63, we67, we70, we75, we 80, we83, we88, and we91 are from Raleigh, NC (courtesy of Greg Gibson). Strains 5A, 11B, 21A, 20A, 31A, and 33A are from Countryside Winery, Blountville, TN (courtesy of Lev Yampolsky). Strains w2, w7, w9, w11, w22, w29, and w31 are from a (non-African) worldwide collection. Based on the sequenced genome of D. melanogaster, we designed primers to amplify the 5' and 3' flanking regions of the 4 BS insertions. Polymerase Chain Reaction products were sequenced by Genaissance Pharmaceuticals, Inc. (Genaissance, New Haven, CT). Details about the doc1420 strains are given in Aminetzach et al. (2005)Go; only the North American strains from that study appear here. The FlyBase IDs corresponding to the 4 BS insertions are BS3443:FBti0019604, BS3457:FBti0018879, BS3618:FBti0019133, and BS3730:FBti0019410.

Recombination rates were obtained from FlyBase (Grumbling and Strelets 2006Go). In calculating all statistics, gapped sites were ignored and sites with more than 2 alleles were ignored.

Coalescent Simulations
We simulated polymorphism data for a neutrally evolving, recombining locus of some known length in nucleotides and known sample size. This "neutral" locus is linked to a second, "segregating" locus intended to represent the transposable element, which may or may not evolve neutrally. The simulation derives from the coalescent-based model described in Kaplan et al. (1988)Go, Hudson and Kaplan (1988)Go, and Przeworski (2002)Go, in which the recombinant sample genealogy of the neutral locus is generated conditional on the frequency trajectory of a new selected allele at the segregating locus. For a review of this model, see Hein et al. (2005)Go. Briefly, under this model the population is partitioned into 2 subpopulations, and the sample is partitioned into 2 subsamples, defined by which of the 2 alleles at the segregating locus they possess. The simulation proceeds backwards in time until full coalescence of the sample, and during this time recombination may occur between the segregating locus and the neutral locus or within the neutral locus. Mutations are then placed on the sample genealogy according to the infinite sites model. In Kaplan et al. (1988)Go, Hudson and Kaplan (1988)Go, and Przeworski (2002)Go, the focus was on strongly advantageous alleles at the segregating locus, but here, we are interested in the trajectories of neutral or weakly selected alleles at the segregating locus, that is, the trajectories of the transposable elements. Because our interest is in weak selection, we describe simulation methods better suited to the case of a weakly selected variant than the methods in these earlier studies. After introducing the specific demographic histories we are considering, we describe extensions to this model that allow ancestral purifying selection and recombination suppression to be incorporated.

Demographic Models
We consider 3 demographic scenarios for the history of the non-African subpopulation from which our 5 data sets derive. The first is simply a randomly mating population of constant effective size Ne = 106, corresponding to the standard neutral null hypothesis. The second demographic scenario derives from Thornton and Andolfatto (2006)Go (henceforth "TA"). In this scenario, the non-African subpopulation emigrates from Africa at a time tFormula. From that time until a time tFormula, its size is instantaneously reduced to a fraction fTA its ancestral African size, after which it instantaneously grows to its present size, presumed equal to the ancestral size (fig. 1a). We denote the identical prebottleneck and postbottleneck sizes under this scenario NFormula. Thornton and Andolfatto (2006)Go obtained estimates for several recombination regimes, and we used their point estimate corresponding to {rho}/{theta} = 10, where {rho} = 4Nr is the population recombination rate, as is appropriate to the highly recombining regions in which the 5 loci we study are found. Their point estimate is tFormula = 0.022x4NFormula, tFormula = 0.0042x4NFormula, and fTA = 0.029.


Figure 1
View larger version (8K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
FIG. 1.— The 2 bottleneck scenarios explored. The population size is shown as a function of time. Part (a) corresponds to the scenario of Thornton and Andolfatto (2006)Go, and part (b) corresponds to the scenario of Li and Stephan (2006)Go.

 
The third demographic scenario derives from Li and Stephan (2006)Go (henceforth "LS"). The difference between this and the TA bottleneck scenario is that the prebottleneck African population is presumed to have undergone an expansion in size. This model requires 3 parameters beyond fLS, NFormula, tFormula, and tFormula: the respective pre-expansion and postexpansion population sizes NFormula and NFormula, and a time of expansion tFormula (cf., fig. 1b). Their point estimate is fLS = 2200/1.075 x 106 = 0.002, tFormula = 60000 y, tFormula = 15800 y,tFormula = 15460 y,NFormula = 1.6xNFormula, and NFormula = 8xNFormula. We converted the time units from years into coalescent units of 4 Formula generations, assuming, as do both TA and LS, 10 generations per year, yielding Formula Formula Formula Both bottleneck studies explicitly provide a modern non-African population size, which depends on an assumed mutation rate of ~1.5 x 10–9 bp–1 gen–1. These are Formula = 2.4 x 106 and Formula = 1.075 x 106. TA assumed equal male and female effective population sizes, and because the estimates derive from X chromosomal data, they multiplied their estimate of effective population size by 4/3. LS did not perform this multiplication, although they also used X chromosomal data, so the ratio of the non-African population size estimates is Formula To ease comparison of the 2 bottleneck studies, we use this size ratio to rescale the population size change times for LS in terms of Formula rather than Formula generations are then: Formula Formula Formula For computational convenience, in our simulations we used Formula

Simulation of Transposable Element Trajectories
We use 3 distinct techniques to produce the transposable element frequency trajectories at the segregating locus used in this study: 1) forward simulation by binomial sampling to a prespecified frequency, a technique we introduce here; 2) simple forward simulation by binomial sampling; and 3) reverse simulation using Slatkin's importance sampling method (Slatkin 2001Go). By "trajectory," we mean the full history of an allele's frequency from the origin of the mutation to the present.

To simulate transposable element trajectories under the demographic scenario of constant population size, we use only the first of these 3 techniques. First, to account for the ascertainment of the element in k of n strains sampled, we sample a frequency x isin [1/(2N), 1 – 1/(2N)] from the equilibrium neutral frequency spectrum 1/(x log(2N – 1)) dx. If a random draw from Binomial(n, x) equals k, we accept x. If not, we discard x and repeat the procedure until this ascertainment condition is met.

Next, we wish to sample an element frequency trajectory from the distribution of all random trajectories from the Wright–Fisher process which terminate at frequency x. We assume a panmictic, constant-sized population of N diploid individuals, with genotype fitnesses 1 and 1 + s for the 2 homozygotes and fitness 1 + hs for the heterozygote, for some fixed selection coefficient s and dominance effect h. In this paper, we assume h = 1/2. To sample from this distribution, we generate a trajectory by iterating forward by the Wright–Fisher process from initial frequency 1/(2N) and continuing until either fixation or loss occurs (Ewens 2004Go). If the element is lost before frequency x is reached at all, that trajectory is discarded. As there may be multiple occasions at which x is reached, some choice must be made about which of these occasions will represent the end of the trajectory. We truncate the trajectory at the last generation at which frequency x was reached. A demonstration that this procedure generates allele trajectories correctly may be found in the Appendix.

To generate element frequency trajectories for the demographic scenario specified in TA, we use a combination of the truncation procedure and simple binomial sampling. First, we use simple forward iteration to generate the postbottleneck segment of the element trajectory that runs from the beginning of the bottleneck to the present. We assume that the transposable element was at transposition–drift equilibrium when the bottleneck began, and sample a frequency x isin [1/(2N), 1 – 1/(2N)] from 1/(x log(2N – 1)) dx. This implicitly assumes that the element had entered the population by the time the bottleneck began, discussed further below. The value of N used is the ancestral African population size immediately before the bottleneck begins (cf., fig. 1a). We iterate from x forward in time through the bottleneck by binomial sampling until either fixation or loss occurs, or the trajectory reaches the present time. If the element becomes fixed or lost before the present, we discard the trajectory and begin anew. For those elements that are not fixed or lost, which reach the present time at some frequency x', we then account for ascertainment of the element in k of n strains by drawing from Binomial(n, x'); if the drawn value is equal to k, we accept the trajectory and if not the process is begun anew.

The prebottleneck segment of the trajectory, that ends at frequency x, is then generated according to the truncation procedure described above for the constant-sized population. The prebottleneck and postbottleneck segments of the trajectory are then concatenated together to form the complete trajectory of the transposable element from insertion to the present day.

To generate element trajectories for the demographic scenario specified in LS, we also generate the element trajectory in 2 stages. We generate the postbottleneck segment of the trajectory exactly as described above, except that the bottleneck parameters differ as detailed above and in figure 1. To generate the prebottleneck segment of the trajectory, we use the importance sampling method of Slatkin (2001)Go. This is done because our truncation method assumes constant population size, whereas this bottleneck scenario specifies a population size expansion in the prebottleneck population. We generate proposal trajectories backwards in time, starting from frequency x, that is, the frequency of the postbottleneck segment at the beginning of the bottleneck, according to the reverse process given in Slatkin (2001)Go. The prebottleneck segment of the trajectory thus produced is assigned an importance weight w, which is the ratio of the segment's probability under the reverse process to its probability under the forward process (Slatkin 2001)Go. When the mean and confidence intervals (CIs) are later computed over the distributions of summary statistics of these polymorphism data sets generated using this method, the summary statistics are weighted by these weights. As above, the prebottleneck and postbottleneck segments of the trajectory are concatenated to form a complete trajectory.

Genealogical Simulation
Recombinant genealogies were generated conditional on the transposable element trajectories described above according to the algorithm described in Hudson (1993)Go. For each of the 5 data sets, respectively, genealogies were generated conditional on the local recombination rate, the length of the locus, and the position of the transposable element within the locus. Then, the same number of segregating sites as observed in the respective sample was distributed over the resulting genealogy according to the infinite sites model, as described in Hudson (1993)Go.

Modeling Prebottleneck Purifying Selection
To explore the effect of purifying selection against transposable elements prior to the bottleneck, we generated element trajectories which assume a population selection coefficient of Ns = –4 until time tb, and Ns = 0 from time tb until the present. The choice of Ns = –4 stems from a maximum likelihood–based estimate of the selection coefficient, which utilizes BS element frequency data from both non-African and African samples (unpublished results). We note that the doc1420 element we study here is not in the BS family and was likely under considerably stronger purifying selection than Ns = –4 prior to the bottleneck (Petrov et al. 2003Go). At the present, we lack sufficient data to obtain a parallel estimate for the Doc family and proceed with Ns = –4. We generated the postbottleneck segment of each trajectory as described above, except that the frequency x from which the postbottleneck segment begins was drawn from the equilibrium transposition–selection distribution obtained by diffusion approximation with Ns = –4 (eq. 5.48, Ewens 2004Go):


Formula (1)

Once the postbottleneck trajectory segment was obtained, the prebottleneck segment was generated by the truncation method we introduced here for the TA demographic scenario. For the LS demographic scenario, the prebottleneck trajectory segment was generated using the method of Slatkin (2001)Go with Ns = +4.

Recombination Suppression
Recombination suppression was incorporated into the coalescent simulations by modifying the probabilities of recombination between the TE-bearing (TE) and non-TE–bearing (non-TE) subpopulations. We assume that the sample is so small compared with the population size that 2 chromosomes from the sample never recombine with one another. There are then 4 distinct types of recombination that can occur during the simulation. A TE chromosome from the sample may recombine with a TE chromosome from the population or a non-TE chromosome from the sample may recombine with a non-TE chromosome from the population; these within-class events are the first 2 types of recombination. There are also 2 cross-class types of recombination: a TE chromosome from the sample may recombine with a non-TE chromosome from the population or a non-TE chromosome from the sample may recombine with a TE chromosome from the population. The probability of recombination per generation is then 4LcnjNj, where nj and Nj are the sizes of the respective subsample and subpopulation, j indicates whether the TE or non-TE subsample/subpopulation is used, L is the number of nucleotide links in the sample at which recombination may occur, and c is the per-link recombination rate (cf., Przeworski 2002Go).

We consider 2 models of recombination suppression. In the first model, we assume that recombination does not occur in individuals heterozygous for the transposable element. Here, we simply disallow any recombination in the 2 classes that involve crossing-over between a TE and a non-TE chromosome by setting their probabilities to zero. In the second model, we assume that recombination cannot occur within a radius of fixed size surrounding the element, in individuals heterozygous for the element. To model this, we disallow recombination breakpoints at sites falling inside the restricted region surrounding the element in the calculation of L for each chromosome in the sample. If a cross-class recombination event then does occur, we ensure that the crossover location is chosen uniformly from among the links outside the restricted region.

Statistical Tests of Neutrality
For each simulated data set, we calculated a number of standard statistics, including {theta}W (Watterson 1975Go) and {pi} (Tajima 1983Go), Tajima's D (Tajima 1989aGo), the number of haplotypes H, and the haplotype diversity hd (Nei 1987Go). Each statistic was computed over the entire sample and then separately for both the TE and non-TE subsamples.

Two additional statistics specifically designed to measure differences between the TE and non-TE subsamples were calculated. These include fTE, defined as {pi}TE/({pi}TE + {pi}non-TE), a measure we introduce to quantify the relative reduction in diversity. The second statistic is a modified version of iHS (Voight et al. 2006Go). As originally defined, iHS is the logarithm of iHHancestral/iHHderived, where iHH is the integral of haplotype homozygosity, measured from the putatively adaptive focal site, for the subsamples linked, respectively, to the ancestral and derived states of the focal site. This integral is taken over the region delimited by the positions at which the haplotype homozygosity drops to 5% on the left side of the focal site and at which haplotype homozygosity drops to 5% on the right side of the focal site. In some of the 5 data sets, the flanking regions do not extend far enough away from the focal site (here, the transposable element) for the haplotype homozygosity to drop to 5%. We computed the respective iHH values to the end of the sequenced region, regardless of the value of haplotype homozygosity there. As in Voight et al. (2006)Go, we standardized iHS, according to iHS = (iHS' – E[iHS'])/{sigma}[iHS'], where iHS' is the unstandardized iHS value, and the expectation and standard deviation are computed from a set of simulations run with parameters identical to the pertinent simulation, but under the standard neutral null model.

To determine statistical significance, we performed 1-sided tests, defining the P value of each statistic from each distribution as the fraction of simulated values more extreme than the observed value. The null hypothesis for each of D, H, hd, fTE, and iHS is that they are less than their expectation under neutrality. Because we are performing numerous statistical tests on numerous loci, we must make a correction for multiple comparisons. We use the false discovery rate (FDR) measure, as implemented in the qvalue package (Storey and Tibshirani 2003Go), to account for this and evaluate significance at the 5% and 1% FDR thresholds.

Implementation Notes
The trajectory and genealogy simulation package, called combesce, was written in java; the trajectory implementation was verified where possible against test cases derived from Slatkin (2001)Go, and the genealogy implementation was verified where possible against Hudson's ms Hudson (2002)Go. Population genetic statistics were computed using a package called pgTools written by the authors in python; their implementation was verified by comparing the output to that of DnaSP (Rozas et al. 2003Go) on constructed and actual data sets. Postprocessing was performed with scripts written in R and python. The simulations were run on a cluster of 24 dual-processor 2GHz Xeon computers at the Stanford Genome Technology Center. All software is available online at http://petrov.stanford.edu.


    Results
 TOP
 Abstract
 Introduction
 Materials and Methods
 Results
 Discussion
 Appendix
 Acknowledgements
 References
 
Summary of 5 Data Sets
The haplotype configurations of the 5 loci are shown in figures 2–GoGoGoGo and are statistically summarized in table 1. Strains with the insertion are separated from those without by a horizontal black line; the presence of the transposable element is indicated by a filled box in the midst of a sequence. Note the visual similarity of the 5 haplotype configurations: each configuration bears a strong resemblance to the pattern expected from a partial selective sweep (Voight et al. 2006Go). Namely, there tend to be fewer haplotypes and lower nucleotide diversity in the TE subsample than in the non-TE subsample. Recombination events are evident in several of the loci, for example, in locus BS3443 to the left side and between 222 and 291 nucleotides from the insertion in strains a18 and wi15 (fig. 3).


Figure 2
View larger version (17K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
FIG. 2.— Locus doc1420. A column is shown at the position of the transposable element insertion. The column is shaded only for those elements that have the insertion. A horizontal rule divides the insertion-bearing sequences from the sequences that do no bear insertions.

 

Figure 3
View larger version (10K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
FIG. 3.— Locus BS3443. A column is shown at the position of the transposable element insertion. The column is shaded only for those elements that have the insertion. A horizontal rule divides the insertion-bearing sequences from the noninsertion-bearing sequences.

 

Figure 4
View larger version (17K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
FIG. 4.— Locus BS3457. A column is shown at the position of the transposable element insertion. The column is shaded only for those elements that have the insertion. A horizontal rule divides the insertion-bearing sequences from the noninsertion-bearing sequences.

 

Figure 5
View larger version (10K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
FIG. 5.— Locus BS3618. A column is shown at the position of the transposable element insertion. The column is shaded only for those elements that have the insertion. A horizontal rule divides the insertion-bearing sequences from the noninsertion-bearing sequences.

 

Figure 6
View larger version (10K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
FIG. 6.— Locus BS3730. A column is shown at the position of the transposable element insertion. The column is shaded only for those elements that have the insertion. A horizontal rule divides the insertion-bearing sequences from the sequences that do not bear insertions.

 

View this table:
[in this window]
[in a new window]

 
Table 1 Summary Statistics for the 5 Loci

 
Standard Neutral Null Model Simulations
We wish to explore how our conclusions regarding the neutrality of 5 polymorphism patterns in figures 2–GoGoGoGo might change as successively more complex null hypotheses are considered. To do so, we compare the distributions of several statistical measures of polymorphism obtained by simulation under a given null hypothesis to the values of these statistics observed from the respective loci. We begin by characterizing the distributions expected under the "standard neutral" null hypothesis (Rosenberg and Nordborg 2002Go), which posits a panmictic population of constant size N = 106 diploid individuals. Under this model, the element drifts neutrally from initial frequency 1/(2N) to its contemporary frequency, and the pattern of neutral polymorphism in the region flanking the element is simulated conditional on the element's frequency trajectory (Materials and Methods). The results of these simulations are summarized in table 2.


View this table:
[in this window]
[in a new window]

 
Table 2 FDR-Corrected Significance of Several Neutrality Tests for Each Transposable Element, under the Standard Neutral Null, and the Demographic Models of TA and LS

 
Among the summary statistics presented in table 2, iHS is likely to be the most informative in judging whether a putative partial selective sweep is consistent with neutrality. The iHS focuses on the expected signature of a partial selective sweep, namely slow decay of homozygosity away from the putatively selected element (Voight et al. 2006Go; Materials and Methods). Voight et al. (2006)Go showed by simulation that iHS was the most powerful test of neutrality among several related tests by comparison to the alternative hypothesis of a partial selective sweep, for a panmictic population of constant size. To capture the extent of reduction in diversity between the TE and non-TE subsamples, we calculate an additional between-subsample statistic, fTE = {pi}TE/({pi}TE + {pi}non-TE), the proportion of diversity among the 2 subsamples accounted for by the TE subsample. We present further summaries, including Tajima's D and the number of distinct haplotypes, H, both overall and within each subsample. These statistics may themselves be used as tests of selective neutrality (Nielsen 2005Go) and are provided to form a more complete picture of the expected pattern of polymorphism than would be seen by iHS alone.

Based on iHS, among the 5 data sets, only locus BS3457 is consistent with the standard neutral null hypothesis, after a FDR correction for multiple hypothesis testing is made (Storey and Tibshirani 2003Go). The iHS is significant at the 1% FDR for doc1420 and 2 of the BS elements and at the 5% FDR for a third BS element. That the region flanking BS3457 fails to show significance under iHS may be due to its short length, 584 bp, and few segregating sites, 9. The other BS loci are {approx}1500 bp in length, and the region flanking the doc1420 insertion has length 3400 bp (table 1). Most of the loci have fewer haplotypes and more low frequency mutations linked to the element than expected under the constant population size null model, as indicated by small values of HTE and negative values of Tajima's DTE. However, only a few significant tests occur within the non-TE subsample, which is consistent with the visual impression from the haplotype configurations of a partial selective sweep. Further, the expected value of fTE is close to 1/2 for all 5 loci, but the observed fTE is substantially lower than this for all loci except BS3457. If the BS elements are segregating neutrally, this suggests that the constant population size null model is either not correctly predicting a reduction in diversity in the TE subsample or underestimating the variance in fTE.

Evaluated over the entire sample, Tajima's D is not significant according to a 1-sided test (H0: D > 0) for any of the loci under the constant population size null model. One of these loci, BS3457, has a large and positive Tajima's D of 2.57, which clearly would be rejected under this null hypothesis by a 2-sided test. Although the observed iHS values of each of the 5 loci are negative, the Tajima's D values seem to follow no particular pattern, taking both positive and negative values. After a recent selective sweep, Tajima's D is expected to take sharply negative values, driven by excess low- and high-frequency polymorphisms. During a sweep, however, the expectation is less clear because a beneficial allele at intermediate frequency would cause intermediate frequency linked neutral polymorphisms, tending to increase Tajima's D. It is thus not necessarily surprising that Tajima's D overall is consistent with the constant population size null model.

Lastly, the overall number of haplotypes, H, is smaller than expected under the constant population size null model, except at of locus BS3730. The number of haplotypes in the non-TE subsample, Hnon-TE, appears to be consistent with a null hypothesis of constant population size, but HTE is found to be significant in each case for which H overall is significant. If we again assume that the BS elements segregate neutrally, this collection of results suggests that the constant population size null model either overestimates HTE or underestimates its variance.

The Effects of a Bottleneck
If we were to test for the neutrality of these 5 elements, neutrality would apparently be rejected under the standard neutral null model. However, a number of studies have shown that polymorphism in D. melanogaster does not accord with this model: nucleotide diversity is consistently much lower in non-African populations than in African populations (Begun and Aquadro 1993Go; Glinka et al. 2003Go; Haddrill et al. 2005Go). The favored alternative is a bottleneck associated with an emigration event from the species’ ancestral home in Africa 10–15 kya (David and Capy 1988Go; Lachaise et al. 1988Go). Such a demographic scenario can yield spurious results in tests of neutrality for recently completed selective sweeps because both selective sweeps and population expansion from a bottleneck produce excess low-frequency polymorphisms (Tajima 1989bGo). Could a recent bottleneck also produce a pattern resembling a partial selective sweep?

We modified the standard neutral null hypothesis to account for population size change in D. melanogaster, by in turn incorporating the maximum likelihood bottleneck parametrizations from each of 2 studies from the recent literature (Li and Stephan 2006Go; Thornton and Andolfatto 2006Go). We made one further assumption that the element transposed prior to the beginning of the bottleneck, that is, before the contemporary non-African population had emigrated from Africa. For 3 of the loci, doc1420, BS3618, and BS3730, the element is observed to segregate at low frequency in sub-Saharan African populations (doc1420: 2/8, BS3618: 7/39, and BS3730: 4/40), so this assumption appears to be justified for them. For the other 2 loci, BS3443 and BS3457, we do not observe any elements in the sub-Saharan sample (BS3443: 0/34 and BS3457: 0/43). We assume nevertheless that the elements were present in the African population at the beginning of the bottleneck for consistency with the other simulations. This assumption is not unreasonable: It is quite possible that these elements are segregating in African populations that we did not sample (Schöfl et al. 2005Go).

When either of the 2 bottleneck scenarios are incorporated, the significance pattern of the neutrality tests changes substantially (table 2). Where all 5 of the loci, except the short 3457, would have been found significant under the standard neutral null model according to iHS, under the scenario of LS only doc1420 remains strongly significant (table 2). Locus 3618 is the only BS element that still appears to be significant according to iHS, but only marginally so. The distributions of iHS change from the standard neutral case; the variances generally become larger, and the means depart from zero. For the high-frequency elements doc1420 (74.4%) and BS3730 (60.0%), the expectation of iHS becomes sharply negative, whereas for the intermediate frequency elements BS3618 (47.0%), BS3443 (47.4%), and BS3457 (57.1%), the mean iHS is close to zero.

With several marginally significant exceptions, the values of Tajima's D and H for the BS elements overall and within both the TE and non-TE subsamples are within the null hypothesis CIs under the TA bottleneck model (table 2). Where the mean Tajima's D was near zero for all 5 elements under the standard neutral null model, it ranges from 1.1 to 1.5 under the bottleneck null model, and its variance increases markedly; in fact the sharply positive Tajima's D of 2.57 observed in BS3457 now falls within the 95% CI. The mean number of haplotypes overall declines under the bottleneck, and the variance in the number of haplotypes increases. Although similar changes in the distribution of the number of haplotypes occur in both the TE and non-TE subsamples, the distributions of Tajima's D within the TE and non-TE subsamples are quite different than the distribution of Tajima's D overall. Like Tajima's D overall, the variance in the subsamples’ Tajima's D increases substantially, but the means are close to zero, not positive.

Under the alternative bottleneck scenario of LS, the pattern of significance is qualitatively similar to that found under the TA model (table 2). Again, on the basis of iHS, we would reject the null hypothesis for locus doc1420 but would not reject several of the BS elements. For the high-frequency elements doc1420 and BS3730, the distributions of iHS are shifted toward negative values relative to the standard neutral null model as under the TA model, and the variance of iHS becomes larger than under the standard neutral null model. To a lesser extent than under the TA model, the distribution of overall Tajima's D shifts toward positive values and the number of haplotypes overall decreases relative to the standard neutral null model.

Thus, when we incorporate demographic change, only doc1420 stands out as particularly unusual among the 5 data sets. Some of this appears to come from the increased variance in iHS, but for the high-frequency elements, the expected iHS drops sharply, in the direction expected for a selective sweep. We note that the expected fTE stays close to 1/2 for both bottleneck scenarios, although there are fewer significant results, apparently due the increased variance in fTE.

The Effects of Purifying Selection
There is much evidence to suggest that transposable elements are subject to purifying selection, for a number of reasons, including the deleterious effects of insertion, ectopic recombination, and misexpression (Montgomery et al. 1987Go; Nuzhdin 1999Go; Petrov et al. 2003Go). If transposable elements were under purifying selection in Africa, then those elements we observe in North America today would have segregated at lower frequencies at the time the bottleneck began and thus would have entered the bottleneck at lower frequencies on average than if they evolved neutrally. This relative youth might be expected to change the genealogies of neutral variants linked to the element. To explore this effect, we extended the null hypothesis for the 2 bottleneck simulations, adding the assumption that the transposable elements were subject to a selection coefficient of Ns = –4 prior to the beginning of the bottleneck. The particular value of Ns derives from a related study of the entire BS family (Material and Methods).

Incorporating purifying selection has a small but consistent effect on the null distributions. The distribution of Tajima's D shifts toward negative values relative to the simulations without purifying selection, although the means remain positive (table 3). The distributions of iHS consistently shift toward negative values when purifying selection is included (table 3). The magnitude of these shifts is not great, and does not alter our assessment of significance for any of the loci, but we note that the change in iHS is in the direction one expects under positive selection.


View this table:
[in this window]
[in a new window]

 
Table 3 Effects of Purifying Selection on Tajima's D and iHS

 
The Effects of Recombination Suppression
Recombination is an important force in shaping the pattern of polymorphism. There is evidence to suggest that recombination is suppressed in individuals heterozygous for transposable element insertions: Clark et al. (1986Go, 1988Go) found that the presence of an 8.0-kb B104 insertion reduced by roughly half the recombination rate observed between the insertion site and a nearby marker 3 kb away, at the rosy locus in D. melanogaster. They also found that the recombination rate was reduced to roughly a quarter its background value, over a different interval, also of 3 kb, that separated 7-kb calypso and 8.5-kb B104 insertions, at the rosy locus. If recombination were suppressed in heterozygotes for the 5 polymorphic elements we study here, this might contribute to the unusual haplotype configurations we observe. To explore this possibility, we consider 2 models of recombination suppression and apply these to each of the demographic scenarios we have considered. In the first model, recombination is suppressed completely in heterozygotes over the full length of the flanking region and recombination proceeds at the normal rate in homozygous individuals (Materials and Methods). In the second model, recombination is suppressed completely within a 250-bp radius about the insertion site in heterozygotes and proceeds as usual outside this radius. The radius 250 bp was chosen using rough calculations based on the data from Clark et al. (1986Go, 1988Go) and from visual inspection of the locations of crossover events in figures 2–6GoGoGoGo.

Recombination suppression substantially affects the patterns of polymorphism we observe. For the case in which crossovers are suppressed completely over the full length of each of the 5 loci, the distribution of iHS does shift toward negative values, and its variance increases, for all demographic scenarios (fig. 7). Where the null hypothesis was rejected for some of the BS loci under normal recombination, the CIs now overlap the observed values of iHS for all BS loci under the standard neutral null model and the bottleneck null models. For locus doc1420, the iHS CIs do not overlap the observed value but come closer to the observed value than under normal recombination.


Figure 7
View larger version (25K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
FIG. 7.— Comparison of CIs for Tajima's D, H, and iHS under different levels of recombination suppression. The gray, blue, and green horizontal lines represent the span between the 2.5% and 97.5% CI for the nonsuppressed, fully suppressed, and 250-bp radius of suppression models described in the text. Within each of the 3 colors, the 5 horizontal lines indicate the demographic scenario; from top to bottom, these are 1) standard neutral null, 2) Thornton and Andolfatto (2006)Go bottleneck scenario with prebottleneck Ns = 0, 3) Thornton and Andolfatto (2006)Go bottleneck scenario with prebottleneck Ns = – 4, 4) Li and Stephan (2006)Go bottleneck scenario with prebottleneck Ns = 0, and 5) Li and Stephan (2006)Go bottleneck scenario with prebottleneck Ns = –4. The black vertical line drawn through each statistic/locus pair corresponds to the location of the weighted mean value of the statistic under the standard neutral null under normal recombination. The red vertical line drawn through each statistic/locus pair corresponds to the location of the observed value of the statistic for that locus.

 
Complete recombination suppression amplifies the trends we have noted in Tajima's D and the number of haplotypes, H, in the presence of a bottleneck. For all loci, the distributions of Tajima's D are shifted toward positive values under a bottleneck, and the magnitude of this shift appears to be much greater than under normal recombination. The number of haplotypes overall is substantially smaller under complete suppression than under normal recombination. Intriguingly, under either bottleneck scenario, the number of haplotypes falls so low that even the seemingly low H = 15 in a sample of 43 for doc1420 does not appear to be unusual.

Under the second, intermediate form of recombination suppression in which crossovers could not occur within 250 bp of the insertion, we observe distributions of the statistics which are themselves intermediate to those observed under normal recombination and complete suppression (fig. 7). If the BS elements serve as a neutral reference, then it appears that the recombination suppression models are more consistent with the observed values of the statistics than is the model with normal recombination. No combination of demographic scenario and recombination model is consistent with all of the BS loci, but complete recombination suppression appears to result in an overly positive Tajima's D distribution and too few haplotypes. Recombination suppression within a 250-bp radius seems to accord better than the normal recombination model with the number of haplotypes we observe for the BS data sets, but the normal recombination model is more consistent with the observed values of Tajima's D.

Both models of recombination suppression result in reduced mean fTE, in addition to increased variance (table 4). This effect is seen both for the standard neutral null simulations and for the bottleneck simulations. The reduction is more pronounced when the element is assumed to have experienced purifying selection in its ancestral population. For the highest frequency element, doc1420, the expected fTE appears to increase somewhat relative to no recombination suppression for ancestral Ns = 0, but if Ns = –4 is assumed expected fTE for doc1420 declines below the nonsuppressed mean. If recombination is suppressed only within a 250-bp radius, the values of fTE are intermediate to those found for complete suppression.


View this table:
[in this window]
[in a new window]

 
Table 4 Effects of Recombination Restriction and Purifying Selection on fTE under Several Demographic Scenarios

 
The Effects of Reduction in Bottleneck Intensity
The 3 extensions to the standard neutral null model that we have explored each tend to make the 5 data sets appear less unexpected by comparison to the standard neutral null model. Only for the simulations of complete recombination suppression do we see a substantial reduction in diversity in the TE subsample relative to the non-TE subsample, measured by fTE, as we observe for each of the putatively neutral BS elements. However, a less intense bottleneck might also yield such a reduction in fTE. If the bottleneck were briefer than those we have considered, and reduced the population size by the same extent, then an element entering the bottleneck at low frequency and found at high frequency at the present day would have less time to traverse between these frequencies, and the trajectory of the element would more closely resemble that expected under positive selection. Our simulations so far have used the bottleneck scenario point estimates from the studies of Thornton and Andolfatto (2006)Go and Li and Stephan (2006)Go. As both studies note, there is considerable uncertainty around these point estimates, and in particular there is support in both studies for bottlenecks of lesser intensity.

To explore this possibility, we conducted a set of bottleneck simulations in which we varied the duration of the bottleneck, using sample parameters similar to those of the 5 data sets (Materials and Methods; fig. 8). We used a smaller population size N = 105 to reduce computation time and assumed that the element entered the bottleneck at frequency 5% for consistency across simulations, and to accord with the low element frequencies observed for the BS and Doc families in Africa (Petrov et al. 2003Go). As the bottleneck becomes shorter, the mean values of both fTE and iHS drop precipitously, and for the briefest bottlenecks we consider, namely 0.1–0.3 coalescent units of 4N generations, come close to the observed values of these statistics for the BS elements. For comparison, 0.62 and 0.39 coalescent units have elapsed by the beginning of the bottleneck scenarios of TA and LS, respectively. The 4 BS loci we have studied are too few to attempt to infer the most likely bottleneck scenario based on these data, but these illustrative simulations suggest that a briefer bottleneck could indeed yield the putative partial selective sweep signatures we observe.


Figure 8
View larger version (7K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
FIG. 8.— Effect of bottleneck duration on fTE and iHS. Plotted are the mean values of fTE and iHS from a bottleneck scenario similar to that of Thornton and Andolfatto (2006)Go, except that the population continues at its bottlenecked size until the present. We assumed that the frequency of the transposable element at the start of the bottleneck is 5% and that during the bottleneck the population is 0.05 of its normal size. The bottleneck duration is specified in coalescent units of 4Ne generations and is the time before the present at which the bottleneck begins. The element is present in 40 of 50 individuals and is centered within a neutrally evolving flanking region of 2000 bp. There are 75 segregating sites, the recombination rate is 3.0 cM/Mb and the prebottleneck population size is Ne = 105 individuals. Each point is based on 104 replicates.

 

    Discussion
 TOP
 Abstract
 Introduction
 Materials and Methods
 Results
 Discussion
 Appendix
 Acknowledgements
 References
 
The initial motivation for this study was to understand why the polymorphism pattern of the putatively adaptive doc1420 element resembled that of the 4 putatively neutral BS elements so closely. Did an adaptive transposition underlie the seeming partial selective sweep at each locus? Alternatively, might these patterns have arisen simply as a consequence of the recent demographic history of D. melanogaster or as a result of other nonadaptive departures from the standard neutral model?

Our results suggest that the high-frequency doc1420 insertion, which appears to confer pesticide resistance (Aminetzach et al. 2005Go), has a haplotype structure that would still be considered nonneutral when the demographic history of D. melanogaster is taken into account. The 4 intermediate or high-frequency BS elements we studied, which have haplotype structures resembling doc1420, would be rejected as nonneutral on the basis of iHS under the assumption of constant population size, but in most cases would not be rejected under the 2 bottleneck models we considered. These conclusions do not change qualitatively if the element is assumed to have been under weak purifying selection in the ancestral population, but iHS shifts in the direction expected under positive selection. Intriguingly, recombination suppression in heterozygotes for the insertion tends to differentially reduce diversity in haplotypes linked to the element. Together, these extensions to the null model make it appear unlikely that the BS elements are positively selected because the patterns of polymorphism we observe may largely be explained under neutrality. After considering our simulation results in a genealogical context, we discuss further implications of our findings.

Genealogies of Regions Flanking High-Frequency Elements
Under both bottleneck scenarios, the expected Tajima's D overall increased sharply, the number of haplotypes declined and, for high-frequency elements, iHS decreased. The variance of each of these statistics increased substantially as well. These changes may be understood in a coalescent framework. The bottleneck scenarios imply that a comparatively great length of time, in coalescent units of 4Ne generations, elapses between the start of the bottleneck and the present day. Namely, 0.62 x 4Ne generations and 0.39 x 4Ne generations elapse during this period, respectively, under TA and LS. Here, Ne is understood to mean the size of the D. melanogaster population at the present time; the value of Ne inferred by TA is {approx}1.7 times larger than the value inferred by LS. Because the sample overall is expected to have coalesced to 2 individuals by 0.5 x 4Ne generations, we expect that both the TE and non-TE subsamples will often have coalesced to a single individual, retrospectively, by the start of the bottleneck under either of the demographic scenarios. This is borne out, for example, by the observation that for locus doc1420, the average sizes of the TE and non-TE subsamples at the start of the bottleneck are 1.21 and 1.13 under TA and 1.57 and 1.25 under LS. These estimates were obtained from 103 simulated genealogies, per demographic scenario, corresponding to the doc1420 dataset. This implies that the characteristic genealogical shape is very different under a bottleneck than under constant population size; the TE and non-TE subsamples coalesce quickly, typically by the beginning of the bottleneck, then find their common ancestor on average at 2NA + tp generations earlier, where tp is the age of the element at the time the bottleneck commences and NA is the ancestral population size for the given demographic scenario. Coalescence cannot occur between the TE and non-TE subsamples until the element has gone from the population, and from this time the expected time to final coalescence is 2NA generations. The characteristic genealogical shape under a bottleneck should then be 2 short subgenealogies, of height ~16000 y, connected by long internal branches, each of much greater height ~2 x 106 y, leading to the common ancestor of the sample. Thus, we expect the fraction of the total genealogical depth accounted for by the TE and non-TE subgenealogies to be much less under the bottleneck scenarios than under the standard neutral null model. This is the case again for locus doc1420, under the standard neutral model this fraction is 65 ± 14%, but under TA, the fraction is 21 ± 15%, and under LS, it is 38 ± 18%.

Given this peculiar genealogical shape, we expect most mutations to fall on the long internal branches. This should produce an excess of intermediate frequency polymorphisms, leading to positive Tajima's D values for the sample overall, as we observe for the bottleneck simulations. Because mutations falling on the internal branches will tend to share the same segregation pattern, we should also expect the number of haplotypes overall to be reduced relative to the standard neutral null model (Wall 1999Go), as is consistent with our simulation results. We would further expect the variance of both statistics to increase relative to the standard neutral null model, because there will be an ensemble of genealogies, in some of which both the TE and non-TE subgenealogies coalesce before the beginning of the bottleneck, and in others several lineages persist beyond the beginning of the bottleneck.

This change in genealogical shape due to the bottleneck also shifts the distribution of iHS, our primary measure of positive selection, toward negative values, and apparently to a greater extent for high-frequency elements. Because negative values of iHS are a signature of positive selection, such a shift implies that a bottleneck may result in spurious inference of positive selection. We can again use the coalescent framework to understand why iHS declines under a bottleneck. Unlike after a selective sweep (Barton 1998Go; Kaplan et al. 1988Go; Przeworski 2002Go), there is no great discrepancy in the depths of the TE and non-TE subgenealogies because both subsamples are forced to coalesce quickly as a result of the bottleneck. This means that the number of segregating sites should be similar in the TE and non-TE subsamples, as mean values of fTE near 1/2 for all loci under all simulated scenarios attest, and so the reduction in iHS must not arise from a simple reduction of variation in the TE subsample.

One plausible explanation is that for high-frequency elements, where by definition the size of the TE subsample is substantially larger than the size of the non-TE subsample, the bottleneck is effectively more severe for the TE subsample than the non-TE subsample as follows. A bottleneck increases linkage disequilibrium (LD) relative to the standard neutral model; the more severe the bottleneck, whether by greater duration or greater reduction in population size, the greater the increase in LD (McVean 2002Go). If the TE subsample experiences a bottleneck of greater severity than the non-TE subsample, we would expect LD to increase relatively more in the TE subsample. This in turn should depress iHH in the TE subsample relative to the non-TE subsample, and thus, iHS should decrease. It stands to reason that the TE subsample would experience a more severe bottleneck than the non-TE subsample, simply because its greater number of lineages would be forced to coalesce during roughly the same length of time.

If this is true, we would expect the greatest departures in iHS to occur for the highest frequency elements, and we would expect the size of the departure to be attenuated by a reduction in the recombination rate. To test this explanation, we conducted a further set of bottleneck simulations based on element doc1420, in which we varied the number of transposable elements in the sample (fig. 9). As we expect, the mean value of iHS decreases as the element frequency increases. As the recombination rate is decreased, mean iHS still decreases as the element frequency increases, but the magnitude of iHS is considerably less than under higher recombination, consistent with our explanation. The reason that iHS does not fall to zero for high-frequency elements in the absence of recombination is likely a consequence of the small number of sites which segregate in the 2 subgenealogies; if recombination is held at zero, but the number of segregating sites is doubled, forcing more polymorphisms in the subsamples, the value of iHS for the high-frequency elements approaches zero (results not shown).


Figure 9
View larger version (7K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
FIG. 9.— Effect of ascertainment frequency and recombination rate on iHS under a bottleneck scenario. Each curve consists of the mean iHS value for TE subsample sizes 20, 25, 30, 35, and 40, in a sample of size 43, corresponding to locus doc1420, under the demographic scenario of Thornton and Andolfatto (2006)Go. The 3 curves shown differ only in the recombination rate as annotated on the plot. All other parameters are identical to those of locus doc1420. Each point in each curve is calculated with respect to a null hypothesis of constant population size (Materials and Methods) and the same respective ascertainment frequency and recombination rate; 4,000 replicates were generated for each point, 2,000 for the bottleneck scenario and 2,000 for the null scenario.

 
An element which experiences ancestral purifying selection will be younger than a counterpart that is ancestrally neutral. This implies that the long internal branches will be somewhat shorter for an ancestrally deleterious element than for an ancestrally neutral element because the time from transposition to the beginning of the bottleneck, tp, is less. Thus, Tajima's D overall should be positive, but less positive than for an ancestrally neutral element, as we observe. Because an ancestrally deleterious element must traverse a greater range of frequencies than its ancestrally neutral counterpart to arrive at the same contemporary frequency, the difference in the intensity of the bottleneck between the TE and non-TE subsample should increase, and we would expect iHS to become more negative. This is also consistent with our simulation results.

Recombination suppression prevents crossovers between the TE and non-TE subpopulations as a source of new haplotypes, which is consistent with the shift toward fewer haplotypes observed in our simulations. The shift we observe toward positive Tajima's D with recombination suppression is also expected because there are fewer opportunities to unlink a polymorphism falling on one of the long internal branches typical of our bottlenecked genealogies from the element in the TE subpopulation or from the lack of an element in the non-TE subpopulation. This results in more polymorphisms that segregate in the same proportion as the element, that is, at intermediate frequency, which should increase Tajima's D. That iHS is somewhat decreased by recombination suppression is consistent with our explanation above for the dependence of iHS on recombination. First, there are simply fewer recombination events, which tends to increase iHS (cf., fig. 9). Second, the number of between-class recombinations, which are forbidden under the recombination suppression models, depends on the product of the subsample size and the other class’ subpopulation size. Because the non-TE subpopulation is much larger than the TE subpopulation while the TE is at low frequency, and under these bottleneck models we have seen that the subsample sizes are similar during this time, between-class crossovers constitute a greater proportion of the crossovers involving the TE subsample than they do for the non-TE subsample. Thus, the number of haplotypes should be reduced in both subsamples, as we observe, but reduced more in the TE subsample than in the non-TE subsample, as we also observe. This asymmetry should tend to reduce iHS, consistent with our simulation results.

Last, reducing the intensity of the bottleneck produced a decline in fTE. In these simulations, as the bottleneck becomes shorter, the probability that the 2 subsamples each coalesce to a single individual declines. Because we assumed that the element entered the bottleneck at low frequency, consistent with the low element frequencies we observed in Africa and the substantial evidence suggesting that transposable elements are characteristically selected against (Charlesworth and Langley 1989Go), then the TE subsample still tends to coalesce fully before the beginning of the bottleneck. However, as the bottleneck length declines, the non-TE subsample increasingly has not fully coalesced by the beginning of the bottleneck. The resulting depth discrepancy between the TE and non-TE subgenealogies should lead to a reduction in the number of segregating sites in the TE subsample and thus to reduced fTE, consistent with our simulations. The same argument made above for the reduction in iHS also applies here; the TE subsample is expected to have higher LD than the non-TE subsample because it experiences a more intense bottleneck, and thus, iHS should decline to a greater extent as the bottleneck becomes shorter.

Implications for the Study of Adaptive Transposition
One of the most interesting aspects of transposable elements is their role in the evolution of the genome (Kidwell and Lisch 2001Go; Brookfield 2004Go) and, in particular, their role in adaptive evolution (Kazazian 2004Go). In several cases of possible adaptive insertions in Drosophila, population genetic data have been used to evaluate whether the transposition is adaptive (Maside et al. 2001Go; McCollum et al. 2002Go; Catania et al. 2004Go; Schlenke and Begun 2004Go; Aminetzach et al. 2005Go). The observation of a high-frequency allele with few linked haplotypes is a population genetic hallmark of positive selection, but we have shown that this is not unexpected under a null model that includes a bottleneck, on the basis of the statistic iHS. Furthermore, the highest frequency elements, namely those that might be thought most likely to have experienced positive selection, are those for which the effect of the bottleneck on iHS is strongest. The comparatively weak purifying selection we considered here, Ns = –4, results in a slight but consistent shift in iHS toward negative values. Many families of transposable elements are likely to have experienced considerably stronger purifying selection (Petrov et al. 2003Go), which implies that shifts in the distribution of iHS would be greater for elements from these families. Recombination suppression also shifts the distribution of iHS in the direction expected under positive selection. It is thus critical to exercise caution in constructing the null hypothesis for putatively adaptive transpositions.

Implications for Genomic Scans for Positive Selection
The data sets we have explored are also revealing outside the context of adaptive transposition. We have simulated the situation in which a possibly adaptive mutation reaches a high frequency in a recently bottlenecked population, which is the configuration expected under a partial selective sweep (Voight et al. 2006Go). The chief difference between the null model we consider and that of Voight et al. (2006)Go is that we focus on mutations which were segregating in the population when the bottleneck began. In their null simulations, Voight et al. (2006)Go do not condition on the trajectory of the focal site, and thus their null distributions include both mutations that arose during the bottleneck and mutations that were segregating at the beginning of the bottleneck. For their purposes, this practice is completely reasonable because they are interested in selection on both new and standing variation. In our case, we know or may reasonably expect that the elements we observe were segregating in the population at the time the bottleneck began.

The fate of standing variants under positive selection or bottleneck has received much attention in recent years (Orr and Betancourt 2001Go; Innan and Kim 2004Go; Hermisson and Pennings 2005Go; Przeworski et al. 2005Go; Teshima et al. 2006Go). Comparatively little is known about the origin of adaptations and, in particular, whether they tend to arise more often as new mutations or as standing variants which become advantageous when the environment changes. In the case of D. melanogaster, the population's migration to climates quite different from its ancestral sub-Saharan homeland, and exposure to synthetic pesticides, among other novel chemicals (David and Capy 1988Go; Lachaise et al. 1988Go), gives reason to suspect that selection pressures on existing standing variation would have changed as they emigrated. For transposable elements, apart from environmental considerations, the diminution in copy number due to the bottleneck alone is expected to reduce selection pressure from ectopic recombination. In our simulations, we considered the effect on neutral variation linked to neutral standing variants and weakly deleterious standing variants. Our simulations are closely related to those of Innan and Kim (2004)Go and Teshima et al. (2006)Go, both of which modeled the effect of a recently completed selective sweep on neutral standing variants. Those studies were concerned with characterizing the signature of positive selection on standing variation and assessing whether this signature differed from the signature of positive selection on new variation. They found that the signature of standing variation is much less regular than that of directional selection on a new mutation.

Our results expand on these conclusions in 2 ways. First, we showed that a bottleneck on standing neutral variation yields haplotype configurations similar to those expected under a partial selective sweep. This finding accords with the large body of work demonstrating that bottlenecks can mimic the effects of complete selective sweeps (e.g., Andolfatto and Przeworski 2001Go; Przeworski 2002Go; Haddrill et al. 2005Go). Second, we showed that when the standing variant is under purifying selection, the distributions of several summary statistics commonly used to test for departure from neutrality shift in the direction expected under positive selection. If the selection pressure on a large fraction of standing variation has changed from deleterious or weakly deleterious to neutral in Drosophila, or in a different bottlenecked population such as maize or human (Harpending et al. 1998Go; Wright et al. 2005Go), and this is not included in a genomic scan for selection based on a bottleneck model in which all alleles are neutral (Nielsen 2005Go; Thornton et al. 2007Go), then this may result in a substantial number of false discoveries. Alternatively, in an empirical genomic scan for selection (Thornton et al. 2007Go), a large number of ancestrally deleterious alleles could shift the distribution of the summary statistics to more conservative values and, assuming that positive selection is rare, result in a substantial number of false negatives.

Implications for Structural Variation
Our finding that recombination suppression in heterozygotes, on its own or in combination with a bottleneck, can also cause spurious inference of positive selection, is of potentially great importance to genomic inference of positive selection. There is strong but limited experimental evidence that recombination is suppressed near transposable element insertions (Clark et al. 1986Go, 1988Go). If insertions are capable of reducing recombination, then it is likely that deletions of similar size would also suppress recombination. Although there have been no studies to date examining whether deletions suppress recombination in Drosophila, there is abundant evidence showing that polymorphic deletions of a wide range of sizes are common in Drosophila (Petrov and Hartl 2000Go; Blumenstiel et al. 2002Go). It is also well documented that recombination is suppressed in individuals heterozygous for inversions near the inversion breakpoints (Navarro et al. 2000Go; Andolfatto et al. 2001Go).

These 3 forms of structural variation, that is, insertions, deletions, and inversion, are also known to exist in large numbers as polymorphisms in the human population (Iafrate et al. 2004Go; Tuzun et al. 2005Go; Feuk et al. 2006Go). Thus, any scan for the signatures of positive selection that fails to take into account whether the putatively adaptive site is nearby segregating structural variation might result in spurious inference of positive selection. If there truly is recombination suppression at some of the loci we considered, then a possible cause for our observation of an apparent signal of positive selection could be that we used an external, genetic map-based estimate of the local recombination rate that overestimates the rate. This possibility may be somewhat attenuated by using recombination rate estimates obtained directly from the data. More data and further theoretical work are needed to characterize the phenomenon more fully.


    Appendix
 TOP
 Abstract
 Introduction
 Materials and Methods
 Results
 Discussion
 Appendix
 Acknowledgements
 References
 
Here, we demonstrate that the truncation procedure described in the Materials and Methods correctly generates trajectories from the distribution of interest. We would like to simulate from the distribution of Px, where Px denotes a random frequency trajectory from the Wright-Fisher process that begins at frequency 1/(2N) and ends at frequency x. Suppose that a trajectory from this distribution reaches frequency x with probability px, the trajectory eventually returns to frequency x; with probability fx, it goes to fixation without passing through x again; and with probability ex, it goes to extinction without passing through x again. Then px + fx + ex = 1. Let N(Px) be the total number of times that the trajectory hits frequency x given that it hits frequency x at least once. N(Px) thus follows a geometric distribution:


Formula (A1)

Suppose that we simulate random trajectories from the Wright-Fisher process that end at frequency x, according to some yet unspecified procedure that samples from the distribution of a variable Qx. Further suppose that the only trajectories accepted are those that go to fixation or that go to extinction after hitting frequency x; thus, the procedure amounts to a rule for truncating the trajectory at one of the times it hits x. If Qx is equal to Px, which would validate the procedure, it must first be true that Pr [N(Qx) = n] = Pr [N(Px) = n] for each positive integer n. For each n, it must also be true that Qx is a random draw from the set of paths that end with frequency x. Because each trajectory is simulated independently, this latter condition must be true.

Consider the procedure that truncates the trajectory at the last time it hits x before fixation or extinction. The probability, Pr[N(Qx) = k], that an accepted trajectory has k occasions at which it hits frequency x is the probability that the simulated trajectory has exactly k hits or pFormula(1 – Px). Thus, Pr [N(Qx) = n] = Pr [N(Px) = n] for all n > 0.

That this procedure generates allele trajectories of the correct age is shown in figure A1. The simulations summarized in the figure are for a neutral allele. We also confirmed that the truncation procedure gives the correct allele ages under positive and negative directional selection, in a population of constant size, by comparison to the nonneutral, constant population size entries in table 1 of Slatkin (2001)Go (results not shown).


Figure 1
View larger version (8K):
[in this window]
[in a new window]
[Download PowerPoint slide]
 
FIG. A1.— Verification of the truncation procedure. Allele trajectories terminating at several frequencies were simulated according to the truncation procedure described in Materials and Methods, using 2N = 105. The mean ages are plotted as circles, in units of 4N generations; each point is based on 104 replicates. The analytic allele age (cf., Ewens 2004Go) is superposed as a line.

 


    Acknowledgements
 TOP
 Abstract
 Introduction
 Materials and Methods
 Results
 Discussion
 Appendix
 Acknowledgements
 References
 
We thank Marc Feldman for insightful comments on the manuscript. We thank the Stanford Genome Technology Center and particularly Lisa Diamond and Ron Davis, for the use of the computing cluster. J.M.M. is an Howard Hughes Medical Institute predoctoral fellow. J.G. is a Fulbright/Secretaria de Estado de Universidades e Investigacion, Ministerio de Educación y Ciencia postdoctoral fellow. This research was supported in part by National Institutes of Health (NIH) grant GM 28016 to Marcus W. Feldman, and by NIH and National Science Foundation grants to D.A.P.


    Footnotes
 
Marcy Uyenoyama, Associate Editor


    References
 TOP
 Abstract
 Introduction
 Materials and Methods
 Results
 Discussion
 Appendix
 Acknowledgements
 References
 

    Aminetzach YT, Macpherson JM, Petrov DA. Pesticide resistance via transposition-mediated adaptive gene truncation in Drosophila. Science (2005) 309:764–767.[Abstract/Free Full Text]

    Andolfatto P, Depaulis F, Navarro A. Inversion polymorphisms and nucleotide variability in Drosophila. Genet Res (2001) 77:1–8.[CrossRef][Web of Science][Medline]

    Andolfatto P, Przeworski M. Regions of lower crossing over harbor more rare variants in African populations of Drosophila melanogaster. Genetics (2001) 158:657–665.[Abstract/Free Full Text]

    Barton NH. The effect of hitch-hiking on neutral genealogies. Genet Res (1998) 72:123–133.[CrossRef][Web of Science]

    Begun DJ, Aquadro CF. African and North American populations of Drosophila melanogaster are very different at the DNA level. Nature (1993) 365:548–550.[CrossRef][Medline]

    Bejerano G, Lowe CB, Ahituv N, King B, Siepel A, Salama SR, Rubin EM, Kent WJ, Haussler D. A distal enhancer and an ultraconserved exon are derived from a novel retroposon. Nature (2006) 441:87–90.[CrossRef][Medline]

    Blumenstiel JP, Hartl DL, Lozovsky ER. Patterns of insertion and deletion in contrasting chromatin domains. Mol Biol Evol (2002) 19:2211–2225.[Abstract/Free Full Text]

    Brookfield JF. Evolutionary genetics: mobile DNAs as sources of adaptive change? Curr Biol (2004) 14:344–345.[CrossRef][Web of Science]

    Brookfield JF. Evolutionary forces generating sequence homogeneity and heterogeneity within retrotransposon families. Cytogenet Genome Res (2005) 110:383–391.[CrossRef][Web of Science][Medline]

    Brosius J. The contribution of RNAs and retroposition to evolutionary novelties. Genetica (2003) 118:99–116.[CrossRef][Web of Science][Medline]

    Catania F, Kauer MO, Daborn PJ, Yen JL, Ffrench-Constant RH, Schlotterer C. World-wide survey of an Accord insertion and its association with DDT resistance in Drosophila melanogaster. Mol Ecol (2004) 13:2491–2504.[CrossRef][Medline]

    Charlesworth B, Langley CH. The population genetics of Drosophila transposable elements. Ann Rev Genet (1989) 23:251–287.[CrossRef][Web of Science][Medline]

    Chung H, Bogwitz MR, McCart C, Andrianopoulos A, Ffrench-Constant RH, Batterham P, Daborn PJ. Cis-regulatory elements in the Accord retrotransposon result in tissue-specific expression of the Drosophila melanogaster insecticide resistance gene Cyp6g1. Genetics (2007) 175:1071–1077.[Abstract/Free Full Text]

    Clark SH, Hilliker AJ, Chovnick A. Recombination can initiate and terminate at a large number of sites within the rosy locus of Drosophila melanogaster. Genetics (1988) 118:261–266.[Abstract/Free Full Text]

    Clark SH, McCarron M, Love C, Chovnick A. On the identification of the rosy locus DNA in Drosophila melanogaster: intragenic recombination mapping of mutations associated with insertions and deletions. Genetics (1986) 112:755–767.[Abstract/Free Full Text]

    Daborn PJ, Yen JL, Bogwitz MR, et al, (13 co-authors). A single p450 allele associated with insecticide resistance in Drosophila. Science (2002) 297:2253–2256.[Abstract/Free Full Text]

    David JR, Capy P. Genetic variation of Drosophila melanogaster natural populations. Trends Genet (1988) 4:106–111.[CrossRef][Web of Science][Medline]

    Davidson EH, Britten RJ. Organization, transcription, and regulation in the animal genome. Q Rev Biol (1973) 48:565–613.[CrossRef][Medline]

    Ewens WJ. Mathematical population genetics (2004) 2nd edition. Berlin (Germany): Springer.

    Fay JC, Wu CI. Hitchhiking under positive Darwinian selection. Genetics (2000) 155:1405–1413.[Abstract/Free Full Text]

    Feuk L, Carson AR, Scherer SW. Structural variation in the human genome. Nat Rev Genet (2006) 7:85–97.[Web of Science][Medline]

    Glinka S, Ometto L, Mousset S, Stephan W, De Lorenzo D. Demography and natural selection have shaped genetic variation in Drosophila melanogaster: a multi-locus approach. Genetics (2003) 165:1269–1278.[Abstract/Free Full Text]

    Grumbling G, Strelets V. FlyBase: anatomical data, images and queries. Nucleic Acids Res (2006) 34:D484–D488.[Abstract/Free Full Text]

    Haddrill PR, Charlesworth B, Halligan DL, Andolfatto P. Patterns of intron sequence evolution in Drosophila are dependent upon length and GC content. Genome Biol (2005) 6:R67.[CrossRef][Medline]

    Harpending HC, Batzer MA, Gurven M, Jorde LB, Rogers AR, Sherry ST. Genetic traces of ancient demography. Proc Natl Acad Sci USA (1998) 95:1961–1967.[Abstract/Free Full Text]

    Hein J, Schierup MH, Wiuf C. Gene genealogies, variation and evolution: a primer in coalescent theory (2005) New York: Oxford University Press.

    Hermisson J, Pennings PS. Soft sweeps: molecular population genetics of adaptation from standing genetic variation. Genetics (2005) 169:2335–2352.[Abstract/Free Full Text]

    Hudson RR. The how and why of generating gene genealogies. In: In: N. Takahata and A.G. Clark, editors. Mechanisms of molecular evolution: Introduction to molecular paleopopulation biology. Sunderland (MA): Sinauer (1993) 23–36.

    Hudson RR. Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics (2002) 18:337–338.[Abstract/Free Full Text]

    Hudson RR, Kaplan NL. The coalescent process in models with selection and recombination. Genetics (1988) 120:831–840.[Abstract/Free Full Text]

    Iafrate AJ, Feuk L, Rivera MN, Listewnik ML, Donahoe PK, Qi Y, Scherer SW, Lee C. Detection of large-scale variation in the human genome. Nat Genet (2004) 36:949–951.[CrossRef][Web of Science][Medline]

    Innan H, Kim Y. Pattern of polymorphism after strong artificial selection in a domestication event. Proc Natl Acad Sci USA (2004) 101:10667–10672.[Abstract/Free Full Text]

    Kaplan NL, Darden T, Hudson RR. The coalescent process in models with selection. Genetics (1988) 120:819–829.[Abstract/Free Full Text]

    Kaplan NL, Hudson RR, Langley CH. The "hitchhiking effect" revisited. Genetics (1989) 123:887–899.[Abstract/Free Full Text]

    Kazazian HH. Mobile elements: drivers of genome evolution. Science (2004) 303:1626–1632.[Abstract/Free Full Text]

    Kidwell MG, Lisch DR. Perspective: transposable elements, parasitic DNA, and genome evolution. Evolution Int J Org Evolution (2001) 55:1–24.[CrossRef][Web of Science][Medline]

    Kim Y, Nielsen R. Linkage disequilibrium as a signature of selective sweeps. Genetics (2004) 167:1513–1524.[Abstract/Free Full Text]

    Kreitman M. Methods to detect selection in populations with applications to the human. Annu Rev Genomics Hum Genet (2000) 1:539–559.[CrossRef][Web of Science][Medline]

    Lachaise D, Cariou ML, David JR, Lemeunier F, Tsacas L, Ashburner M. Historical biogeography of the Drosophila-melanogaster species subgroup. Evol Biol (1988) 22:159–225.

    Li H, Stephan W. Inferring the demographic history and rate of adaptive substitution in Drosophila. PLoS Genet (2006) 2:1580–1589.[Web of Science]

    Maside X, Bartolome C, Assimacopoulos S, Charlesworth B. Rates of movement and distribution of transposable elements in Drosophila melanogaster: in situ hybridization vs Southern blotting data. Genet Res (2001) 78:121–136.[Web of Science][Medline]

    Maynard Smith J, Haigh J. The hitch-hiking effect of a favourable gene. Genet Res (1974) 23:23–35.[Web of Science][Medline]

    McCollum AM, Ganko EW, Barrass PA, Rodriguez JM, McDonald JF. Evidence for the adaptive significance of an LTR retrotransposon sequence in a Drosophila heterochromatic gene. BMC Evol Biol (2002) 2:5.[CrossRef][Medline]

    McVean GA. A genealogical interpretation of linkage disequilibrium. Genetics (2002) 162:987–991.[Abstract/Free Full Text]

    Montgomery E, Charlesworth B, Langley CH. A test for the role of natural selection in the stabilization of transposable element copy number in a population of Drosophila melanogaster. Genet Res (1987) 49:31–41.[Web of Science][Medline]

    Navarro A, Barbadilla A, Ruiz A. Effect of inversion polymorphism on the neutral nucleotide variability of linked chromosomal regions in Drosophila. Genetics (2000) 155:685–698.[Abstract/Free Full Text]

    Nei M. Molecular evolutionary genetics (1987) New York: Columbia University Press.

    Nielsen R. Molecular signatures of natural selection. Annu Rev Genet (2005) 39:197–218.[CrossRef][Web of Science][Medline]

    Nuzhdin SV. Sure facts, speculations, and open questions about the evolution of transposable element copy number. Genetica (1999) 107:129–137.[CrossRef][Web of Science][Medline]

    Orr HA, Betancourt AJ. Haldane's sieve and adaptation from the standing genetic variation. Genetics (2001) 157:875–884.[Abstract/Free Full Text]

    Petrov DA, Aminetzach YT, Davis JC, Bensasson D, Hirsh AE. Size matters: non-LTR retrotransposable elements and ectopic recombination in Drosophila. Mol Biol Evol (2003) 20:880–892.[Abstract/Free Full Text]

    Petrov DA, Hartl DL. Pseudogene evolution and natural selection for a compact genome. J Hered (2000) 91:221–227.[Abstract/Free Full Text]

    Przeworski M. The signature of positive selection at randomly chosen loci. Genetics (2002) 160:1179–1189.[Abstract/Free Full Text]

    Przeworski M, Coop G, Wall JD. The signature of positive selection on standing genetic variation. Evolution Int J Org Evolution (2005) 59:2312–2323.[CrossRef][Web of Science][Medline]

    Rosenberg NA, Nordborg M. Genealogical trees, coalescent theory and the analysis of genetic polymorphisms. Nat Rev Genet (2002) 3:380–390.[CrossRef][Web of Science][Medline]

    Rozas J, Sánchez-DelBarrio JC, Messeguer X, Rozas R. DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics (2003) 19:2496–2497.[Abstract/Free Full Text]

    Sabeti PC, Reich DE, Higgins JM, et al, (17 co-authors). Detecting recent positive selection in the human genome from haplotype structure. Nature (2002) 419:832–837.[CrossRef][Medline]

    Schlenke TA, Begun DJ. Strong selective sweep associated with a transposon insertion in Drosophila simulans. Proc Natl Acad Sci USA (2004) 101:1626–1631.[Abstract/Free Full Text]

    Schöfl G, Catania F, Nolte V, Schlötterer C. African sequence variation accounts for most of the sequence polymorphism in non-African Drosophila melanogaster. Genetics (2005) 170:1701–1709.[Abstract/Free Full Text]

    Slatkin M. Simulating genealogies of selected alleles in a population of variable size. Genet Res (2001) 78:49–57.[CrossRef][Web of Science][Medline]

    Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proc Natl Acad Sci USA (2003) 100:9440–9445.[Abstract/Free Full Text]

    Tajima F. Evolutionary relationship of DNA sequences in finite populations. Genetics (1983) 105:437–460.[Abstract/Free Full Text]

    Tajima F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics (1989a) 123:585–595.[Abstract/Free Full Text]

    Tajima F. The effect of change in population size on DNA polymorphism. Genetics (1989b) 123:597–601.[Abstract/Free Full Text]

    Teshima KM, Coop G, Przeworski M. How reliable are empirical genomic scans for selective sweeps? Genome Res (2006) 16:702–712.[Abstract/Free Full Text]

    Thornton K, Andolfatto P. Approximate Bayesian inference reveals evidence for a recent, severe bottleneck in a Netherlands population of Drosophila melanogaster. Genetics (2006) 172:1607–1619.[Abstract/Free Full Text]

    Thornton KR, Jensen JD, Becquet C, Andolfatto P. Progress and prospects in mapping recent selection in the genome. Heredity (2007) 98:380–348.

    Tuzun E, Sharp AJ, Bailey JA, et al, (12 co-authors). Fine-scale structural variation of the human genome. Nat Genet (2005) 37:727–732.[CrossRef][Web of Science][Medline]

    Voight BF, Kudaravalli S, Wen X, Pritchard JK. A map of recent positive selection in the human genome. PLoS Biol (2006) 4:446–458.[CrossRef][Web of Science]

    Wall JD. Recombination and the power of statistical tests of neutrality. Genet Res (1999) 74:65–79.[CrossRef][Web of Science]

    Watterson GA. On the number of segregating sites in genetical models without recombination. Theor Popul Biol (1975) 7:256–276.[CrossRef][Web of Science][Medline]

    Wright SI, Bi IV, Schroeder SG, Yamasaki M, Doebley JF, McMullen MD, Gaut BS. The effects of artificial selection on the maize genome. Science (2005) 308:1310–1314.[Abstract/Free Full Text]

Accepted for publication January 1, 2008.


Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?


This article has been cited by other articles:


Home page
Genome ResHome page
J. K. Pickrell, G. Coop, J. Novembre, S. Kudaravalli, J. Z. Li, D. Absher, B. S. Srinivasan, G. S. Barsh, R. M. Myers, M. W. Feldman, et al.
Signals of recent positive selection in a worldwide sample of human populations
Genome Res., May 1, 2009; 19(5): 826 - 837.
[Abstract] [Full Text] [PDF]


Home page
Mol Biol EvolHome page
J. Gonzalez, J. M. Macpherson, P. W. Messer, and D. A. Petrov
Inferring the Strength of Selection in Drosophila under Complex Demographic Models
Mol. Biol. Evol., March 1, 2009; 26(3): 513 - 526.
[Abstract] [Full Text] [PDF]


This Article
Right arrow Abstract Freely available
Right arrow FREE Full Text (PDF) Freely available
Right arrow All Versions of this Article:
25/6/1025    most recent
msn007v1
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Macpherson, J. M.
Right arrow Articles by Petrov, D. A.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Macpherson, J. M.
Right arrow Articles by Petrov, D. A.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?