MBE Advance Access originally published online on August 4, 2007
Molecular Biology and Evolution 2007 24(10):2310-2322; doi:10.1093/molbev/msm162

© The Author 2007. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org

Research Articles

Using Multilocus Sequence Data to Assess Population Structure, Natural Selection, and Linkage Disequilibrium in Wild Tomatoes

Uraiwan Arunyawat, Wolfgang Stephan, and Thomas Städler

Section of Evolutionary Biology, Department Biologie II, University of Munich (LMU), Planegg-Martinsried, Germany

E-mail: staedler{at}zi.biologie.uni-muenchen.de.


    Abstract
We employed a multilocus approach to examine the effects of population subdivision and natural selection on DNA polymorphism in 2 closely related wild tomato species (Solanum peruvianum and Solanum chilense), using sequence data for 8 nuclear loci from populations across much of the species' range. Both species exhibit substantial levels of nucleotide variation. The species-wide level of silent nucleotide diversity is 18% higher in S. peruvianum (πsil ≈ 2.50%) than in S. chilense (πsil ≈ 2.12%). One of the loci deviates from neutral expectations, showing a clinal pattern of nucleotide diversity and haplotype structure in S. chilense. This geographic pattern of variation is suggestive of an incomplete (ongoing) selective sweep, but neutral explanations cannot be entirely dismissed. Both wild tomato species exhibit moderate levels of population differentiation (average Fst ≈ 0.20). Interestingly, the pooled samples (across different demes) exhibit more negative Tajima's D and Fu and Li's D values; this marked excess of low-frequency polymorphism can only be explained by population (or range) expansion and is unlikely to be due to population structure per se. We thus propose that population structure and population/range expansion are among the most important evolutionary forces shaping patterns of nucleotide diversity within and among demes in these wild tomatoes. Patterns of population differentiation may also be impacted by soil seed banks and historical associations mediated by climatic cycles. Intragenic linkage disequilibrium (LD) decays very rapidly with physical distance, suggesting high recombination rates and effective population sizes in both species. The rapid decline of LD seems very promising for future association studies with the purpose of mapping functional variation in wild tomatoes.

Key Words: population structure • population expansion • clinal variation • ongoing selective sweep • linkage disequilibrium • wild tomatoes


    Introduction
Understanding the evolutionary forces that shape patterns of nucleotide diversity within and among populations and recently diverged species is an important aim of population genetics. Generally, patterns of genetic diversity within and among populations are influenced both by evolutionary processes that affect the entire genome, such as demographic history and population structure, and by processes that act at individual genes such as natural selection. A multilocus approach is a powerful way to disentangle the effects of different evolutionary forces on DNA variation. This approach has been used for several well-studied species of Drosophila (e.g., Glinka et al. 2003; Das et al. 2004; Ometto et al. 2005), Arabidopsis (Wright et al. 2003; Ramos-Onsins et al. 2004; Nordborg et al. 2005; Schmid et al. 2005), and maize (Tenaillon et al. 2004; Wright et al. 2005).

In the presence of population structure, several factors including levels of gene flow among populations and the number of demes are expected to contribute to levels of nucleotide diversity within and among populations and consequently influence species-wide levels of variation (Whitlock and Barton 1997; Wakeley and Aliacar 2001; Laporte and Charlesworth 2002). There is considerable evidence that population structure shapes patterns of genetic variation in many plant species, such as Arabidopsis thaliana (Sharbel et al. 2000; Schmid et al. 2006), Arabidopsis lyrata (Wright et al. 2003; Clauss and Mitchell-Olds 2006), Populus tremula (Ingvarsson 2005), Silene tatarica (Tero et al. 2003), Pinus densata (Ma et al. 2006), and teosinte (Moeller et al. 2007). The effects of population structure might also influence the magnitude and pattern of linkage disequilibrium (LD; Ostrowski et al. 2006; De and Durrett 2007) and may also yield spurious associations in association studies (e.g., Helgason et al. 2005).

In principle, several factors can increase levels of LD, for instance, population structure, low recombination rate, natural and artificial selection, inbreeding, and small effective population size. Conversely, factors such as outcrossing, high recombination rate, and large effective population size may accelerate the decay of LD (reviewed by Gupta et al. 2005). Since the availability of genome-wide sequences and/or single nucleotide polymorphism (SNP) maps, LD mapping has been used extensively in animal and plant systems, as well as to dissect the molecular bases of human diseases (Flint-Garcia et al. 2003; Rafalski and Morgante 2004). Therefore, a detailed understanding of the extent and patterns of LD within a given target species will facilitate the choice of appropriate methodology for association mapping. Even though LD has received much attention recently, more studies are needed to investigate patterns of LD as well as factors that influence LD. In plant genomics, only a few well-studied species such as A. thaliana, P. tremula, rice, maize, barley, and sunflower have been characterized for the extent and decay of LD with physical distance (Lin et al. 2001; Remington et al. 2001; Tenaillon et al. 2001; Nordborg et al. 2002; Garris et al. 2003; Kraakman et al. 2004; Ingvarsson 2005; Liu and Burke 2006).

Wild tomatoes (Solanum section Lycopersicon) have become a suitable plant model system for evolutionary analyses because of their recent divergence, the clear phenotypic distinction, and the great diversity of mating systems. Wild tomatoes are native to western South America, with 2 endemic species in the Galapagos Islands (Rick 1986; Taylor 1986; Spooner et al. 2005). Our earlier studies based on single populations in 5 species suggested that nucleotide diversity in wild tomatoes is influenced by the mating system and among-locus variation in neutral mutation rate and/or selective constraints among loci, whereas evidence for positive selection was scarce (Roselius et al. 2005; Städler et al. 2005). Consequently, if positive directional selection does not have a large impact on patterns of sequence diversity in these wild tomato species, then the analyses of demographic processes and population structure become more significant for understanding levels and patterns of diversity and historical events.

In this study, we adopt a multilocus approach to examine the effects of population structure and demographic history on nucleotide variability in wild tomatoes. We also characterize the (intragenic) decay of LD using multiple natural populations for each of the self-incompatible, closely related species Solanum peruvianum and Solanum chilense. We ask specifically 1) What are the patterns of nucleotide diversity in wild tomatoes? 2) Do wild tomato populations show genetic differentiation? 3) How fast is the decline of LD with physical distance in these 2 closely related wild tomato species? Using polymorphism data for 8 effectively unlinked nuclear loci, we found substantial levels of nucleotide polymorphism and modest levels of population differentiation in both species. More interestingly, the presence of population structure (as well as sampling design) may have facilitated the discovery of a clinal pattern of nucleotide variation at one of the loci in S. chilense. Furthermore, LD decays very rapidly, reflecting fairly high rates of recombination at all loci as well as high effective population size.


    Materials and Methods
Plant Samples and Loci Studied
Wild tomatoes have been taxonomically reassigned to the genus Solanum section Lycopersicon (Spooner et al. 1993, 2005; Olmstead et al. 1999; Peralta and Spooner 2001). For this study, we adopted the new nomenclature and chose 2 self-incompatible, obligately cross-fertilizing wild tomato species. Solanum peruvianum is distributed along the western side of the Andes from north-central Peru to northern Chile and S. chilense from southern Peru to northern Chile (cf., fig. 1 in Städler et al. 2005). Both studied species are patchily distributed and grow in diverse habitats from sea level to the highland up to 3,000 m (Rick 1979, 1986; Taylor 1986). In their native habitats, it appears that S. peruvianum is the most widespread and highly subdivided species; it is also characterized by substantial morphological diversity within and among populations (Rick 1979, 1986). In order to adequately sample the geographic ranges of the species, 3 new population samples of each species were collected in Peru by T. Städler and T. Marczewski (May 2004): Arequipa (ARE), Nazca (NAZ), and Canta (CAN) for S. peruvianum and Tacna (TAC), Moquegua (MOQ), and Quicacha (QUI) for S. chilense; the population samples and geographic locations are summarized in table 1. The sampled natural populations varied from a low census size of 9 isolated plants along 1.3 km of highway (ARE) to >200 in more densely growing, continuous roadside populations (CAN and MOQ). Based on local availability, we randomly sampled between 5 and 7 plants per population. Voucher specimens have been deposited at the herbaria of the Universidad San Marcos (USM, Lima, Peru) and Munich Systematic Botany (MSB, Munich, Germany). In addition, we included one population sample of each species from an earlier analysis (Städler et al. 2005); these samples were obtained from the Tomato Genetics Resource Center at University of California, Davis (http://tgrc.ucdavis.edu). Accessions LA2744 (S. peruvianum; Tarapaca [TAR]) and LA2884 (S. chilense; Antofagasta [ANT]) also represent random samples from each source population, as the plants used for DNA extraction were each grown from seeds collected from separate wild plants.


FIG. 1.— Clinal pattern of nucleotide variability (SNPs) at locus CT208 in Solanum chilense. From top to bottom, samples are ordered from the south (ANT) to the north (QUI) of the geographic range. Sample identifiers are given in the left column. The polymorphic sites shown here are distributed over the entire (sequenced) locus (~1.1 kbp). Nucleotide positions at which the outgroup sequence (Solanum ochranthum) differs from all S. chilense sequences were eliminated for visual clarity.

 

Table 1 Geographic Location of the Analyzed Populations

 
For each population, we generally analyzed 5 or 6 individuals (10 or 12 alleles) at 8 unlinked nuclear loci that represent a subset of those studied earlier (CT093, CT208, CT251, CT066, CT166, CT179, CT198, and CT268; Roselius et al. 2005; Städler et al. 2005). These loci correspond to anonymous, single-copy cDNA markers originally mapped by Tanksley et al. (1992; see also http://www.sgn.cornell.edu). Extensive sequence information is available for many of the mapped cDNA markers (Ganal et al. 1998), which have been integrated into longer "tentative consensus" sequences in the Tomato Gene Index at the Institute for Genomic Research (http://www.tigr.org/tdb/lgi). Putative functions have been proposed for all these loci (see Roselius et al. 2005), but they were not chosen based on functional criteria. The loci encompass regions of low to high recombination rates as estimated by Stephan and Langley (1998).

DNA Amplification and Sequencing
We isolated genomic DNA from dried leaves of mature plants using the DNeasy Plant Mini Kit (Qiagen GmbH, Hilden, Germany). Polymerase chain reaction (PCR) primers were designed based on the published cDNA or genomic DNA sequences from Solanum lycopersicum (see Städler et al. 2005). PCR primers and conditions are deposited at http://www.zi.biologie.uni-muenchen.de/evol/Downloads.html. PCR products were sequenced directly from both strands on an ABI3730 DNA analyzer (Applied Biosystems, Foster City, CA). Direct sequencing was also used to confirm polymorphic sites in heterozygotes. Because it is essential to resolve haplotypes, we designed haplotype-specific sequencing primers based on heterozygous nucleotides or indels as previously described (Städler et al. 2005). Haplotype phase was thus completely resolved for all new sequences. Sequences were edited and aligned using the Sequencher program (Gene Codes, Ann Arbor, MI) and adjusted manually. Locus CT208 was resequenced for both the TAR and ANT samples. Moreover, we designed new primers for CT166 and CT208, for which shorter PCR products were sequenced than previously (Baudry et al. 2001; Roselius et al. 2005). All new sequences have been deposited with the GenBank database with accession numbers EU077613–EU078167.

Estimation of Nucleotide Diversity and Neutrality Tests
We estimated levels of nucleotide diversity for all sites and for silent sites only (using noncoding and synonymous sites), calculating Watterson's (1975) estimator θw of θ = 4Neμ (where Ne denotes the effective population size and μ the mutation rate per site and generation) and π, the average number of pairwise differences per site between sequences in a sample (Nei 1987). Because our analyses encompass multiple samples from subdivided populations, we also obtained species-wide estimates of the parameter θ by calculating πbetween. This was accomplished by including all pairwise comparisons of sequences obtained from different demes but excluding pairwise comparisons within demes; this approach eliminates undesirable effects of the scattering phase of the coalescent process in substructured populations and should yield unbiased estimates of θ (Wakeley 1999, 2001).
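As an illustration of the three diversity estimators just described, the following minimal Python sketch computes π, θw, and πbetween per site from aligned haplotype strings. It is not the DnaSP implementation: the function names and the toy sequences are hypothetical, and the sketch assumes gap-free alignments of equal length and makes no distinction between silent and replacement sites.

```python
from itertools import combinations, product


def pairwise_diff(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def segregating_sites(seqs):
    """Number of alignment columns carrying more than one base."""
    return sum(1 for col in zip(*seqs) if len(set(col)) > 1)


def pi_per_site(seqs):
    """Average pairwise differences per site over all pairs in the sample."""
    pairs = list(combinations(seqs, 2))
    total = sum(pairwise_diff(a, b) for a, b in pairs)
    return total / len(pairs) / len(seqs[0])


def theta_w_per_site(seqs):
    """Watterson's estimator: S divided by a1 = sum_{i=1}^{n-1} 1/i, per site."""
    n = len(seqs)
    a1 = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / a1 / len(seqs[0])


def pi_between_per_site(demes):
    """Average pairwise differences per site using only between-deme pairs."""
    total, count = 0, 0
    for deme_1, deme_2 in combinations(demes, 2):
        for a, b in product(deme_1, deme_2):
            total += pairwise_diff(a, b)
            count += 1
    return total / count / len(demes[0][0])


# Hypothetical toy data: two demes, two haplotypes each, 10 sites.
deme_a = ["AAAAAAAAAA", "AAAAAAAAAT"]
deme_b = ["AAAAACAAAA", "AAAAACAAAT"]
pooled = deme_a + deme_b

print(pi_per_site(pooled))
print(theta_w_per_site(pooled))
print(pi_between_per_site([deme_a, deme_b]))
```

In this toy sample one site is fixed for different bases in the two demes, so πbetween exceeds π for the pooled sample, because closely related within-deme pairs are left out of the average.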

We tested for deviations from neutrality using Tajima's (1989) D statistic. This test measures skews in the frequency spectrum; a negative D value indicates an excess of rare polymorphisms, and a positive D suggests an excess of intermediate-frequency polymorphisms. Tajima's test is conservative for testing departures from neutral equilibrium conditions, in particular under the assumption of no recombination. We also employed Fu and Li's D (Fu and Li 1993) and Fay and Wu's (2000) H statistic. The H statistic measures the difference between the average number of nucleotide differences and the estimator θH, which is based on the frequency of "derived" variants. A significantly negative H value indicates an excess of high-frequency derived variants, which may be indicative of positive selection. The significance of H was evaluated by 10,000 coalescent simulations, using the observed number of segregating sites and no recombination. All standard analyses were performed in DnaSP version 4.0 (Rozas et al. 2003). In addition, we used the multilocus Hudson-Kreitman-Aguadé test (HKA, Hudson et al. 1987) to assess the ratio of polymorphism within species to the divergence between species, as implemented in Hey's program HKA (http://lifesci.rutgers.edu/~heylab). Significance of Tajima's D and Fu and Li's D values was also assessed using 10,000 coalescent simulations in the HKA program. In the tests mentioned above, we used sequences from Solanum ochranthum or Solanum lycopersicoides as outgroup data (Roselius et al. 2005), except for CT208 for which we obtained a new sequence from S. ochranthum.
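To make the sign convention of Tajima's D concrete, here is a small Python sketch of the standard single-locus formula (Tajima 1989). It only computes the statistic and does not perform the coalescent simulations used to assess significance. The function name and both toy samples are hypothetical.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D from aligned, gap-free haplotype strings (standard formula)."""
    n = len(seqs)
    # S: number of segregating sites; k: mean pairwise differences per sequence.
    s = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    if s == 0:
        raise ValueError("D is undefined without segregating sites")
    pairs = list(combinations(seqs, 2))
    k = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical samples: variants at intermediate frequency vs. singletons only.
intermediate = ["AAAAAAAAAA", "AAAAAAAAAT", "AAAAACAAAA", "AAAAACAAAT"]
singletons = ["AAAA", "TAAA", "ATAA", "AATA"]

print(tajimas_d(intermediate))  # positive: intermediate-frequency variants
print(tajimas_d(singletons))    # negative: excess of rare variants
```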

Tests of Population Differentiation
In order to quantify the extent of population differentiation within species, we estimated the standard Fst statistic based on the method of Hudson et al. (1992), as implemented in the DnaSP program. Fst is an estimate of population differentiation measuring the differentiation of subpopulations relative to the total (sampled) population. Population substructure within each species was thus quantified on a locus-by-locus basis as well as between all pairs of populations. Population differentiation was additionally assessed using Hudson's (2000) Snn statistic in combination with permutation tests (10,000 permutations). We also quantified levels of differentiation within and between species in a hierarchical analysis of molecular variance approach (AMOVA; Excoffier et al. 1992), as implemented in Arlequin 3.1 (Excoffier et al. 2005).

Analyses of Recombination and Intragenic LD
We estimated the minimum number of recombination events (Rm) using the 4-gamete test of Hudson and Kaplan (1985) and the population recombination parameter ρ = 4Nec, where c is the recombination fraction per generation between sites, using Hudson's (2001) composite likelihood method, as implemented in the LDhat 2.0 package (McVean et al. 2002). Moreover, we calculated the degree of LD in terms of the Zns statistic (Kelly 1997), which is the average of squared allele-frequency correlations (r²; Hill and Robertson 1968) over all pairwise comparisons.
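The two LD summaries mentioned here are simple to compute from phased haplotypes. The sketch below, with hypothetical function names and data, computes r² for a pair of biallelic sites and Zns as the average r² over all pairs of biallelic segregating sites. It does not exclude singletons or multiple-hit sites, and it does not attempt the composite-likelihood estimation of ρ.

```python
from itertools import combinations


def biallelic_sites(haps):
    """Indices of alignment columns with exactly two alleles."""
    return [i for i, col in enumerate(zip(*haps)) if len(set(col)) == 2]


def r_squared(haps, i, j):
    """Squared allele-frequency correlation between biallelic sites i and j."""
    n = len(haps)
    a, b = haps[0][i], haps[0][j]  # arbitrary reference alleles
    p_a = sum(h[i] == a for h in haps) / n
    p_b = sum(h[j] == b for h in haps) / n
    p_ab = sum(h[i] == a and h[j] == b for h in haps) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def zns(haps):
    """Average r² over all pairs of biallelic segregating sites."""
    pairs = list(combinations(biallelic_sites(haps), 2))
    return sum(r_squared(haps, i, j) for i, j in pairs) / len(pairs)


# Hypothetical phased haplotypes: sites 0 and 1 are in complete LD,
# site 2 is independent of both.
haps = ["AAA", "AAT", "TTA", "TTT"]

print(r_squared(haps, 0, 1))
print(r_squared(haps, 0, 2))
print(zns(haps))
```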

We investigated the decay of intragenic LD with physical distance following the methods of Remington et al. (2001). The estimates of LD were calculated by using r² between pairs of polymorphic sites. The expected decay of LD was modeled as

$$E(r^2) = \left[\frac{10+\rho}{(2+\rho)(11+\rho)}\right]\left[1+\frac{(3+\rho)(12+12\rho+\rho^2)}{n(2+\rho)(11+\rho)}\right] \qquad (1)$$
where n denotes the number of sequences (Hill and Weir 1988) and ρ is the population recombination parameter between the 2 sites (the per-basepair value multiplied by the distance in bp). We fitted this equation to the data using the R statistical package (http://www.r-project.org/). The nonlinear regression yields a least-squares estimate of ρ per basepair; this estimate may be imprecise and unrealistic due to several factors, for example, the nonindependence between linked sites and the nonequilibrium state of the populations. Nonetheless, this model is useful to characterize the rate of decay of LD (e.g., Remington et al. 2001; Brown et al. 2004; Ingvarsson 2005; Liu and Burke 2006). These analyses were performed separately for each locus, both within populations and for the combined data set, and for all 8 loci together. Sites with observed multiple hits were excluded in the recombination and LD analyses, and all singletons were removed in the LD analyses.
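The fitting step can be sketched in a few lines of Python. The expectation coded below is the Hill and Weir (1988) form as used by Remington et al. (2001), with ρ between two sites taken as the per-basepair value times their distance. The authors used nonlinear regression in R; the log-spaced grid search here is a simple stand-in chosen to avoid external dependencies, and the function names, search bounds, and synthetic data are all assumptions made for illustration.

```python
import math


def expected_r2(rho, n):
    """Expected r² for population recombination parameter rho and sample size n."""
    first = (10 + rho) / ((2 + rho) * (11 + rho))
    second = 1 + ((3 + rho) * (12 + 12 * rho + rho ** 2)) / (
        n * (2 + rho) * (11 + rho)
    )
    return first * second


def fit_rho_per_bp(distances, r2_values, n, lo=1e-5, hi=1.0, steps=2001):
    """Least-squares estimate of rho per bp via a log-spaced grid search."""
    best_rho, best_sse = None, math.inf
    for k in range(steps):
        rho_bp = lo * (hi / lo) ** (k / (steps - 1))
        sse = sum(
            (r2 - expected_r2(rho_bp * d, n)) ** 2
            for d, r2 in zip(distances, r2_values)
        )
        if sse < best_sse:
            best_rho, best_sse = rho_bp, sse
    return best_rho


# Synthetic, noise-free data generated from a known (hypothetical) rho per bp.
true_rho_bp = 0.02
n_seqs = 30
distances = list(range(10, 1001, 10))
r2_values = [expected_r2(true_rho_bp * d, n_seqs) for d in distances]

estimate = fit_rho_per_bp(distances, r2_values, n_seqs)
print(estimate)
```

Note that the expected r² does not decay to zero for finite samples: for large ρ it approaches roughly 1/n, which is one reason why smaller samples appear to show slower decay of LD.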


    Results
Patterns of Nucleotide Diversity and Tests of Neutrality
We sequenced 8 unlinked nuclear loci in 3 populations of each species, with a total concatenated length of >10 kb per "allele." We also added one population of each species from an earlier study (Städler et al. 2005). Therefore, in this study, we evaluated in total 4 populations and about 40–46 alleles per locus per species. The total length (including indels) of the individual loci ranges from 778 bp to 1,887 bp. Most of the loci contained both coding and noncoding sites (introns and/or flanking regions), except for CT066 and CT268 that contribute only coding sites (table 2). We quantified polymorphism by using Watterson's θw and π for all sites (θall, πall) and for silent sites (θsil, πsil). For each locus and each population, estimates of nucleotide diversity are given in supplementary table 1 (Supplementary Material online). Across individual loci, the species-wide level of variation (πall) varies from ~0.57 to 2.69% in S. peruvianum and ~0.55 to 1.66% in S. chilense. CT198 and CT179 appear to be the most polymorphic loci, whereas CT093 shows the least polymorphism in both species.


Table 2 Sequence Characterization of the 8 Nuclear Loci

 
We also calculated the weighted average levels of nucleotide diversity across all 8 loci for each population, as presented in table 3. Most of the loci and populations generally exhibit substantial levels of polymorphism. In S. peruvianum, we found that the northernmost population (CAN) shows the highest level of variability, which is a bit higher than that of the southernmost population (TAR), whereas the ARE population exhibits the lowest level of polymorphism. In S. chilense, the 3 new populations (TAC, MOQ, and QUI) display comparable levels of sequence diversity, which are about twice as high as in the ANT population.


Table 3 Summary of Nucleotide Variation and Multilocus Neutrality Tests

 
We used Tajima's D (Tajima 1989) and Fu and Li's D statistics (Fu and Li 1993) to test for deviations from the standard neutral model. Based on these statistics, most of the populations do not show significant departures from neutral equilibrium expectations. The only 2 exceptions are the ARE and ANT populations. The ARE population shows significant deviations from the neutral equilibrium model at 4 loci, where 3 loci (CT093, CT179, and CT268) exhibit positive Tajima's D values, whereas only CT066 exhibits a significantly negative value (supplementary table 1, Supplementary Material online). The ANT population also shows significantly positive Tajima's D values at several loci as reported in our previous study (Roselius et al. 2005). The multilocus neutrality tests, based on all sites, are reported in table 3. Multilocus Tajima's D values are slightly and consistently negative in S. peruvianum (except for the ARE population), whereas in S. chilense the Tajima's D values are close to zero (except for the ANT population that shows a very significantly positive value). Unsurprisingly, we found that Fu and Li's D statistic exhibits similar patterns as Tajima's D. However, the estimates of Fu and Li's D indicate a departure from neutrality for the CAN population (DFL = –0.91, P = 0.02), as well as for the ANT population in S. chilense (DFL = 1.25, P < 0.001). This negative DFL value suggests an excess of polymorphisms on external branches of the genealogy (i.e., singletons) of the CAN sample, whereas the positive value for the ANT population indicates an excess of polymorphisms on internal branches (i.e., intermediate-frequency variants).

We also evaluated levels of nucleotide variation using the combined samples in both species (treated as a single population per species; table 3). We found an interesting pattern, in that the θ estimates are consistently higher than the mean θ values for individual populations, whereas the π estimates are less affected by pooling the samples (see supplementary table 1, Supplementary Material online). Hence, we obtained a highly significantly negative multilocus Tajima's D for the combined sample of S. peruvianum sequences (DT = –1.06, P < 0.001). Using the combined sample for S. chilense, Tajima's D is negative but only marginally significant (DT = –0.53, P = 0.07). Similarly, for Fu and Li's D statistic both species exhibit significantly negative values when the population samples are combined, reflecting a marked excess of singletons in the samples (table 3). We evaluate this striking pattern in the Discussion. The species-wide estimates of π (based on interdeme sequence divergence, πbetween) exceed the values for the combined species samples (table 3), which is expected because potentially closely related sequences sampled from the same populations do not contribute to these estimates.

Furthermore, we applied the H test of Fay and Wu using S. ochranthum or S. lycopersicoides as outgroups (based upon availability; Roselius et al. 2005). At locus CT208, 2 S. chilense populations (TAC and MOQ) show significantly negative H values (H = –15.12, P = 0.02 for TAC and H = –21.33, P < 0.001 for MOQ) that indicate an excess of high-frequency derived variants (Fay and Wu 2000). Figure 1 summarizes patterns of nucleotide variation at locus CT208 across the S. chilense populations. The northernmost QUI population exhibits more SNP variants that are shared with the outgroup than the majority SNP variants found in the MOQ and TAC samples. We identified the alleles that are similar to the outgroup as the "ancestral" haplotype group and the other alleles as the derived haplotype group. We found that the MOQ population shows a very high frequency of derived variants, in that only one allele belongs to the ancestral haplotype group. Likewise, only 2 of 12 sequenced alleles in the TAC population are of the ancestral group. In addition, the derived group has apparently fixed in the southernmost ANT population; this clinal pattern of nucleotide variation is suggestive of selection.
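The direction of the H statistic follows directly from its definition as the difference between π and θH, both written in terms of the unfolded site-frequency spectrum (Fay and Wu 2000). The short Python sketch below takes, for each segregating site, the number of sampled sequences carrying the derived allele. The function name and the example counts are hypothetical and are not taken from the CT208 data; significance would still require coalescent simulations.

```python
def fay_wu_h(derived_counts, n):
    """Unnormalized Fay and Wu's H for one locus.

    derived_counts: one entry per segregating site, giving the number of
    sequences (out of n) that carry the derived allele at that site.
    """
    denom = n * (n - 1)
    theta_pi = sum(2 * i * (n - i) for i in derived_counts) / denom
    theta_h = sum(2 * i * i for i in derived_counts) / denom
    return theta_pi - theta_h


# Hypothetical samples of 12 sequences.
high_freq_derived = [11, 11, 10, 10, 11]  # derived alleles near fixation
low_freq_derived = [1, 1, 2, 1, 1]        # derived alleles rare

print(fay_wu_h(high_freq_derived, 12))  # negative
print(fay_wu_h(low_freq_derived, 12))   # positive
```

Sites where the derived allele is carried by more than half of the sample contribute negative terms, which is why a locus dominated by a high-frequency derived haplotype group yields a strongly negative H.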

Additionally, the multilocus HKA test (Hudson et al. 1987) was used to assess departures from the neutral model. We observed no evidence for any significant deviations in S. peruvianum. In S. chilense, none of the new populations show significant deviations from the neutral model, whereas the "old" ANT population clearly departs from the neutral model (P = 0.02), as shown in table 3. A statistically significant HKA test was also found in our previous study, which was based on 14 loci (Roselius et al. 2005).

Levels of Population Differentiation
Table 4 shows estimates of Fst and results of the permutation test of population differentiation across all 4 populations per species. In S. peruvianum, we observed significant genetic differentiation at all 8 loci, with Fst ranging from 0.081 (CT166) to 0.352 (CT066). The average Fst estimate is 0.198 for S. peruvianum, which is slightly lower than the value for S. chilense (Fst = 0.232). As in S. peruvianum, all loci exhibit significant population differentiation in S. chilense, although test statistics other than Snn might not infer significant differentiation at CT066. Because levels of differentiation (as assessed by the Fst statistic) are elevated in both species due to the presence of low-polymorphism populations (ARE and ANT), we also repeated this analysis under exclusion of these samples. Based on the remaining 3 populations per species, average Fst values are 0.138 in S. peruvianum and 0.188 in S. chilense.


Table 4 Summary of Population Differentiation at 8 Loci

 
In addition, we calculated among-population pairwise estimates of Fst across 8 loci, as summarized in table 5. In S. peruvianum, the populations separated by the largest geographic distance (TAR–CAN) are the least differentiated (Fst = 0.088), which is mainly attributable to their high within-population diversity (table 3). Estimates of Fst may be somewhat misleading because of their dependency on levels of intrademe variation; hence, we also calculated πbetween as an estimate of divergence that should be unaffected by levels of local sequence diversity. When quantified in this manner, the pairwise estimates of among-population divergence show remarkably similar values, with somewhat lower divergence between ARE and NAZ. In S. chilense, the Fst estimate between the geographically fairly close populations TAC and MOQ is very low (Fst = 0.017), which is partly due to their high within-population diversity. In contrast, the other population pairs show substantial levels of differentiation (Fst > 0.17). Analyses using πbetween suggest that the low-diversity sample ANT is equally divergent from both TAC and MOQ as these high-diversity populations are from each other; rather, it is the northernmost QUI sample that shows evidence of elevated levels of divergence from the 3 other populations (table 5).


Table 5 Average Pairwise Estimates of Fst and Interdeme π across 8 Loci

 
A hierarchical AMOVA approach was used to additionally quantify differentiation between the species. We observed 30.3% of the total variation among species and only 10.1% among populations within species, whereas 59.6% of the total variation was found within populations. Thus, the fixation index among species is 0.303, and permutation tests are highly significant (P < 0.001), suggesting a considerable level of differentiation between S. peruvianum and S. chilense. It clearly exceeds the level of differentiation between subpopulations within species.

Recombination and Intragenic LD
Primarily because of haplotype structure at several (ARE; S. peruvianum) or most loci (ANT; S. chilense), estimates of recombination and intragenic LD for these samples are presented separately in supplementary table 2: their estimates of the recombination parameter ρ are often zero and LD decays more slowly with distance, so these populations have been excluded from the LD-decay analyses below. Using the 4-gamete test to infer the minimum number of recombination events (Rm) in the 6 other populations, we found very diverse estimates across loci, varying from 0 to 13 for individual population samples (table 6). In both species, these Rm estimates are higher for the analyses of the combined samples (with 30–36 alleles per species). Overall, the estimated minimum number of recombination events is largely consistent with the levels of physical recombination rate for each locus except for CT251; this locus exhibits high Rm estimates in both species, in contrast to its low recombination rate estimated by Stephan and Langley (1998).


Table 6 Estimates of Haplotype Structure, Recombination Parameters, and LD

 
We also estimated the population recombination parameter (ρ) at each locus using the composite-likelihood approach of Hudson (2001). For single population samples, ρ ranges from 0 to 0.102 per site in S. peruvianum and from 0 to 0.053 in S. chilense (table 6). Using the locus-specific means of the values obtained for individual populations, we computed the weighted average of ρ in S. peruvianum (ρ ~ 0.0234), which is almost 3-fold higher than in S. chilense (ρ ~ 0.0084). In order to achieve more power for the analysis, we combined 2 populations per species and treated them as single samples. Because both the TAR–CAN (S. peruvianum) and TAC–MOQ (S. chilense) pairs show relatively low levels of population differentiation (table 5), these were chosen for the combined analysis. The weighted average ρ across the 8 loci is ~0.0348 based on the 2 relatively undifferentiated populations in S. peruvianum, whereas the corresponding weighted average ρ is ~0.0238 in S. chilense. The ratio of these ρ estimates is about 1.5, which is close to the ratio of the θ estimates for both species (see table 3).

Assuming equilibrium conditions, one may calculate estimates of the effective population size based on the population parameters ρ and θ, where ρ = 4Ne c and θ = 4Ne µ. In this study, we estimate the effective population size of wild tomatoes on the basis of ρ, using the average physical recombination rate per generation based on the 6 loci with an estimated c > 0 (excluding CT093 and CT208; Roselius et al. 2005). Using the estimated mean ρ based on 2 relatively undifferentiated populations per species (see above) and c ~1.51 × 10⁻⁸, we obtain estimates of Ne ~6.87 × 10⁵ for S. peruvianum and ~5.04 × 10⁵ for S. chilense. These Ne estimates are somewhat lower than estimates based on θ from our previous study (Roselius et al. 2005; and taking into account the higher levels of polymorphism for S. chilense found in the present study).
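The rearrangement behind this estimate, Ne = ρ / (4c), can be sketched in a few lines. The function name and the input values below are illustrative assumptions (round numbers chosen for easy checking), not the locus set or averages used in this study.

```python
def effective_size_from_rho(rho_per_site, c_per_site):
    """Solve rho = 4 * Ne * c for Ne.

    rho_per_site: population recombination parameter per site.
    c_per_site:   physical recombination rate per site per generation.
    """
    if c_per_site <= 0:
        raise ValueError("c must be positive; loci with c = 0 carry no information on Ne")
    return rho_per_site / (4.0 * c_per_site)


# Hypothetical inputs: rho = 0.04 per site, c = 1e-8 per site per generation.
print(effective_size_from_rho(0.04, 1e-8))
```

The guard on c reflects why loci with an estimated recombination rate of zero have to be left out of such a calculation.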

Figure 2 illustrates the decline of LD with physical distance, using pooled data of all 8 loci for the nonlinear regression model. The expected value of r² decays to negligible levels (i.e., <0.05) within <150 bp for the combined sample in S. peruvianum (fig. 2A) and within <750 bp in S. chilense (fig. 2B). Within individual populations, LD decays less rapidly (about 2- to 4-fold larger distances), which is mainly due to the smaller sample size and (to a lesser extent) lower polymorphism levels for single population samples. The effect of sample size is shown in supplementary figure 1 (Supplementary Material online), for which we created pooled samples of size 12 (i.e., using 4 randomly selected alleles per population); these samples by and large mirror the decay of LD in the most polymorphic single-population samples (with similar sample size). Moreover, we present the decay of LD with physical distance for each locus separately, as shown in supplementary figures 2 and 3 (Supplementary Material online). Generally, the decay of LD is relatively fast in both species, whereas CT208 exhibits somewhat higher levels of LD than the other loci in S. chilense. This is most likely caused by the clinal pattern of SNP and haplotype structure at this locus (see fig. 1). Finally, it does not seem that the decay of LD is much faster at loci with higher recombination rates.


 
FIG. 2.— Plots of the squared correlation of allele frequencies (r²) versus distance (in basepairs) between pairs of polymorphic sites for Solanum peruvianum (2A) and Solanum chilense (2B). The solid lines depict the expected decline in LD for the combined samples (3 populations per species); broken lines are used for individual populations. All lines are based on nonlinear regressions of r² against distance, using the equation of Hill and Weir (1988) given in Materials and Methods.
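For readers who want to explore how such decay curves behave, the sketch below evaluates the expected r² as a function of distance using the form of the Hill and Weir (1988) expectation that is widely used in plant LD studies. This is an assumption on our part: the exact expression fitted here is the one given in Materials and Methods, and the function names and parameter values below are hypothetical.

```python
def expected_r2(distance_bp, rho_per_bp, n_sequences):
    """Expected r^2 between two sites separated by distance_bp.

    C = rho_per_bp * distance_bp is the scaled recombination distance.
    The first factor describes the drift-recombination balance; the
    second corrects for a finite sample of n_sequences chromosomes.
    """
    c = rho_per_bp * distance_bp
    drift_term = (10.0 + c) / ((2.0 + c) * (11.0 + c))
    sampling_term = 1.0 + ((3.0 + c) * (12.0 + 12.0 * c + c * c)) / (
        n_sequences * (2.0 + c) * (11.0 + c)
    )
    return drift_term * sampling_term


def distance_below_threshold(rho_per_bp, n_sequences, threshold=0.05, max_bp=100_000):
    """Smallest whole-bp distance at which expected r^2 < threshold, else None."""
    for distance in range(max_bp + 1):
        if expected_r2(distance, rho_per_bp, n_sequences) < threshold:
            return distance
    return None


# Hypothetical parameters: rho = 0.02 per bp, 36 sampled sequences.
for d in (0, 100, 500, 2000):
    print(d, round(expected_r2(d, 0.02, 36), 4))
```

In this form the expectation approaches roughly 1/n at large distances, so the apparent rate of decay depends on sample size as well as on ρ. In a regression setting, ρ would be the free parameter fitted to the observed pairwise r² values.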

 

    Discussion
 
Levels and Patterns of Nucleotide Diversity
Both studied wild tomato species exhibit substantial levels of nucleotide variation. The species-wide level of silent polymorphism (πsil) across 8 loci is 0.0250 per site in S. peruvianum and 0.0212 in S. chilense. These average values of silent diversity are similar to the value obtained for wild sunflower (~0.0234; Liu and Burke 2006) but higher than estimates for several other outcrossing angiosperms, such as A. lyrata (~0.0140; Wright et al. 2003), Arabidopsis halleri (~0.0150; Ramos-Onsins et al. 2004), maize (~0.0120; Tiffin and Gaut 2001), and P. tremula (~0.0160; Ingvarsson 2005).

For species characterized by population subdivision, the number of demes is an important factor that might influence the species-wide level of variation (Whitlock and Barton 1997; Pannell and Charlesworth 1999; Wakeley 2001; Laporte and Charlesworth 2002). Given that S. peruvianum appears to be more patchily distributed than S. chilense (Rick 1986; Städler T, personal observation), it is not necessarily surprising to find higher levels of nucleotide polymorphism in S. peruvianum than in S. chilense. However, other factors might also contribute to this difference between both species, such as population bottlenecks and/or range or demographic expansions since species divergence. Different levels of local (population-specific) nucleotide diversity can easily be accounted for under models allowing for differences among demes in population size, fraction of immigrants, and contributions to the migrant pool (Wakeley 2001). For example, the low-diversity populations ARE and ANT may be very isolated from other regional demes and, hence, may have received few immigrant genes over appreciable timescales, leading to the partial loss of diversity, haplotype structure, and predictable changes in the site-frequency spectrum (De and Durrett 2007).

Evidence for an Ongoing Selective Sweep in S. chilense
Locus CT208 in S. chilense features an intriguing geographic pattern of nucleotide diversity, in that levels of nucleotide variation gradually diminish from north to south, with essentially no variation in the southernmost sample. Moreover, the northernmost population exhibits many haplotypes distinguished by ancestral variation (SNPs), whereas the southernmost sample is fixed for a derived haplotype. In other words, the frequency of derived variants increases from north to south, with significantly negative H values for the MOQ and TAC populations, where the derived haplotype group is nearly fixed (9:1 and 10:2, respectively, cf. fig. 1). This clinal pattern of nucleotide variation is one of the first such instances in plants that we are aware of. Another example is a recent study of European aspen that showed clinal variation of 4 SNPs, suggestive of an adaptive response in phyB2 to local photoperiodic conditions (Ingvarsson et al. 2006). However, the clinal pattern of variation seen in our study appears to be different, in that many differentiating SNPs are spread out over the entire sequenced portion of locus CT208 (~1.1 kb) but entirely in intronic regions. CT208 encodes a class III alcohol dehydrogenase, and its genomic location is in a region of very low recombination on chromosome 9 (Roselius et al. 2005). The clinal pattern of nucleotide variation is quite different from a "classical" pattern expected under a selective sweep, where most of the neutral variation linked to the selected locus is lost. Generally, a selective sweep scenario can be detected by a reduction in nucleotide diversity (e.g., Maynard Smith and Haigh 1974; Kim and Stephan 2002; Beisswanger et al. 2006; Kane and Rieseberg 2007).

Three alternative possibilities to explain the clinal pattern of variation at CT208 among populations are 1) a nonadaptive scenario involving the retention of ancestral variation (i.e., haplotype groups were already diverged in the ancestral species, perhaps aided by subdivision) and its subsequent spatial patterning due to genetic drift and/or patterns of gene flow within S. chilense, 2) a mutational origin of the derived haplotype and its subsequent spread due to positive selection, and 3) introgression of the derived, presumably beneficial haplotype from an unidentified donor species (or very divergent population). The first scenario cannot be entirely ruled out, but we note that the segregating variation in S. peruvianum is quite unlike that distinguishing the derived haplotype group in S. chilense; hence, one would have to assume the loss of this derived haplotype group in S. peruvianum by stochastic processes, despite its higher effective population size. The second possibility seems unrealistic because divergence appears to be too high for the derived haplotype group to have arisen by mutation within S. chilense. The third scenario amounts to introgression of an adaptive haplotype and its subsequent spread (see Morjan and Rieseberg 2004). Two potential sources of origin for the derived haplotype are the close tomato relatives S. lycopersicoides and Solanum sitiens as both species occur within the geographic range of S. chilense. However, these 2 species are characterized by very strong reproductive isolation from S. chilense (Rick 1979). Moreover, repeated attempts to amplify CT208 from S. lycopersicoides failed, implying some divergence from the haplotypes found in both focal species and S. ochranthum at the primer-binding sites. Thus, no definitive conclusion about the evolutionary processes underlying the observed clinal pattern at CT208 is warranted at present.

Population Structure and Its Consequences
We obtained FST estimates in both species that indicate moderate population structure. Levels of genetic differentiation in wild tomatoes, which are insect-pollinated herbaceous perennials, are close to the mean level found in a meta-analysis of other comparable outcrossing plant species, based on allozyme data (Hamrick and Godt 1996). Furthermore, our FST estimates in both wild tomato species are broadly consistent with sequence-based estimates in A. lyrata (Wright et al. 2003) and in the wind-pollinated tree species P. tremula (Ingvarsson 2005). Life history traits such as the breeding system, life forms, geographic ranges, and seed dispersal mechanisms tend to be associated with different levels of genetic diversity within and differentiation among populations (Gottlieb 1977; Hamrick and Godt 1996). However, the genetic diversity maintained by a species is not only a function of its life history traits but also heavily depends on the ecological and evolutionary history of the species.

Pollen and/or seed dispersal under equilibrium assumptions may not be sufficient to explain patterns of population differentiation in our samples, especially the low differentiation between the northernmost and southernmost populations in S. peruvianum. An alternative to equilibrium gene flow as an explanatory framework for patterns of differentiation rests on the likely importance of soil seed banks and historical associations of tomato populations mediated by climatic cycles and/or range expansion following divergence from their common ancestor (see below). Soil seed banks can have major impacts on effective population size and consequently the maintenance of genetic diversity in plants (Levin 1990; Nunney 2002; Roselius et al. 2005). In addition to population structure per se, they might be a factor prolonging the time needed for local adaptation and/or selective sweeps and more generally in retarding genetic differentiation among populations.

The El Niño Southern Oscillation (ENSO) is a key phenomenon for weather patterns in the tropical Pacific Ocean (DeVries 1987; Tudhope et al. 2001; Tudhope and Collins 2003); these cyclical events are expected to affect seed germination, plant establishment, and other aspects of plant ecology and evolution over large regions of coastal western South America (Gutiérrez and Meserve 2003). Although published data for wild tomatoes are lacking, we conjecture that the effects of ENSO might result in high (transient) levels of connectivity across the species range and might fuel species-wide or regional expansions and contractions.

Perhaps our most interesting finding is that Tajima's DT and Fu and Li's DFL statistics are higher in the samples of the individual populations than in the pooled samples, where the pooled samples show a significant excess of singletons and other low-frequency polymorphisms (differences between among-population means and pooled-sample estimates of DT and DFL are –0.89 and –1.31 in S. peruvianum and –0.87 and –1.53 in S. chilense, respectively; table 3). We suggest that this excess of singletons is a consequence of the interplay of population structure and demographic effects in both wild tomato species but in ways not generally understood in the literature. Based on simulations of the island model, it is frequently stated that the pooling of samples from different populations should increase the proportion of intermediate-frequency polymorphisms (e.g., Pannell 2003; Ingvarsson 2005). However, this expectation of a more positive Tajima's D upon pooling should be restricted to high levels of population subdivision, where pooling leads to long internal branches of the gene genealogies with fixed differences between sequences obtained from different demes. For our data, the absence of fixed nucleotide differences between populations implies that pooling the population samples does not result in additional internal branches of the genealogy that would account for higher proportions of intermediate-frequency polymorphism. Rather, pooling of samples yields an excess of singletons as most of the singletons within local populations still remain singletons in the pooled sample.
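To make the link between the site-frequency spectrum and the sign of Tajima's D concrete, the following minimal sketch computes D from aligned haplotypes using the standard formula of Tajima (1989). It assumes biallelic sites coded as single characters; the function name and the two toy samples are hypothetical and serve only to show that an excess of singletons gives a negative D, whereas intermediate-frequency variants give a positive D.

```python
import math


def tajimas_d(haplotypes):
    """Tajima's D for equal-length haplotype strings with biallelic sites."""
    n = len(haplotypes)
    pi = 0.0          # mean number of pairwise differences
    seg_sites = 0     # number of segregating sites, S
    for column in zip(*haplotypes):
        k = column.count(column[0])  # copies of the first sequence's allele
        if k < n:
            seg_sites += 1
            pi += 2.0 * k * (n - k) / (n * (n - 1))
    if seg_sites == 0:
        return math.nan  # D is undefined without polymorphism

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # Difference between the two estimators of theta, scaled by its
    # estimated standard deviation.
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)


# Toy samples (hypothetical): every variant a singleton vs. every variant at 50%.
singleton_sample = ["10000", "01000", "00100", "00010", "00001", "00000"]
balanced_sample = ["111", "111", "111", "000", "000", "000"]
print(tajimas_d(singleton_sample), tajimas_d(balanced_sample))
```

Applying such a function to each population sample and to the concatenated (pooled) sample is one way to quantify the difference between among-population means and pooled estimates discussed here.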

Although these features of our data can explain why Tajima's D for the pooled sample is not elevated compared with single populations, the critical observation of an excess of singletons compared with neutral equilibrium expectations (i.e., strongly negative DT and DFL) requires an additional explanation. More strongly negative DT values for pooled samples were noted in many previous studies (e.g., Ptak and Przeworski 2002; Ingvarsson 2005; Liu and Burke 2006; Pool and Aquadro 2006; Moeller et al. 2007), with several authors concluding that population structure per se is largely responsible for the skew in the site-frequency spectrum (e.g., Moeller et al. 2007). However, we argue that the effect of population structure, in interaction with the sampling scheme, is to partially mask the impact of regional or species-wide demographic history on the site-frequency spectrum, as explained below.

Simulating range expansions under a stepping-stone model, Ray et al. (2003) found that the extent of population subdivision (i.e., low or high migration rate) has a marked impact on whether the signal of species-wide expansion can be recovered from single-deme samples. Simulated gene genealogies were star-like under high rates of migration (i.e., approaching samples from unstructured populations undergoing expansion) but were like those expected under demographic equilibrium under low levels of migration (i.e., the frequency spectrum did not reflect the actual species-wide demography). Analyzing an infinite-island model analytically, Excoffier (2004) obtained very similar results. Complementary simulation work on the joint effects of population subdivision and the sampling scheme on patterns of polymorphism and the site-frequency spectrum (under constant effective population size) has demonstrated essentially the same phenomena (De and Durrett 2007). In their simulations, samples drawn from single demes were characterized by positive Tajima's D and higher levels of LD compared with species-wide samples (single sequences each taken from many demes), with the effect being stronger under stepping-stone than under island spatial structure.

All these results can be understood in terms of the underlying sample genealogies and point toward the importance of sampling schemes in drawing inferences from patterns of sequence polymorphism. The "pooling effect" (i.e., lower or more negative DT and DFL for pooled population samples) on the one hand reflects the fact that depending on their degree of isolation, local samples tend to harbor related sequences (Wakeley 2001; Ray et al. 2003). On the other hand, pooling samples from distinct populations is akin to increasing the proportion of (unrelated) migrants and should result in genealogies intermediate between those characterizing single demes and those found for species-wide samples. Importantly, coalescent simulations using the ms program (Hudson 2002) suggest that in general, strongly negative Tajima's D values can only be observed under population expansion and thus cannot be accounted for by population subdivision per se (Merino C, Pfaffelhuber P, Stephan W, Städler T, unpublished data). This does not rule out possible contributions of purifying selection (resulting in a more skewed site-frequency spectrum for nonsynonymous SNPs or other sequences under functional constraints) to such a pattern in particular data sets (e.g., Nordborg et al. 2005; Pool and Aquadro 2006; Moeller et al. 2007).

Recombination and the Decay of LD
Intragenic LD decays rapidly to very low levels (r² < 0.05) within a few hundred basepairs in both species of wild tomatoes. This is true even at the loci with very low estimates of the physical recombination rate (CT093 and CT208; Stephan and Langley 1998; Roselius et al. 2005), suggesting considerable levels of recombination at all loci, as also shown by appreciable Rm estimates in both species. Hence, the physical recombination rate at some loci might be underestimated. Moreover, high levels of haplotype diversity in both species also imply sufficient levels of recombination, which consequently result in the rapid decline of LD in both species (fig. 2 and table 6).

The difference in LD observed across species is a result of the interplay of many factors, such as recombination rate, mating system, selection, effective population size, and population structure (reviewed by Rafalski and Morgante 2004). Among these factors, high recombination rate, obligate outcrossing, and high Ne in concert provide the most plausible explanation for the rapid decline of LD in our samples, whereas the more extensive LD seen at locus CT208 in S. chilense is easily explained by the unusual haplotype structure within and among populations (i.e., clinal pattern of variation). Overall, the rapid decay of LD in wild tomatoes is comparable to that previously documented in several outcrossing plant species. For example, LD decays to about 50% within 2 kbp in loblolly pine (Brown et al. 2004), within a few hundred basepairs in Norway spruce (Heuertz et al. 2006), 0.2–1.5 kbp in maize (Remington et al. 2001; Tenaillon et al. 2001), to negligible levels within 250 bp in European aspen (Ingvarsson 2005) and within 200 bp in wild sunflower (Liu and Burke 2006).

In plants in particular, mating system and recombination rate are important factors that affect the decay of LD with distance. The interaction between recombination and mating system can increase or decrease levels of LD. Physical recombination is less effective in selfing populations where individuals are more likely to be homozygous (Nordborg and Donnelly 1997); therefore, LD will be more extensive in selfing than in outcrossing populations. Indeed, under a primarily selfing mating system, significant LD extends to 250 kbp in A. thaliana (Nordborg et al. 2002, 2005), to 100 kbp in rice (Garris et al. 2003), to >50 kbp in soybean (Zhu et al. 2003), and to about 70 kbp in potato (outcrossing species but usually vegetatively propagated; Simko et al. 2006).


    Conclusion
 
Given that wild tomatoes are subdivided species, our assessment of population structure allowed us to discover and interpret several patterns of variability. First, analyzing a few genuine population samples enabled us to find and interpret the clinal pattern of variability at CT208 as a potential signature of an ongoing selective sweep in S. chilense. Sampling only a single population or, alternatively, single individuals from many demes across the species range might have entirely missed this signature. We propose that population structure is one of the most important evolutionary forces that shaped patterns of nucleotide diversity within and among populations in these wild tomatoes. Our finding that pooling samples from different populations yields an excess of low-frequency variants was instrumental in inferring population expansion for both species and, more generally, in making sense of the differences in sample genealogies under schemes ranging from single-deme to species-wide sampling. One important implication is that one may obtain signals of the species-wide demographic history not only from species-wide samples but also by sampling a few populations; the magnitude of the pooling effect and the pooled site-frequency spectrum (in the form of DT and DFL estimates) should be conservative indicators of a population (or range) expansion. Finally, the rapid decay of LD in both species is very useful for high-resolution mapping in association studies, provided that appropriate candidate genes are chosen. For this purpose, however, a high-density marker screening would be needed.


    Supplementary Material
 
Supplementary tables 1 and 2 and figures 1–3 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).


    Acknowledgements
 
We are most grateful to Gertraud Feldmaier-Fuchs and Tobias Marczewski for excellent laboratory assistance and to Carlos Merino and Peter Pfaffelhuber for implementing the ms coalescent simulations. The constructive comments of 3 anonymous referees helped to improve the final version of this article. We are thankful to Asunción Cano for vital logistic and administrative help in Lima and to Gabriel Clostre and T. Marczewski for assistance throughout the collection trip in Peru. Collection and export of tomato samples were made possible through permits issued by the Peruvian "Instituto Nacional de Recursos Naturales" (INRENA), authorization numbers 020–2004-INRENA-IFFS-DCB and 003754-AG-INRENA. This work was funded by the "Deutsche Forschungsgemeinschaft" through its Priority Program "Radiations—Origins of Biological Diversity" (SPP-1127), grant Ste 325/5 to W.S., and a "Deutscher Akademischer Austauschdienst" fellowship to U.A.


    Footnotes
 
1 Present address: Department of Genetics, Kasetsart University, Bangkok, Thailand.

Jody Hey, Associate Editor


    References
 

    Baudry E, Kerdelhué C, Innan H, Stephan W. Species and recombination effects on DNA variability in the tomato genus. Genetics (2001) 158:1725–1735.

    Beisswanger S, Stephan W, De Lorenzo D. Evidence for a selective sweep in the wapl region of Drosophila melanogaster. Genetics (2006) 172:265–274.

    Brown GR, Gill GP, Kuntz RJ, Langley CH, Neale DB. Nucleotide diversity and linkage disequilibrium in loblolly pine. Proc Natl Acad Sci USA (2004) 101:15255–15260.

    Clauss MJ, Mitchell-Olds T. Population genetic structure of Arabidopsis lyrata in Europe. Mol Ecol (2006) 15:2753–2766.

    Das A, Mohanty S, Stephan W. Inferring the population structure and demography of Drosophila ananassae from multilocus data. Genetics (2004) 168:1975–1985.

    De A, Durrett R. Stepping-stone spatial structure causes slow decay of linkage disequilibrium and shifts the site frequency spectrum. Genetics (2007) 176:969–981.

    DeVries TJ. A review of geological evidence for ancient El Niño activity in Peru. J Geophys Res (1987) 92:14471–14479.

    Excoffier L. Patterns of DNA sequence diversity and genetic structure after a range expansion: lessons from the infinite-island model. Mol Ecol (2004) 13:853–864.

    Excoffier L, Laval G, Schneider S. Arlequin ver. 3.0: an integrated software package for population genetics data analysis. Evol Bioinform Online (2005) 1:47–50.

    Excoffier L, Smouse PE, Quattro JM. Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics (1992) 131:479–491.

    Fay JC, Wu C-I. Hitchhiking under positive Darwinian selection. Genetics (2000) 155:1405–1413.

    Flint-Garcia SA, Thornsberry JM, Buckler ES. Structure of linkage disequilibrium in plants. Annu Rev Plant Biol (2003) 54:357–374.

    Fu Y-X, Li W-H. Statistical tests of neutrality of mutations. Genetics (1993) 133:693–709.

    Ganal MW, Czihal R, Hannappel U, Kloos D-U, Polley A, Ling H-Q. Sequencing of cDNA clones from the genetic map of tomato (Lycopersicon esculentum). Genome Res (1998) 8:842–847.

    Garris AJ, McCouch SR, Kresovich S. Population structure and its effect on haplotype diversity and linkage disequilibrium surrounding the xa5 locus of rice (Oryza sativa L.). Genetics (2003) 165:759–769.

    Glinka S, Ometto L, Mousset S, Stephan W, De Lorenzo D. Demography and natural selection have shaped genetic variation in Drosophila melanogaster: a multi-locus approach. Genetics (2003) 165:1269–1278.

    Gottlieb LD. Electrophoretic evidence and plant systematics. Ann Mo Bot Gard (1977) 64:161–180.

    Gupta PK, Rustgi S, Kulwal PL. Linkage disequilibrium and association studies in higher plants: present status and future prospects. Plant Mol Biol (2005) 57:461–485.

    Gutiérrez JR, Meserve PL. El Niño effects on soil seed bank dynamics in north-central Chile. Oecologia (2003) 134:511–517.

    Hamrick JL, Godt MJW. Effects of life history traits on genetic diversity in plant species. Phil Trans R Soc Lond B (1996) 351:1291–1298.

    Helgason A, Yngvadóttir B, Hrafnkelsson B, Gulcher J, Stefánsson K. An Icelandic example of the impact of population structure on association studies. Nat Genet (2005) 37:90–95.

    Heuertz M, De Paoli E, Källman T, Larsson H, Jurman I, Morgante M, Lascoux M, Gyllenstrand N. Multilocus patterns of nucleotide diversity, linkage disequilibrium and demographic history of Norway spruce (Picea abies (L.) Karst). Genetics (2006) 174:2095–2105.

    Hill WG, Robertson A. Linkage disequilibrium in finite populations. Theor Appl Genet (1968) 38:226–231.

    Hill WG, Weir BS. Variances and covariances of squared linkage disequilibria in finite populations. Theor Popul Biol (1988) 33:54–78.

    Hudson RR. A new statistic for detecting genetic differentiation. Genetics (2000) 155:2011–2014.

    Hudson RR. Two-locus sampling distributions and their application. Genetics (2001) 159:1805–1817.

    Hudson RR. Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics (2002) 18:337–338.

    Hudson RR, Kaplan NL. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics (1985) 111:147–164.

    Hudson RR, Kreitman M, Aguadé M. A test of neutral molecular evolution based on nucleotide data. Genetics (1987) 116:153–159.

    Hudson RR, Slatkin M, Maddison WP. Estimation of levels of gene flow from DNA sequence data. Genetics (1992) 132:583–589.

    Ingvarsson PK. Nucleotide polymorphism and linkage disequilibrium within and among natural populations of European aspen (Populus tremula L. Salicaceae). Genetics (2005) 169:945–953.

    Ingvarsson PK, García MV, Hall D, Luquez V, Jansson S. Clinal variation in phyb2, a candidate gene for day-length-induced growth cessation and bud set, across a latitudinal gradient in European aspen (Populus tremula). Genetics (2006) 172:1845–1853.

    Kane NC, Rieseberg LH. Selective sweeps reveal candidate genes for adaptation to drought and salt tolerance in common sunflower, Helianthus annuus. Genetics (2007) 175:1823–1834.

    Kelly JK. A test of neutrality based on interlocus associations. Genetics (1997) 146:1197–1206.

    Kim Y, Stephan W. Detecting a local signature of genetic hitchhiking along a recombining chromosome. Genetics (2002) 160:765–777.

    Kraakman AT, Niks RE, Van den Berg PM, Stam P, Van Eeuwijk FA. Linkage disequilibrium mapping of yield and yield stability in modern spring barley cultivars. Genetics (2004) 168:435–446.

    Laporte V, Charlesworth B. Effective population size and population subdivision in demographically structured populations. Genetics (2002) 162:501–519.

    Levin DA. The seed bank as a source of genetic novelty in plants. Am Nat (1990) 135:563–572.

    Lin JZ, Brown AHD, Clegg MT. Heterogeneous geographic patterns of nucleotide sequence diversity between two alcohol dehydrogenase genes in wild barley (Hordeum vulgare subspecies spontaneum). Proc Natl Acad Sci USA (2001) 98:531–536.

    Liu A, Burke JM. Patterns of nucleotide diversity in wild and cultivated sunflower. Genetics (2006) 173:321–330.

    Ma X-F, Szmidt AE, Wang X-R. Genetic structure and evolutionary history of a diploid hybrid pine Pinus densata inferred from the nucleotide variation at seven gene loci. Mol Biol Evol (2006) 23:807–816.

    Maynard Smith J, Haigh J. The hitch-hiking effect of a favourable gene. Genet Res (1974) 23:23–35.

    McVean G, Awadalla P, Fearnhead P. A coalescent-based method for detecting and estimating recombination from gene sequences. Genetics (2002) 160:1231–1241.

    Moeller DA, Tenaillon MI, Tiffin P. Population structure and its effects on patterns of nucleotide polymorphism in teosinte (Zea mays ssp. parviglumis). Genetics (2007) 176:1799–1809.

    Morjan CL, Rieseberg LH. How species evolve collectively: implications of gene flow and selection for the spread of advantageous alleles. Mol Ecol (2004) 13:1341–1356.

    Nei M. Molecular evolutionary genetics (1987) New York: Columbia University Press.

    Nordborg M, Borevitz JO, Bergelson J. (12 co-authors). The extent of linkage disequilibrium in Arabidopsis thaliana. Nat Genet (2002) 30:190–193.

    Nordborg M, Donnelly P. The coalescent process with selfing. Genetics (1997) 146:1185–1195.

    Nordborg M, Hu TT, Ishino Y. (24 co-authors). The pattern of polymorphism in Arabidopsis thaliana. PLoS Biol (2005) 3:1289–1299.

    Nunney L. The effective size of annual plant populations: the interaction of a seed bank with fluctuating population size in maintaining genetic variation. Am Nat (2002) 160:195–204.

    Olmstead RG, Sweere JA, Spangler RE, Bohs L, Palmer JD. Phylogeny and provisional classification of the Solanaceae based on chloroplast DNA. In: Solanaceae IV, advances in biology and utilization—Nee M, Symon DE, Lester RN, Jessop JP, eds. (1999) Kew: Royal Botanical Gardens. 111–137.

    Ometto L, Glinka S, De Lorenzo D, Stephan W. Inferring the effects of demography and selection on Drosophila melanogaster populations from a chromosome-wide scan of DNA variation. Mol Biol Evol (2005) 22:2119–2130.

    Ostrowski MF, David J, Santoni S. (11 co-authors). Evidence for a large-scale population structure among accessions of Arabidopsis thaliana: possible causes and consequences for the distribution of linkage disequilibrium. Mol Ecol (2006) 15:1507–1517.

    Pannell JR. Coalescence in a metapopulation with recurrent local extinction and recolonization. Evolution (2003) 57:949–961.

    Pannell JR, Charlesworth B. Neutral genetic diversity in a metapopulation with recurrent local extinction and recolonization. Evolution (1999) 53:664–676.

    Peralta IE, Spooner DM. Granule-bound starch synthase (GBSSI) gene phylogeny of wild tomatoes (Solanum L. section Lycopersicon [Mill.] Wettst. Subsection Lycopersicon). Am J Bot (2001) 88:1888–1902.

    Pool JE, Aquadro CF. History and structure of sub-Saharan populations of Drosophila melanogaster. Genetics (2006) 174:915–929.

    Ptak SE, Przeworski M. Evidence for population growth in humans is confounded by fine-scale population structure. Trends Genet (2002) 18:559–563.

    Rafalski A, Morgante M. Corn and humans: recombination and linkage disequilibrium in two genomes of similar size. Trends Genet (2004) 20:103–111.

    Ramos-Onsins SE, Stranger BE, Mitchell-Olds T, Aguadé M. Multilocus analysis of variation and speciation in the closely related species Arabidopsis halleri and A. lyrata. Genetics (2004) 166:373–388.

    Ray N, Currat M, Excoffier L. Intra-deme molecular diversity in spatially expanding populations. Mol Biol Evol (2003) 20:76–86.

    Remington DL, Thornsberry JM, Matsuoka Y, Wilson LM, Whitt SR, Doebley J, Kresovich S, Goodman MM, Buckler ES. Structure of linkage disequilibrium and phenotypic associations in the maize genome. Proc Natl Acad Sci USA (2001) 98:11479–11484.

    Rick CM. Biosystematic studies in Lycopersicon and closely related species of Solanum. In: The biology and taxonomy of the Solanaceae—Hawkes JG, Lester RN, Skelding AD, eds. (1979) London: Academic Press. 667–678.

    Rick CM. Reproductive isolation in the Lycopersicon peruvianum complex. In: Solanaceae—biology and systematics—D'Arcy WG, ed. (1986) New York: Columbia University Press. 477–495.

    Roselius K, Stephan W, Städler T. The relationship of nucleotide polymorphism, recombination rate and selection in wild tomato species. Genetics (2005) 171:753–763.

    Rozas J, Sánchez-DelBarrio JC, Messeguer X, Rozas R. DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics (2003) 19:2496–2497.

    Schmid KJ, Ramos-Onsins SE, Ringys-Beckstein H, Weisshaar B, Mitchell-Olds T. A multilocus sequence survey in Arabidopsis thaliana reveals a genome-wide departure from a neutral model of DNA sequence polymorphism. Genetics (2005) 169:1601–1615.

    Schmid KJ, Törjék O, Meyer R, Schmuths H, Hoffmann MH, Altmann T. Evidence for a large-scale population structure of Arabidopsis thaliana from genome-wide single nucleotide polymorphism markers. Theor Appl Genet (2006) 112:1104–1114.

    Sharbel TF, Haubold B, Mitchell-Olds T. Genetic isolation by distance in Arabidopsis thaliana: biogeography and postglacial colonization of Europe. Mol Ecol (2000) 9:2109–2118.

    Simko I, Haynes KG, Jones RW. Assessment of linkage disequilibrium in potato genome with single nucleotide polymorphism markers. Genetics (2006) 173:2237–2245.

    Spooner DM, Anderson GJ, Jansen RK. Chloroplast DNA evidence for the interrelationships of tomatoes, potatoes, and pepinos (Solanaceae). Am J Bot (1993) 80:676–688.

    Spooner DM, Peralta IE, Knapp S. Comparison of AFLPs with other markers for phylogenetic inference in wild tomatoes [Solanum L. section Lycopersicon (Mill.) Wettst.]. Taxon (2005) 54:43–61.

    Städler T, Roselius K, Stephan W. Genealogical footprints of speciation processes in wild tomatoes: demography and evidence for historical gene flow. Evolution (2005) 59:1268–1279.

    Stephan W, Langley CH. DNA polymorphism in Lycopersicon and crossing-over per physical length. Genetics (1998) 150:1585–1593.

    Tajima F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics (1989) 123:585–595.

    Tanksley SD, Ganal MW, Prince JP. (19 co-authors). High density molecular linkage map of the tomato and potato genomes. Genetics (1992) 132:1141–1160.

    Taylor IB. Biosystematics of the tomato. In: The tomato crop: a scientific basis for improvement—Atherton JG, Rudich J, eds. (1986) London: Chapman & Hall. 1–34.

    Tenaillon MI, Sawkins MC, Long AD, Gaut RL, Doebley JF, Gaut BS. Patterns of DNA sequence polymorphism along chromosome 1 of maize (Zea mays ssp. mays L.). Proc Natl Acad Sci USA (2001) 98:9161–9166.

    Tenaillon MI, U'Ren J, Tenaillon O, Gaut BS. Selection versus demography: a multilocus investigation of the domestication process in maize. Mol Biol Evol (2004) 21:1214–1225.

    Tero N, Aspi J, Siikamäki P, Jäkäläniemi A, Tuomi J. Genetic structure and gene flow in a metapopulation of an endangered plant species, Silene tatarica. Mol Ecol (2003) 12:2073–2085.

    Tiffin P, Gaut BS. Sequence diversity in the tetraploid Zea perennis and the closely related diploid Z. diploperennis: insights from four nuclear loci. Genetics (2001) 158:401–412.

    Tudhope AW, Chilcott CP, McCulloch MT, Cook ER, Chappell J, Ellam RM, Lea DW, Lough JM, Shimmield GB. Variability in the El Niño-southern oscillation through a glacial-interglacial cycle. Science (2001) 291:1511–1517.

    Tudhope S, Collins M. Global change: the past and future of El Niño. Nature (2003) 424:261–262.

    Wakeley J. Nonequilibrium migration in human history. Genetics (1999) 153:1863–1871.

    Wakeley J. The coalescent in an island model of population subdivision with variation among demes. Theor Popul Biol (2001) 59:133–144.

    Wakeley J, Aliacar N. Gene genealogies in a metapopulation. Genetics (2001) 159:893–905.

    Watterson GA. On the number of segregating sites in genetical models without recombination. Theor Popul Biol (1975) 7:256–276.

    Whitlock MC, Barton NH. The effective size of a subdivided population. Genetics (1997) 146:427–441.

    Wright SI, Bi IV, Schroeder SG, Yamasaki M, Doebley JF, McMullen MD, Gaut BS. The effects of artificial selection on the maize genome. Science (2005) 308:1310–1314.

    Wright SI, Lauga B, Charlesworth D. Subdivision and haplotype structure in natural populations of Arabidopsis lyrata. Mol Ecol (2003) 12:1247–1263.

    Zhu YL, Song QJ, Hyten DL, Van Tassell CP, Matukumalli LK, Grimm DR, Hyatt SM, Fickus EW, Young ND, Cregan PB. Single-nucleotide polymorphisms in soybean. Genetics (2003) 163:1123–1134.

Accepted for publication July 31, 2007.




