MBE Advance Access originally published online on January 11, 2007
Molecular Biology and Evolution 2007 24(3):875-888; doi:10.1093/molbev/msm005
© The Author 2007. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org

Research Articles

Multilocus Analysis of Nucleotide Variation of Oryza sativa and Its Wild Relatives: Severe Bottleneck during Domestication of Rice

Qihui Zhu*,†, Xiaoming Zheng*, Jingchu Luo†, Brandon S. Gaut‡ and Song Ge*

* State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, China
† Center for Bioinformatics, State Key Laboratory of Plant Genetic Engineering and Protein Engineering and College of Life Sciences, Peking University, Beijing, China
‡ Department of Ecology and Evolutionary Biology, University of California–Irvine

E-mail: gesong@ibcas.ac.cn.


    Abstract
Varying degrees of reduction of genetic diversity in crops relative to their wild progenitors occurred during the process of domestication. Such information, however, has not been available for the Asian cultivated rice (Oryza sativa) despite its importance as a staple food and a model organism. To reveal levels and patterns of nucleotide diversity and to elucidate the genetic relationship and demographic history of O. sativa and its close relatives (Oryza rufipogon and Oryza nivara), we investigated nucleotide diversity data from 10 unlinked nuclear loci in species-wide samples of these species. The results indicated that O. rufipogon and O. nivara possessed levels of nucleotide variation (θsil = 0.0077–0.0095) comparable with those of the relatives of other crops. In contrast, nucleotide diversity of O. sativa was as low as θsil = 0.0024 and even lower (θsil = 0.0021 for indica and 0.0011 for japonica) if we consider the 2 subspecies separately. Overall, only about 20% and 10% of the diversity in the wild species was retained in the 2 subspecies of the cultivated rice, indica and japonica, respectively. Because statistical tests did not reject the assumption of neutrality for all 10 loci, we further used coalescent simulations to model bottlenecks of various durations and population sizes to better understand the domestication process. Consistent with the dramatic reduction in nucleotide diversity, we detected a severe domestication bottleneck and demonstrated that the sequence diversity currently found in the rice genome could be explained by a founding population of 1,500 individuals if the initial domestication event occurred over a 3,000-year period. Phylogenetic analyses revealed close genetic relationships and an ambiguous species boundary between O. rufipogon and O. nivara, providing additional evidence to treat them as 2 ecotypes of a single species. The lowest linkage disequilibrium (LD) was found in the perennial O. rufipogon, where the r² value dropped to a negligible level within 400 bp, and the highest in the japonica rice, where LD extended across the entire sequenced region (~900 bp), implying that LD mapping by genome scans may not be feasible in wild rice due to the high density of markers needed.

Key Words: nucleotide diversity • domestication • bottleneck • linkage disequilibrium • Oryza sativa • Oryza rufipogon


    Introduction
Domestication has long been an area of great interest in the study of evolution (Darwin 1859) and for understanding the process of mutual dependence between human societies and plants and animals (Zeder et al. 2006). Varying degrees of reduction of genetic diversity in crops relative to their wild progenitors occurred during the process of domestication (Tanksley and McCouch 1997; Zeder et al. 2006). This reduction in genetic diversity has resulted partly from small initial populations in crops relative to their wild ancestors and partly from intense selection for agronomic traits (Eyre-Walker et al. 1998; Clark et al. 2004; Zeder et al. 2006). The former effect, often referred to as the "domestication bottleneck" (Eyre-Walker et al. 1998), is characterized by a genome-wide loss of genetic diversity, whereas selection is locus specific and acts on regions of the genome that are tightly linked to the sites that are the targets of selection (Clark et al. 2004; Zeder et al. 2006). In addition, domestication may also have a major impact on the organization of genetic diversity within the genome because population bottlenecks and selection may elevate linkage disequilibrium (LD) throughout the genome (Nordborg 2000). Thus, knowledge of the process and dynamics of crop domestication is important, both because domestication bottlenecks limit genetic variation in crops (Eyre-Walker et al. 1998; Buckler et al. 2001) and because any approach that analyzes variation in populations, such as association mapping, should be based on population genetic parameters (Eyre-Walker et al. 1998; Liu and Burke 2006).

In recent decades, molecular population genetics has been widely used to reveal the patterns of genetic diversity within and between populations and to trace the histories of divergence and speciation in plants (Eyre-Walker et al. 1998; Hilton and Gaut 1998; Savolainen et al. 2000; Tiffin and Gaut 2001; Olsen and Purugganan 2002; Ramos-Onsins et al. 2004; Wright and Gaut 2005). Because natural selection generally acts on some but not all genes, it is possible to differentiate the effects of natural selection and demography using multilocus sequence data (Nordborg and Innan 2002; Wright and Gaut 2005). Although many studies on nucleotide variation of plant species have been conducted, mainly focusing on the model plant Arabidopsis and several crops (see review in Wright and Gaut 2005), relatively few investigations have been undertaken to examine the changes in nucleotide diversity and to reveal the demographic history of crops and their wild relatives (Eyre-Walker et al. 1998; White and Doebley 1999; Hamblin et al. 2004; Tenaillon et al. 2004; White et al. 2004; Wright et al. 2005; Liu and Burke 2006).

Asian cultivated rice (Oryza sativa L.) is an economically important crop that is the staple food for more than one-half of the world's population. It includes 2 major subspecies: Oryza sativa ssp. indica and Oryza sativa ssp. japonica. Differences between the subspecies are apparent both in a number of physiological and morphological traits and in growth habitats (Oka 1988; Garris et al. 2005). Geographically and ecologically, indica is primarily known as lowland rice and grown throughout tropical Asia, whereas japonica is typically found in temperate East Asia, upland areas of Southeast Asia, and high elevations in South Asia (Oka 1988; Khush 1997). Two Asian wild species, Oryza rufipogon Griff. and O. nivara Sharma et Shastry, are most closely related to and have thus been considered as the progenitors of O. sativa (Vaughan 1989; Khush 1997). The perennial O. rufipogon is widely distributed in southern China, South and Southeast Asia, and northern Australia and inhabits wet habitats throughout its growing season. In contrast, the annual O. nivara is mainly found in South and Southeast Asia and occurs in seasonally dry/wet areas such as ponds, swamps, and the vicinity of rice fields with shallow water (Vaughan 1989). Because life history traits characterizing these 2 wild species vary continuously in nature and segregate in an F2 population (Morishima 2001), they have been treated mostly as 2 different ecotypes (Oka 1988; Morishima et al. 1992; Lu et al. 2000; Cheng et al. 2003) or subspecies (Vaughan and Morishima 2003) under a single species (O. rufipogon). It is increasingly appreciated that O. sativa and its wild relatives (O. nivara and O. rufipogon) form a large species complex together with weedy races from introgression and hybridization between rice and its wild relatives (Vaughan 1989; Lu et al. 2000; Morishima 2001).

Because of the agronomic and theoretical importance of rice, extensive studies have been conducted on the genetic diversity, systematics, and evolution of rice and its relatives (see reviews in Oka 1988; Khush 1997; Ge et al. 2001; Vaughan et al. 2003; Zhu and Ge 2005). It has been well established that the Asian cultivated rice originated from its wild relative, O. rufipogon sensu lato (including both annual and perennial ecotypes) (Khush 1997; Vaughan et al. 2003), though the number and precise geographical location of domestications are still controversial (Morishima 2001; Zhu and Ge 2005; Londo et al. 2006). A growing number of studies have characterized nucleotide variation of cultivated rice and its wild relatives (Barbier et al. 1991; Olsen and Purugganan 2002; Garris et al. 2003; Yoshida and Miyashita 2005; Olsen et al. 2006). These investigations, however, were mainly focused on a single species and exclusively based on 1 or 2 genes or multiple linked genes. In this study, we provide the first investigation of the levels and patterns of genetic variation and relationships of the Asian cultivated rice and its close relatives based on sequence data of multiple nuclear loci. Our specific objectives were 1) to compare nucleotide variation of O. sativa and its wild progenitors with other major crops; 2) to elucidate the genetic relationship between O. rufipogon and O. nivara; 3) to study population demography and explore the severity of the population bottleneck during rice domestication; and 4) to characterize the level of LD, which is a key issue in the design of fine-scale mapping of agronomically important genes. Addressing these questions not only provides important insights into the domestication process and the evolution of cultivated rice but also facilitates the effective use of the wild rice germplasm.


    Materials and Methods
Plant Materials
A total of 30 accessions of cultivated rice (16 O. sativa ssp. indica and 14 O. sativa ssp. japonica) and 30 wild individuals (18 O. rufipogon and 12 O. nivara) were sampled (table 1). The cultivated accessions originated from 21 countries and included all 5 distinct varietal groups (indica, aus, aromatic, temperate japonica, and tropical japonica) (Garris et al. 2005) to represent the broad genetic diversity of Asian cultivated rice. The wild individuals were sampled to cover the entire distribution range of the 2 wild species. We also included one Oryza barthii accession from Africa as an outgroup because this species is closely related to the species complex of O. sativa and its relatives (Zhu and Ge 2005). Total DNA was extracted from fresh or silica gel–dried leaves, using the hexadecyltrimethylammonium bromide method as described in Ge et al. (1999).



 
Table 1 Plant Materials Used in This Study

 
Polymerase Chain Reaction Amplification, Cloning, and Sequencing
The genes used in this study represent 10 unlinked nuclear loci on 9 different chromosomes in rice. Detailed information about the location and functional association as well as the primer sequences of the amplified regions is provided in table 2. Polymerase chain reaction (PCR) amplification was performed in a total volume of 25 µl using a T-gradient 96U thermocycler (Biometra, Göttingen, Germany). The reaction mix contained 5–50 ng of template DNA, 0.2 µmol/l of each primer, 200 µmol/l of each deoxyribonucleotide triphosphate, 10 mmol/l Tris-Cl (pH 8.3), 50 mmol/l KCl, 25 mmol/l MgCl2, 1.5% dimethyl sulfoxide, and a mixture of 0.5 U Pfu polymerase (Promega, Madison, WI) and 0.5 U Ex Taq DNA polymerase (TaKaRa, Shiga, Japan). Amplification was carried out as follows: 2 min at 94 °C followed by 38 cycles of 30 s at 94 °C, 30–50 s at 54 °C, and 90 s at 72 °C, and a final extension at 72 °C for 10 min. Because of the high Tm value and short PCR product of Waxy, a 2-step PCR was used for this gene. The conditions included denaturation at 95 °C for 2 min, 38 cycles composed of denaturation at 95 °C for 45 s and annealing and extension at 68 °C for 90 s, and a final extension at 72 °C for 7 min. Amplification products were separated by electrophoresis on 1.5% agarose gels stained with ethidium bromide using a 100-bp DNA ladder and gel purified with a Pharmacia purification kit (Amersham Pharmacia Biotech, Piscataway, NJ). Sequencing was done on a MegaBACE 1000 automatic DNA sequencer (Amersham Pharmacia Biotech) after the reaction products were purified through precipitation with 95% ethanol and 3 M sodium acetate (pH 5.2).



 
Table 2 Summary of the Genes Surveyed and the Primer Sequences Used in the Study

 
For the cultivated rice, we sequenced PCR products directly on both strands. For wild O. rufipogon and O. nivara, in which either homozygous or heterozygous individuals exist, PCR fragments were cloned into pGEM T-easy vectors (Promega) after purification with either a Pharmacia purification kit (Amersham Pharmacia Biotech) or a Dingguo purification kit (Dingguo, Beijing, China). Independent plasmid DNAs were selected randomly and at least 6 clones were sequenced individually using the DYEnamic ET Terminator Kit (Amersham Pharmacia Biotech), following the manufacturer's protocol. Because Taq errors occur at random, it is unlikely that polymorphisms shared among more than 1 clone (sequence) are artificial (Eyre-Walker et al. 1998; Hilton and Gaut 1998). However, "singletons," that is, polymorphisms occurring in only 1 sequence relative to all the remaining sequences, can represent either true sequence variation or Taq polymerase artifacts. To confirm the singletons, we repeated PCR amplification, cloning, and sequencing and finally excluded those singletons resulting from Taq polymerase error. By means of multiclone sequencing and reamplifying and resequencing, interallelic PCR recombinants were also identified and removed. All sequences were deposited in GenBank, with accession numbers EF069438–EF070112.

Sequence Analysis
Initial sequence data were assembled with the ContigExpress program (Informax Inc. 2000, North Bethesda, MD). Allele sequences were aligned using a combination of methods implemented in BioEdit version 7.0.1 (Hall 1999) and ClustalX 1.81 (Thompson et al. 1997), with further manual refinements. Sequences were edited with DAMBE version 4.1.19 (Xia and Xie 2001). Insertions/deletions (indels) were not included in the analysis. For each locus and taxon, we calculated the number of segregating sites (S), the number of haplotypes (h), as well as 2 parameters of nucleotide diversity: π, the expected heterozygosity per nucleotide site (Nei 1987), and θw, an estimate of 4Neµ, where Ne is the effective population size and µ is the mutation rate per nucleotide (Watterson 1975). Estimates of nucleotide diversity were based on total sequences and silent sites separately using DnaSP version 4.10.8 (Rozas et al. 2003). The observed heterozygosity at each locus was estimated as Ho = number of heterozygous individuals/number of total individuals. An individual was considered heterozygous when its 2 alleles differed at 1 or more sites at any locus.
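The two diversity parameters defined above can be illustrated with a minimal sketch. It is not the DnaSP implementation: it assumes an ungapped alignment of equal-length sequences, treats every column as a site, and uses a hypothetical toy alignment rather than data from this study.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Count alignment columns that carry more than one nucleotide state (S)."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs):
    """pi: mean number of pairwise differences, expressed per site."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / len(pairs) / length


def watterson_theta(seqs):
    """theta_w per site: S / (a1 * L), with a1 = sum_{i=1}^{n-1} 1/i."""
    n, length = len(seqs), len(seqs[0])
    a1 = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / a1 / length


# Hypothetical toy alignment (4 sequences, 10 sites), for illustration only.
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTACGTAC"]
print(segregating_sites(toy), nucleotide_diversity(toy), watterson_theta(toy))
```

Restricting the input columns to silent sites would give the silent-site versions of the same statistics.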

Tests of Neutrality
To test for deviations from the neutral equilibrium model of evolution, we performed several tests using the programs DnaSP and HKA (Hudson–Kreitman–Aguade) (http://lifesci.rutgers.edu/~heylab/). At each locus, Tajima's D (Tajima 1989) and D* and F* of Fu and Li (1993) were calculated at all sites and at silent sites separately. Tajima's D is based on the discrepancy between the mean pairwise differences (π) and Watterson's estimator (θw), whereas D* and F* of Fu and Li rely on the difference between the number of polymorphic sites in external branches (polymorphisms unique to an extant sequence) and the number of polymorphic sites in internal phylogenetic branches (polymorphisms shared by extant sequences). For both tests, negative values indicate an excess of low-frequency polymorphisms, whereas positive values indicate an excess of intermediate-frequency polymorphisms. To assess the neutral prediction of the ratio of polymorphism to divergence across loci, a multilocus HKA test (Hudson et al. 1987) was applied to all 10 loci. Because selection generally affects individual loci during the evolutionary history of a species, whereas demography affects all loci, the multilocus HKA test across unlinked or loosely linked loci can discriminate between selection and population demography during the speciation process. Oryza barthii sequences were used as outgroups for the HKA test and the tests of Fu and Li.
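As an illustration of how Tajima's D contrasts π with θw, the following sketch implements the standard formula of Tajima (1989) from three inputs: the sample size n, the number of segregating sites S, and the mean number of pairwise differences k (per sequence, not per site). The function name and the example values are illustrative assumptions, not output from DnaSP.

```python
import math


def tajimas_d(n, S, k):
    """Tajima's D from sample size n, segregating sites S, and mean pairwise differences k."""
    if S == 0:
        return float("nan")  # D is undefined without segregating sites
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Numerator: difference between the two estimators of theta (per sequence).
    return (k - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))


# Two singleton sites in 4 sequences give k = 1.0 and a negative D
# (excess of low-frequency variants).
print(tajimas_d(4, 2, 1.0))
# Two sites each split 2:2 give k = 4/3 and a positive D
# (excess of intermediate-frequency variants).
print(tajimas_d(4, 2, 4.0 / 3.0))
```

The sign behaviour matches the interpretation given in the text: losing rare variants, as expected after a recent bottleneck, pushes D upward.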

Genealogy and Divergence of Taxa
The genealogical trees of 10 nuclear loci were constructed by PAUP* version 4.0b10 (Swofford 2001), using the Neighbor-Joining (NJ) and maximum parsimony (MP) methods. The NJ analysis was conducted with Kimura's 2-parameter distances (Kimura 1980), and the MP was performed using heuristic searches (Tree Bisection-Reconnection branch swapping; MulTrees option in effect) with 10 random additions of taxa. The stability of internal nodes was assessed by bootstrap analysis with 1,000 replicates. Because of the close relationship of the 4 taxa (Morishima et al. 1992; Lu et al. 2000), we computed fixed differences and shared polymorphisms for pairwise comparisons among the 4 taxa. In addition, to test for monophyly of taxa, constraint trees were made to hold monophyly for the 4 taxa separately and evaluated with a Kishino–Hasegawa test (Kishino and Hasegawa 1989), as implemented in PAUP*.
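The Kimura 2-parameter distance used for the NJ analysis corrects the observed proportions of transitions (P) and transversions (Q) as d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)]. A minimal sketch follows; it is not PAUP* code, and it assumes that sites with gaps or ambiguous bases are simply skipped, which may differ from the options used in the study.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences."""
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# One transition (A<->G) in 10 sites: slightly larger than the raw p-distance of 0.1.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```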

Linkage Disequilibrium
The decay of LD with physical distance was estimated using a nonlinear regression analysis of LD between polymorphic sites versus the distance between sites in base pairs (Remington et al. 2001). The expected value of r² at drift–recombination equilibrium is E(r²) = 1/(1 + ρ), where ρ = 4Nc, c is the recombination rate in morgans between the 2 markers, and N is the effective population size (Hill and Weir 1988). Under the assumption of a low level of mutation and finite sample size, the expectation (Hill and Weir 1988) becomes

E(r^2) = \left[\frac{10+\rho}{(2+\rho)(11+\rho)}\right]\left[1+\frac{(3+\rho)(12+12\rho+\rho^2)}{n(2+\rho)(11+\rho)}\right]

where n is the number of sequences sampled. We pooled our sequences across loci and fit this model individually for the 4 taxa as well as separately for the wild and cultivated samples using PROC NLIN in SAS Ver. 6.12 (SAS Institute, Cary, NC) and SigmaPlot 10.0 (SPSS Inc., Chicago, IL). Although a least-squares estimate of ρ per base pair by nonlinear regression might be biased because of the nonindependence among linked sites and nonequilibrium populations (Remington et al. 2001), such analysis is still useful for investigating the overall rate of decay of LD (Ingvarsson 2005; Liu and Burke 2006). DnaSP was also used to estimate the minimum number of recombination events in the history of each gene sample (RM) (Hudson and Kaplan 1985). Squared allele-frequency correlations for LD (r²) were also calculated with the program SITES (Hey and Wakeley 1997).
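The LD decay model can be sketched as follows, using the Hill and Weir (1988) expectation in the form commonly applied in LD decay studies (for example, Remington et al. 2001), with ρ for a pair of sites taken as the per-base-pair value times the distance between them. The grid-search fit is a simple stand-in for the PROC NLIN regression, chosen only to keep the example dependency-free; the function names, grid, and synthetic data are assumptions.

```python
def expected_r2(rho, n):
    """Hill and Weir (1988) expectation of r^2 for sample size n and rho = 4Nc."""
    return ((10 + rho) / ((2 + rho) * (11 + rho))) * (
        1 + (3 + rho) * (12 + 12 * rho + rho ** 2) / (n * (2 + rho) * (11 + rho))
    )


def fit_rho_per_bp(distances, r2_values, n, grid=None):
    """Least-squares estimate of rho per bp by exhaustive search over a log-spaced grid."""
    if grid is None:
        grid = [10 ** (e / 20.0) for e in range(-100, 1)]  # 1e-5 to 1 per bp

    def sse(rho_bp):
        return sum(
            (r2 - expected_r2(rho_bp * d, n)) ** 2
            for d, r2 in zip(distances, r2_values)
        )

    return min(grid, key=sse)


# Synthetic, noise-free example: pairs of sites 50 to 900 bp apart, n = 30 sequences.
dist = list(range(50, 901, 50))
obs = [expected_r2(0.01 * d, 30) for d in dist]
print(fit_rho_per_bp(dist, obs, 30))
```

With real data the observed r² values are noisy and not independent, so the fitted ρ is best read as a descriptive summary of the decay rate, as the text cautions.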

Coalescent Simulations
We used coalescent simulations to model the impact of a bottleneck on sequence diversity using a modified version of Hudson's ms program (Hudson 2002). We modeled the divergence of 2 populations representing the cultivated (O. sativa) and wild rice (O. rufipogon/O. nivara), with a population bottleneck in cultivated rice. The bottleneck model was previously described in domesticated maize (Eyre-Walker et al. 1998; Tenaillon et al. 2004; Wright et al. 2005) and was modified slightly in this study as follows.

To model the bottleneck, we combined O. rufipogon and O. nivara as a single population as the progenitor of O. sativa because previous and present studies supported treating them as a large gene pool (Morishima 2001; Londo et al. 2006). Because there are 2 alternative hypotheses regarding the origin of the Asian cultivated rice, we modeled the bottleneck under 2 conditions: 1) treating the Asian cultivated rice as a whole, assuming that it originated monophyletically (Oka 1988; Lu et al. 2002), and 2) performing the simulations on the 2 subspecies of the cultivated rice, indica and japonica, separately, assuming that they had separate domestication origins (Second 1982; Yamanaka et al. 2003; Zhu and Ge 2005). The goal of these simulations was to estimate the bottleneck severity that best explained the cultivated rice data.

The assumptions and parameters used for simulation are illustrated in figure 1. We supposed that, at time t2 generations ago, a single ancestral population of size Na experienced an instantaneous shift in size to the bottlenecked population size Nb and that, at time t1 generations ago, the bottlenecked population expanded instantaneously to the present population size Np. The bottleneck was characterized by 2 parameters: d, the duration of the bottleneck in generations (measured as t2 – t1), and Nb, the population size during the bottleneck. We used the bottleneck stringency K (Wright et al. 2005), which is the ratio of Nb to d, to describe the severity of the bottleneck during domestication because an obvious positive correlation between Nb and d was detected in previous maize studies (Tenaillon et al. 2004; Wright et al. 2005). In this study, a range of values of the parameter d (200, 500, 1,000, 1,500, 2,000, and 3,000) was explored. We used 3,000 generations as the maximum duration of a bottleneck associated with rice domestication because archeological evidence suggests that the domestication bottleneck in rice lasted less than 3,000 years (see Discussion). For a given d value, K values ranged from 0.001 to 7 and were used to assess the fit of different models. A total of 150 scenarios were examined for each locus with 10,000 simulations.
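Because K = Nb/d, each (d, K) pair fixes the bottleneck population size as Nb = K × d. The sketch below enumerates such a grid. The durations are those listed above; the K values are placeholders, since the text states only the range explored (0.001 to 7) and not the individual values, and one generation per year is assumed when relating generations to calendar time.

```python
# Bottleneck durations (generations) explored in the text.
DURATIONS = (200, 500, 1000, 1500, 2000, 3000)


def bottleneck_size(k, d):
    """Population size during the bottleneck implied by stringency k and duration d."""
    return k * d


def scenarios(k_values, durations=DURATIONS):
    """All (d, K, Nb) combinations for the supplied stringency values."""
    return [(d, k, bottleneck_size(k, d)) for d in durations for k in k_values]


# Placeholder K values spanning the stated range; not the grid used in the study.
example_k = (0.001, 0.5, 7)
for d, k, nb in scenarios(example_k):
    print(f"d={d:>5}  K={k:<6}  Nb={nb:g}")
```

Under these assumptions, a founding population of 1,500 individuals over 3,000 generations corresponds to K = 0.5, and the same K at d = 200 implies only 100 individuals.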


FIG. 1.— Diagrammatic representation of the coalescent model used in simulation. See text for details.

 
For each simulation at each locus, we started with silent θw, recombination rate, and Tajima's D calculated for O. rufipogon/O. nivara, and the cultivated rice data were used to assess the fit of statistics. Then we calculated values of S, θw, π, and Tajima's D for wild rice and cultivated rice, respectively. A simulation was accepted if the number of segregating sites (S) fell within 20% of the observed wild rice data (Weiss and von Haeseler 1998). To assess fit, we defined levels of acceptance corresponding to a range of 20% for each of 4 summary statistics (Srice, θw-rice, πrice, and Drice). For each of the 150 explored bottleneck scenarios, we calculated the approximate likelihood for locus i as the proportion of simulations among 10,000 that fit the data (Weiss and von Haeseler 1998). Finally, after calculating likelihoods on a per-locus basis, we calculated the multilocus likelihood by multiplying across loci. This approach made the implicit assumption that loci are independent (Tenaillon et al. 2004). Furthermore, a recent study on human population size change indicated that combining multiple statistics might result in a significant reduction of the compatible parameter space and thus effectively improve the power to detect bottlenecks (Voight et al. 2005). Therefore, we combined the 4 statistics, S, θw, π, and Tajima's D, to better measure the fit to coalescent simulations.
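The acceptance scheme described above can be sketched as a rejection-style approximate likelihood: for one scenario and one locus, the likelihood is the fraction of simulated data sets whose summary statistics all fall within 20% of the observed values, and the multilocus value is the product across loci (a sum on the log scale). The data structures, function names, and toy numbers below are assumptions for illustration; the simulated statistics would come from the coalescent simulations, which are not reproduced here.

```python
import math


def within_tolerance(simulated, observed, tol=0.20):
    """True if the simulated value lies within tol * |observed| of the observed value."""
    return abs(simulated - observed) <= tol * abs(observed)


def locus_likelihood(sim_stats, observed, tol=0.20):
    """Proportion of simulations matching every observed summary statistic."""
    hits = sum(
        all(within_tolerance(sim[name], observed[name], tol) for name in observed)
        for sim in sim_stats
    )
    return hits / len(sim_stats)


def multilocus_log_likelihood(per_locus):
    """Sum of log likelihoods across loci, assuming the loci are independent."""
    if any(value == 0 for value in per_locus):
        return float("-inf")
    return sum(math.log(value) for value in per_locus)


# Toy example with two statistics and four simulated replicates (hypothetical values).
observed = {"S": 10, "pi": 0.002}
sims = [
    {"S": 10, "pi": 0.0020},
    {"S": 11, "pi": 0.0022},
    {"S": 13, "pi": 0.0020},  # S outside the 20% range
    {"S": 9, "pi": 0.0010},   # pi outside the 20% range
]
print(locus_likelihood(sims, observed))
```

Working on the log scale avoids numerical underflow when many small per-locus likelihoods are multiplied; the scenario with the highest multilocus value is the best-fitting one.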


    Results
Nucleotide Diversity
Sequences of the 10 unlinked loci were obtained from 60 accessions, representing 2 subspecies of O. sativa and its wild relatives. The length of the aligned sequence for each locus varied between 544 bp and 1,057 bp, with a total length of 8,079 bp. Of the 10 loci, 8 contained both coding and noncoding sites and the remaining 2 were intron (Adh1) and 5'-flanking (CatA) regions. The schematic diagrams of the 10 genes are provided in supplementary figure S1 (Supplementary Material online). Both indel and single nucleotide polymorphisms (SNPs) were present in the sequenced regions. A total of 69 indel polymorphisms were found exclusively in introns or flanking regions and excluded from all further analyses. SNPs were substantially more numerous than indels. A total of 357 SNPs, and thus an average of 1 SNP every 23 nucleotides, were found across the 4 taxa, with the number of SNPs being 53 in indica, 30 in japonica, 263 in O. rufipogon, and 199 in O. nivara.

Standard statistics of sequence variation for each locus are summarized in table 3, including the estimates of nucleotide variation in different regions at individual loci. As expected, due to strong functional constraint, the levels of nucleotide variation at coding regions were all substantially lower than those at noncoding regions (data not shown). Levels of nucleotide diversity were heterogeneous among loci, with GBSSII being the least variable locus (mean θsil = 0.0036) and Waxy the most variable (mean θsil = 0.0161). The diversity pattern of the entire region was similar to that of the silent sites at each locus. To determine whether silent diversity and interspecific divergence are correlated with each other as expected under neutral evolution, we calculated the correlation between θsil and Ksil and found a significant positive correlation for O. rufipogon (P < 0.01) and O. nivara (P < 0.05), but no significant correlation for the 2 subspecies of the cultivated rice (P > 0.55). The positive correlations between θsil and Ksil in the 2 wild species conformed to the pattern expected under neutrality, suggesting that neutral mutation rates (and/or selective constraint) varied among loci. The absence of a significant correlation between θsil and Ksil in indica and japonica, in contrast, suggested a deviation from a strict neutral equilibrium model, probably due in part to demographic processes associated with domestication.



 
Table 3 Summary of Nucleotide Polymorphisms and Neutrality Tests

 
At the taxon level, polymorphisms in the wild species (θsil = 0.0077 for O. nivara and θsil = 0.0095 for O. rufipogon) were significantly higher than those in the cultivated rice (θsil = 0.0021 for indica and θsil = 0.0011 for japonica). At both the entire region and silent sites, nucleotide diversity in indica rice was roughly 2 times that of the japonica rice, whereas the 2 wild species maintained comparable levels of diversity that were 3.7-fold to 8.6-fold higher than those in the cultivated rice (tables 3 and 4). It is obvious from table 3 that diversity in the wild species is much higher than that in the cultivated rice at all loci. When the data of indica and japonica were pooled, the cultivars maintained 5.7% (Cbp1) to 50% (GBSSII) as much diversity across loci compared with the wild species (table 4). Over all 10 loci, domesticated rice has about 22% (θsil) of the variability found in its progenitor (table 4).



 
Table 4 Summary of Nucleotide Polymorphisms and Neutrality Tests on the Cultivated and Wild Rice

 
Although no heterozygote was detected in the cultivated rice sample, different numbers of heterozygotes were observed for the 2 wild species. Across 10 loci, 12 out of 18 (66.7%) O. rufipogon individuals were heterozygous, whereas 3 of 13 (23.1%) O. nivara individuals were heterozygous. The frequency of heterozygotes in the 2 species was consistent with their mating system because the outcrossing rate of O. rufipogon was estimated to be 30–50%, whereas that of O. nivara was from 5% to 25% (Morishima et al. 1984; Barbier 1989).

Neutrality Tests
To examine the fit of nucleotide polymorphism data to the neutral equilibrium model, we first performed the tests of Tajima's D (Tajima 1989) and D* and F* of Fu and Li (1993) for each gene. No significant Tajima's D value or D* and F* values of Fu and Li were observed at any locus except for O. rufipogon at Adh1, where the neutral model was rejected at P < 0.05 (table 3). It is interesting to note that 9 of 10 loci for O. rufipogon and 8 of 10 loci for O. nivara had negative Tajima's D values, whereas the 2 cultivated subspecies had positive Tajima's D at most loci (tables 3 and 4), perhaps as a consequence of population expansion in the wild species or a frequency distribution skewed toward intermediate-frequency variants in the cultivated species. Theoretically, Tajima's D should be higher in populations that have experienced a recent bottleneck because of the preferential loss of low-frequency variants (Tenaillon et al. 2001). This is obviously true for cultivated rice, where the D values were higher than those in the wild species at all 10 loci (tables 3 and 4).

We also performed the multilocus HKA test (Hudson et al. 1987) to examine whether levels of polymorphism and divergence across loci would be correlated, as expected under the neutral model of molecular evolution. No significant departure from the equilibrium model was detected between any taxon pair (japonica–indica, χ² = 1.87, P = 0.993; japonica–nivara, χ² = 1.93, P = 0.993; japonica–rufipogon, χ² = 3.18, P = 0.957; indica–nivara, χ² = 3.16, P = 0.958; indica–rufipogon, χ² = 3.13, P = 0.959; rufipogon–nivara, χ² = 0.79, P = 0.999). Thus, there appears to be no evidence for the action of past selection at the 10 loci as a whole. Because the HKA test statistic for closely related species is not expected to follow the χ² distribution (Machado et al. 2002), we compared the test statistic with a distribution generated from 10,000 coalescent simulations (Hilton et al. 1994). Using 1 sequence of O. barthii as an outgroup at each locus, HKA tests across loci were applied to each of the 4 taxa. The multilocus test was not significant for the 2 wild species (O. rufipogon–O. barthii, χ² = 0.61, P = 0.999; O. nivara–O. barthii, χ² = 1.70, P = 0.995). In contrast, the HKA test rejected the null hypothesis of proportionality between polymorphism and divergence in the japonica/O. barthii contrast (χ² = 17.86, P = 0.037), and the test value was close to significance in the indica/O. barthii contrast (χ² = 15.44, P = 0.080). In both cases, CatA contributed greatly to the multilocus HKA statistics for cultivated rice, whereas Os0053 contributed to those for indica rice and O. nivara (fig. 2). The rejection of a multilocus HKA test can be explained by both selection and demographic history (Wright and Gaut 2005). Removal of these single loci from the multilocus HKA test caused the test statistic to drop below the critical value in all taxa.


FIG. 2. Summary of multilocus HKA tests (Hudson et al. 1987). In each test, polymorphism within individual species is compared with divergence from Oryza barthii sequences. Solid symbols represent the contributions to the overall χ² statistic that resulted from polymorphisms, and open symbols show contributions caused by divergence. (a) (▲) and (△), indica; (▼) and (▽), japonica. (b) (●) and (○), Oryza rufipogon; (■) and (□), Oryza nivara. Symbols above the horizontal line indicate that the observed deviations are greater than the simulated, and those below the line indicate that the observed deviations are less than the simulated.

 
Divergence of Species and Genealogical Patterns
To assess the divergence and genetic relationships among taxa, we first calculated the numbers of shared polymorphisms and fixed differences between taxon pairs. A fixed difference is a nucleotide site at which all sampled sequences from one taxon differ from all sequences from another taxon, whereas a shared polymorphism occurs when 2 taxa have the same 2 bases segregating at the same site (Hilton et al. 1994). Shared polymorphisms reveal a history of polymorphism that has not yet been erased by genetic drift and thus reflect either a short divergence time between taxa or historically large population sizes (Hilton et al. 1994). As shown in table 5, no fixed differences were detected in any pairwise comparison at any of the 10 loci, except for the japonica/nivara contrast, where 3 and 1 fixed differences were observed at CatA and Adh1, respectively. A large number of shared polymorphisms were detected in all contrasts over the 10 loci (table 5), indicating a close affinity or hybridization/introgression among these 4 taxa. It is not surprising that no fixed difference was observed between cultivated and wild rice because rice was domesticated recently, probably around 10,000 years ago (Normile 1997). Similarly, no fixed differences were observed between wild and cultivated sorghum (White et al. 2004). Note that the highest number of shared polymorphisms (104) was found between the 2 wild species (O. rufipogon and O. nivara), whereas the lowest number (28) occurred between the 2 subspecies of cultivated rice (indica and japonica). These observations agree with the Fst analyses, in which the lowest genetic differentiation was found between O. rufipogon and O. nivara (Fst = 0.033) and the highest between indica and japonica (Fst = 0.190).
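The two site categories defined above can be counted with a short routine. The sketch below is only an illustration under stated assumptions: the function name `count_fixed_and_shared` is hypothetical, both taxa are given as lists of sequences from the same alignment, and gaps or missing data are not handled.

```python
def count_fixed_and_shared(taxon_a, taxon_b):
    """Count fixed differences and shared polymorphisms between two taxa.

    taxon_a, taxon_b: lists of aligned sequences of equal length.
    Returns a tuple (fixed_differences, shared_polymorphisms).
    """
    fixed = 0
    shared = 0
    for col_a, col_b in zip(zip(*taxon_a), zip(*taxon_b)):
        bases_a, bases_b = set(col_a), set(col_b)
        if bases_a.isdisjoint(bases_b):
            # Every sequence of one taxon differs from every sequence
            # of the other taxon at this site.
            fixed += 1
        elif len(bases_a) == 2 and bases_a == bases_b:
            # The same 2 bases are segregating in both taxa.
            shared += 1
    return fixed, shared


# Site 1 is a fixed difference (A vs. T); site 4 is a shared polymorphism.
example = count_fixed_and_shared(["ACGT", "ACGA"], ["TCGT", "TCGA"])
```

Sites that are polymorphic in only one taxon fall into neither category and are ignored here.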


Table 5 Fixed Differences and Shared Polymorphisms at the 10 Loci

 
We also constructed genealogical trees for each locus using both NJ and MP methods. On almost all trees, sequences from the same taxon did not form a monophyletic group, despite a tendency for sequences to cluster by taxon designation in the combined trees (supplementary fig. S2, Supplementary Material online). Some accessions occupied different positions on individual gene trees (data not shown), suggesting that coalescence for alleles at most loci occurred in the common ancestor of these taxa. This lack of concordance among the 10 gene trees might result from lineage sorting because the 4 taxa are closely related and diverged very recently from a common ancestor. Extensive gene flow through hybridization and introgression might also contribute to the inconsistent phylogenetic relationships among taxa. To test for the monophyly of the 4 taxa, we compared tree topologies with and without taxon constraints using the Kishino–Hasegawa test. Constraining each taxon to be monophyletic resulted in a significantly lower likelihood than that of the unconstrained trees (P < 0.001), rejecting the monophyly of each of the 4 taxa (supplementary table S1, Supplementary Material online).

Coalescent Simulations and Severity of Bottleneck during Rice Domestication
Because the various tests did not reject neutrality across the 10 loci (table 3), it is appropriate, as a first approximation, to use simulations of a neutral coalescent process to estimate population parameters during rice domestication. In the simulations, we assumed a reduction in population size associated with the initial domestication of rice and a subsequent increase in population size after rice became widely cultivated. We also assumed that the domestication bottleneck began 10,000 generations ago because the earliest archaeological evidence indicates that rice domestication occurred around 10,000 BP (Khush 1997; Normile 1997). As indicated above, no significant genetic differentiation was observed between the 2 wild species (O. rufipogon and O. nivara) with these data, and thus their data sets were combined into a single, large gene pool ancestral to domesticated rice.

The simulation results revealed an obvious positive correlation between Nb and d at each locus (supplementary fig. S3, Supplementary Material online), as expected, so we estimated the parameter K = Nb/d. Our estimates of K varied among summary statistics (S, θw, π, and Tajima's D) (supplementary fig. S4, Supplementary Material online). The summary statistics S, θw, and π produced an appreciable likelihood peak, but for several loci the curves based on Tajima's D did not reach a maximum within the parameter space (supplementary fig. S4, Supplementary Material online). This result might reflect different evolutionary patterns in different regions, the simulation might not mimic the polymorphism distribution of wild rice adequately, or single summary statistics may be inadequate. To make fuller use of multiple aspects of the data and to narrow the confidence regions (Voight et al. 2005), we used the combination of D, S, θw, and π in the subsequent analyses of bottleneck models. Based on this analysis, the K value was lower in japonica rice (0.2) than in indica rice (0.5), suggesting a more severe bottleneck in japonica rice (fig. 3). When the 2 subspecies of O. sativa were considered as a whole, the severity parameter K was 0.4 (fig. 3). Because the multilocus HKA statistics suggested that 2 loci (CatA and Os0053) may not evolve neutrally in cultivated rice, we repeated the above simulations after removing these 2 loci. Interestingly, the simulation results changed very little (data not shown). With the estimated K values, we can calculate the number of individuals involved in the domestication event. For example, if the domestication bottleneck lasted 200 generations, we estimate that the domestication of rice was based on 40–100 individuals of wild rice. For bottlenecks of 2,000 and 3,000 generations, we estimate that cultivated rice was based on a wild rice population of 400–1,000 and 600–1,500 individuals, respectively (supplementary table S2, Supplementary Material online).
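The founder numbers quoted above follow directly from the definition K = Nb/d, which rearranges to Nb = K × d. The short sketch below reproduces that arithmetic for the japonica (K = 0.2) and indica (K = 0.5) estimates. The function name `bottleneck_size` is a hypothetical helper introduced for this example.

```python
def bottleneck_size(k, duration_generations):
    """Bottleneck population size Nb implied by K = Nb / d."""
    return round(k * duration_generations)


K_JAPONICA = 0.2
K_INDICA = 0.5

# Range of founding individuals for each assumed bottleneck duration d.
founder_ranges = {
    d: (bottleneck_size(K_JAPONICA, d), bottleneck_size(K_INDICA, d))
    for d in (200, 2000, 3000)
}
```

Because K fixes only the ratio of Nb to d, the same severity can reflect a short bottleneck with few individuals or a longer one with proportionally more.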


FIG. 3. Diagram of maximum likelihood estimates for population bottleneck during rice domestication. The x axis indicates the severity parameter K, and slashes represent the omitted region with K values from 2 to 7, where the likelihood values are close to 0. The y axis represents the likelihood value based on the combination of D, S, θw, and π.

 
Linkage Disequilibrium
Figure 4 shows r², the measure of LD, as a function of distance for comparisons within loci, pooled over the entire data set. The regression curves fitted to the cultivated indica and japonica both lay above those of the 2 wild species. The observed LD in cultivated rice extended on average much further than that in the 2 wild species, where the expected values of r² dropped to negligible levels (i.e., <0.1) within 400 bp. In particular, japonica rice showed extensive LD over distances that approach the length of the sequenced region (fig. 4a). Estimation of population recombination parameters showed that the minimum number of recombination events ranged from 0 to 7 across loci in the wild species, but no recombination events were detected in the 2 cultivated subspecies (table 3).
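For readers unfamiliar with the LD measure plotted in figure 4, the sketch below computes r² between 2 biallelic sites from phased haplotypes, using the standard definition r² = D² / [pA(1 − pA) pB(1 − pB)]. It does not reproduce the regression fitted in this study. The function name `r_squared` and the haplotype-string input are assumptions made for this example.

```python
def r_squared(haplotypes, site_i, site_j):
    """Squared allele-frequency correlation between two biallelic sites.

    haplotypes: list of phased haplotype strings of equal length.
    site_i, site_j: zero-based positions of the two sites.
    """
    n = len(haplotypes)
    # Use the alleles carried by the first haplotype as reference alleles.
    allele_a = haplotypes[0][site_i]
    allele_b = haplotypes[0][site_j]

    p_a = sum(h[site_i] == allele_a for h in haplotypes) / n
    p_b = sum(h[site_j] == allele_b for h in haplotypes) / n
    p_ab = sum(
        h[site_i] == allele_a and h[site_j] == allele_b for h in haplotypes
    ) / n

    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both sites must be polymorphic")

    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    return d * d / denominator


# Complete association between sites versus complete independence.
ld_complete = r_squared(["AC", "AC", "TG", "TG"], 0, 1)
ld_absent = r_squared(["AC", "AG", "TC", "TG"], 0, 1)
```

Plotting such pairwise values against the physical distance between sites gives the kind of decay pattern summarized in figure 4.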


FIG. 4. Plots of the squared allele-frequency correlations r² (y axis) as a function of physical distance (x axis) between sites for 10 genes in the wild and cultivated rice. The curved lines depict the fitted nonlinear regressions of the mutation–recombination–drift model based on the equation in Materials and Methods. (a) Four taxa separately; (b) indica and japonica are combined as the cultivated rice, and Oryza rufipogon and Oryza nivara are combined as the wild rice.

 
When data were pooled for the cultivated and wild samples separately, the average r² differed significantly between the cultivated and wild species (P < 10⁻¹² with a t-test), with values of 0.368 in the cultivated rice and 0.155 in the wild species (table 4). As can be seen from figure 4b, the average r² in the wild species drops quickly over the first 200 bp, to 0.18 within 100 bp and 0.12 within 200 bp, whereas the value in the cultivated rice declines much more slowly, to only 0.38 within 100 bp, and remains >0.15 at 1,000 bp. The higher level of LD in the cultivated rice is consistent with an increase in LD due to domestication.


    Discussion
 
Diversity and Differentiation between O. rufipogon and O. nivara
Although O. rufipogon and O. nivara have long been the subject of extensive studies, nucleotide variation in species-wide samples of these 2 species has not been investigated with sequence data from multiple nuclear loci; only a few studies have focused on 1 or 2 genes (Barbier et al. 1991; Yoshida et al. 2004; Yoshida and Miyashita 2005). A literature survey of multiple gene studies of wild plant species has revealed a wide range of nucleotide variation, from θsil = 0.0247 (Arabidopsis lyrata ssp. petraea, Ramos-Onsins et al. 2004; Zea mays ssp. parviglumis, Wright and Gaut 2005) to θsil = 0.0040 (Arabidopsis lyrata ssp. lyrata, Wright and Gaut 2005). The nucleotide diversity of the wild relatives of many crops fell within this range, such as wild relative of maize (Zea mays ssp. mays, θsil = 0.0149; Zea perennis, θsil = 0.0130) (Tiffin and Gaut 2001; Wright and Gaut 2005), wild barley (Hordeum vulgare, θsil = 0.0109) (Morrell et al. 2003), wild sunflower (Helianthus annuus, θsil = 0.0144) (Liu and Burke 2006), and Populus tremula (θsil = 0.0167) (Ingvarsson 2005). The present study showed that O. rufipogon and O. nivara possess medium to low levels of nucleotide variation (θsil = 0.0095 for O. rufipogon and θsil = 0.0077 for O. nivara) compared with the relatives of other crop plants. However, relative to other wild species in Oryza, nucleotide diversity in these 2 species is much higher: our recent multiple gene study found that θsil ranged from 0.0038 to 0.0057 in 3 Oryza species with the C-genome (Oryza officinalis, Oryza eichingeri, and Oryza rhizomatis) (Zhang and Ge 2007). A higher level of genetic variation in O. rufipogon than in other Oryza species was also reported using allozymes, RFLPs, and SSRs (Gao et al. 2001; Gao and Zhang 2005; Bautista et al. 2006).

It is not unexpected that slightly higher nucleotide diversity is observed in O. rufipogon than in O. nivara because the former has a wider distribution and is a largely outcrossing species, whereas O. nivara is primarily inbreeding (Vaughan 1989). Over all loci, we found that 66.7% of O. rufipogon individuals are heterozygous, whereas 23.1% of O. nivara individuals are heterozygous (table 3), in general agreement with previous estimates of outcrossing rates of 5–56% for the 2 species (Morishima et al. 1984; Barbier 1989). Theoretically, genetic variation in self-pollinating species would be reduced by decreasing the effective population size and reducing the effective rate of recombination (Charlesworth 2003). The low nucleotide variation in O. nivara might be facilitated by self-fertilization. However, empirical studies showed that the effects of selfing on population genetics depend critically on whether the levels of variation are compared species wide or within populations (Savolainen et al. 2000; Wright et al. 2003). Because our samples were collected from disparate populations, it is still difficult to evaluate the impact of mating system on genetic variation in these species.

The taxonomic treatment of, and genetic relationships between, O. rufipogon and O. nivara have been controversial for decades (Vaughan 1989; Morishima 2001; Zhu and Ge 2005). Although 2 separate species names are still used in many cases (e.g., Vaughan 1989; Ge et al. 2001; Yamanaka et al. 2003), studies based on morphological analyses and artificial hybridization (Oka 1988; Lu et al. 2000) as well as recent molecular data (Lu et al. 2002; Ren et al. 2003; Zhu and Ge 2005) did not reveal either a reproductive barrier or significant genetic differentiation between them. There is therefore a tendency to treat these 2 species as 2 different ecotypes or subspecies of a single species (O. rufipogon sensu lato) (Morishima 2001; Cheng et al. 2003; Vaughan et al. 2003) and, in some cases, to consider them as a single large gene pool (e.g., Londo et al. 2006). The present phylogenetic analyses based on 10 genes did not show apparent grouping of the 4 taxa, in agreement with previous results based on multiple gene phylogeny (Zhu and Ge 2005). Moreover, the Kishino–Hasegawa test demonstrated that the constrained species tree was significantly worse than the unconstrained tree (supplementary table S1, Supplementary Material online), rejecting the hypothesis that these taxa are monophyletic groups. This result is further corroborated by the abundant shared polymorphisms, the absence of fixed differences between them, and low differentiation as measured by Fst. In brief, the multiple gene sequence data support treating O. rufipogon and O. nivara as 2 ecotypes of a single large species rather than as 2 separate species (Oka 1988; Barbier et al. 1991; Morishima et al. 1992; Lu et al. 2002; Zhu and Ge 2005).

Severe Bottleneck during Domestication of O. sativa
Considering that the wild relatives of O. sativa harbor average levels of nucleotide variation, it is particularly noteworthy that the nucleotide diversity of O. sativa is much lower than that of other major crops including both inbreeding and outbreeding species. For example, the species-wide silent diversity (θsil) based on multilocus sequences was 0.0149 in maize (Wright and Gaut 2005), 0.0109 in barley (Morrell et al. 2003), and 0.0034 in sorghum (Hamblin et al. 2004). In this study, we found that the nucleotide variation of O. sativa was as low as θsil = 0.0024, approximately 70% of that of sorghum (Hamblin et al. 2004), one-third of that of cultivated sunflower (θsil = 0.0072, Liu and Burke 2006), and one-sixth of that of maize (Wright and Gaut 2005). The value is even lower if we consider the 2 subspecies separately (table 3).
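The proportions quoted above can be checked from the reported θsil values. The sketch below simply divides the rice estimate by each of the other crop estimates. The dictionary `THETA_SIL` and the function `rice_fraction_of` are hypothetical names introduced for this example, and the values are those cited in the text.

```python
# Species-wide silent diversity (theta_sil) values as cited in the text.
THETA_SIL = {
    "rice": 0.0024,
    "sorghum": 0.0034,
    "sunflower": 0.0072,
    "maize": 0.0149,
}


def rice_fraction_of(crop):
    """Rice silent diversity expressed as a fraction of another crop's."""
    return THETA_SIL["rice"] / THETA_SIL[crop]


fractions = {
    crop: rice_fraction_of(crop) for crop in ("sorghum", "sunflower", "maize")
}
```

The resulting fractions are about 0.71, 0.33, and 0.16, in line with the approximate 70%, one-third, and one-sixth stated above.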

Although previous studies using different marker systems found reduced diversity in cultivated rice relative to its wild progenitors (Oka 1988; Sun et al. 2001; Londo et al. 2006), the present study indicated that only 20% and 10% of the diversity in the wild species was retained in the subspecies indica and japonica of cultivated rice, respectively. This is in sharp contrast to several studies of other crops in which genome-wide levels of diversity were only slightly lower in the crops than in their wild progenitors. For instance, maize maintained approximately 80% of the diversity found in its wild ancestor as measured by nonselected genes (Wright and Gaut 2005), and cultivated sunflower has retained 40–50% of the diversity present in the wild (Liu and Burke 2006). A previous literature survey also indicated that approximately two-thirds of genetic diversity has been maintained in major crops relative to their wild progenitors, although comparisons between species are difficult because of differences in experimental systems (Buckler et al. 2001).

Why does rice differ from other domesticated plants with respect to the loss of diversity during domestication? There are 4 possible reasons why rice appears to retain an aberrantly low level of genetic diversity relative to its wild ancestors. The first is that the domestication bottleneck may have been more severe. Our simulations support this conjecture. For example, Wright et al. (2005) obtained a K value of 2.45 in maize. In our case, the K value was estimated to be 0.2 (japonica) and 0.5 (indica) assuming separate domestications, or 0.4 (japonica + indica) assuming a monophyletic origin of the 2 cultivated subspecies. If the initial domestication event occurred over a 1,000-year period, the founding population would have contained 200–500 individuals for rice (supplementary table S2, Supplementary Material online) but up to 4,650 individuals for maize (Eyre-Walker et al. 1998). Because archaeological evidence suggests that the domestication of rice occurred at least 10,000 years ago and that rice spread to other regions in South and Southeast Asia around 7,000 years ago (Khush 1997), the domestication bottleneck in rice lasted at most 3,000 years. Therefore, the sequence diversity currently found in the rice genome can be explained by a founding population of at most 1,500 individuals (supplementary table S2, Supplementary Material online).

A second explanation for the low retention of genetic diversity in domesticated rice could be sampling. If the domestication of rice was a geographically local event, based on a distinct and genetically limited subpopulation of the wild ancestor, then our use of a global sample of O. rufipogon and O. nivara could be misleading. Unfortunately, it is difficult to assess this potential problem fully, partly because there is not yet enough information about population structure in the wild relatives of rice and partly because extensive gene flow occurs between these taxa (Oka 1988). A third explanation is that the shift of mating system, from commonly outbreeding wild rice to almost exclusively inbreeding cultivated rice, may have facilitated small population sizes in the incipient domesticate. However, empirical studies showed that the mating system primarily affects diversity at the population rather than the species level (Savolainen et al. 2000; Wright et al. 2003), making the selfing explanation implausible. At the least, selfing does not appear to be a complete explanation for the huge diversity reduction in cultivated rice because sorghum is also selfing, yet cultivated sorghum retains 60–70% of the diversity of its wild progenitor (Hamblin et al. 2004). Finally, selection reduces diversity, and the effects of selection can extend long distances along chromosomes in selfing taxa (Nordborg and Innan 2002; Olsen et al. 2006). However, the various tests in this study did not detect a signature of selection across the 10 loci (tables 3 and 4), although most methods generally have low statistical power (Wright and Gaut 2005). In addition, selection may lead to both reduced diversity (e.g., directional selection) and elevated diversity (e.g., balancing selection) (Hamblin et al. 2004; Wright and Gaut 2005). Consequently, selection might not be the main reason for the severe reduction of diversity in cultivated rice, although effects of selection cannot be excluded entirely.

Linkage Disequilibrium
The extent of LD differs both among taxa and across loci (table 3). Of the 4 taxa, the lowest LD was observed in the perennial O. rufipogon, where r² drops to a negligible level within 400 bp, and the highest in japonica rice, where LD extends across the entire sequenced region (~900 bp) (fig. 4a). Overall, LD is higher in cultivated rice, in particular japonica rice (fig. 4a and b). Extensive LD has also been observed previously in both Asian and African cultivated rice (Garris et al. 2003; Semon et al. 2005). Theoretically, LD is affected by various evolutionary and demographic factors, including selection, recombination, population admixture, inbreeding, and bottlenecks (Nordborg and Tavare 2002). For instance, an investigation of a disease resistance gene (Xa5) in Asian cultivated rice suggested that LD could extend to ~100 kb (Garris et al. 2003). At the tb1 locus of maize and the Wx locus of O. sativa, extended LD was found across the entire genomic regions upstream and downstream of the genes because of selective sweeps (Clark et al. 2004; Olsen et al. 2006). However, selection and population admixture appear not to be the main reasons for the higher LD observed in cultivated rice because neutrality tests were not significant across loci and both subspecies, indica and japonica, maintained high levels of LD (fig. 4a).

Higher LD in selfers than in outcrossers is expected and has been observed in many species. For example, significant LD persisted for 250 kb in selfing Arabidopsis thaliana (Nordborg and Innan 2002), whereas LD declined to negligible levels in <1 kb in the outcrossing maize (Remington et al. 2001) and extended only a few hundred base pairs in European aspen, an outbreeding species (Ingvarsson 2005). Recently, Liu and Burke (2006) found that LD decayed to negligible levels (r² < 0.10) within 200 bp in the obligate outcrossing wild sunflower but reached the same levels only within ~1,100 bp in the self-compatible cultivated sunflower. Mating system can partly explain the higher level of LD in cultivated rice because O. sativa is a predominantly inbreeding species. Higher LD in the annual O. nivara than in the perennial O. rufipogon is also consistent with the higher outcrossing rate in O. rufipogon than in O. nivara (Morishima et al. 1984; Barbier 1989).

However, the effects of mating system and population bottleneck on LD are intermingled in many cases, particularly for crop plants, because the transition from outcrossing to self-fertilization and the population bottleneck often take place simultaneously during crop domestication (e.g., Hamblin et al. 2004; Liu and Burke 2006). Small effective population size is most likely the main force responsible for the higher LD and lower recombination in cultivated rice because only 10–20% of the genetic diversity in the wild rice gene pool was retained in cultivated rice. This suggestion is also supported by our computer simulations, which estimated a very small effective population size in O. sativa due to a severe bottleneck. Higher LD in japonica rice is consistent with the more severe population bottleneck during its domestication. The findings of this study imply that LD mapping by genome scans may not be feasible in wild rice because of the high density of markers needed. The levels of LD in cultivated rice, however, are sufficient to establish whether polymorphisms within candidate genes are associated with specific phenotypic variants.


    Supplementary Material
 
Supplementary figures S1–S4 and tables S1 and S2 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).


    Acknowledgements
 
We thank Lin-Bin Zhang for helpful discussion on the manuscript and Zhe Li, Shu-Qi Zhao, Ge Gao, An-Yuan Guo, and other members of Ge's group for technical assistance. We are grateful to Duncan Vaughan, Tao Sang, and 2 anonymous reviewers for their valuable comments. We also thank the International Rice Research Institute (Los Banos, Philippines) for providing leaf and seed samples and Susan McCouch (Cornell University) for providing seeds of some rice cultivars. This work was supported by the National Natural Science Foundation of China (30121003, 30430030, and 90408015), the CAS Innovation Grant, the National Key Basic Research Program of China (2003CB715900), and Postdoctor Science Foundation of China (20060390012).


    Footnotes
 
William Martin, Associate Editor


    References
 

    Barbier P. (1989) Genetic variation and ecotypic differentiation in the wild rice species Oryza rufipogon. II. Influence of the mating system and life-history traits on the genetic structure of populations. Jpn J Genet 64:273–285.

    Barbier P, Morishima H, Ishihama A. (1991) Phylogenetic relationships of annual and perennial wild rice: probing by direct DNA sequencing. Theor Appl Genet 81:693–702.[Web of Science]

    Bautista N, Vaughan D, Jayasuriya A, Liyanage A, Kaga A, Tomooka N. (2006) Genetic diversity in AA and CC genome Oryza species in southern south Asia. Genet Resour Crop Evol 53:631–640.[CrossRef]

    Buckler ESI, Thornsberry JM, Kresovich S. (2001) Molecular diversity, structure and domestication of grasses. Genet Res 77:213–218.[CrossRef][Web of Science][Medline]

    Charlesworth D. (2003) Effects of inbreeding on the genetic diversity of populations. Phil Trans R Soc Lond B Biol Sci 358:1051–1070.[Abstract/Free Full Text]

    Cheng C, Motohashi R, Tsuchimoto S, Fukuta Y, Ohtsubo H, Ohtsubo E. (2003) Polyphyletic origin of cultivated rice: based on the interspersion pattern of SINEs. Mol Biol Evol 20:67–75.

    Clark RM, Linton E, Messing J, Doebley JF. (2004) Pattern of diversity in the genomic region near the maize domestication gene tb1. Proc Natl Acad Sci USA 101:700–707.[Abstract/Free Full Text]

    Darwin C. (1859) On the origin of species by means of natural selection. (John Murray, London).

    Eyre-Walker A, Gaut RL, Hilton H, Feldman DL, Gaut BS. (1998) Investigation of the bottleneck leading to the domestication of maize. Proc Natl Acad Sci USA 95:4441–4446.[Abstract/Free Full Text]

    Fu YX and Li WH. (1993) Statistical tests of neutrality of mutations. Genetics 133:693–709.[Abstract]

    Gao LZ, Ge S, Hong DY. (2001) High levels of genetic differentiation of Oryza officinalis Wall. et Watt. from China. J Hered 92:511–516.[Abstract/Free Full Text]

    Gao LZ and Zhang CH. (2005) Comparisons of microsatellite variability and population genetic structure of two endangered wild rice species, Oryza rufipogon and O. officinalis, and their conservation implications. Biodivers Conserv 14:1663–1679.[CrossRef]

    Garris AJ, McCouch SR, Kresovich S. (2003) Population structure and its effect on haplotype diversity and linkage disequilibrium surrounding the xa5 locus of rice (Oryza sativa L.). Genetics 165:759–769.[Abstract/Free Full Text]

    Garris AJ, Tai TH, Coburn J, Kresovich S, McCouch S. (2005) Genetic structure and diversity in Oryza sativa L. Genetics 169:1631–1638.[Abstract/Free Full Text]

    Ge S, Sang T, Lu BR, Hong DY. (1999) Phylogeny of rice genomes with emphasis on origins of allotetraploid species. Proc Natl Acad Sci USA 96:14400–14405.[Abstract/Free Full Text]

    Ge S, Sang T, Lu BR, Hong DY. (2001) Phylogeny of the genus Oryza as revealed by molecular approaches. In Khush GS, Brar DS, Hardy B (Eds.). Rice genetics IV. Proceedings of the Fourth International Rice Genetics Symposium (IRRI, Los Banos (Philippines)) pp. 89–105.

    Hall TA. (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser 41:95–98.

    Hamblin MT, Mitchell SE, White GM, Gallego J, Kukatla R, Wing RA, Paterson AH, Kresovich S. (2004) Comparative population genetics of the panicoid grasses: sequence polymorphism, linkage disequilibrium and selection in a diverse sample of Sorghum bicolor. Genetics 167:471–483.[Abstract/Free Full Text]

    Hey J and Wakeley J. (1997) A coalescent estimator of the population recombination rate. Genetics 145:833–846.[Abstract]

    Hill WG and Weir BS. (1988) Variances and covariances of squared linkage disequilibria in finite populations. Theor Popul Biol 33:54–78.[CrossRef][Web of Science][Medline]

    Hilton H and Gaut BS. (1998) Speciation and domestication in maize and its wild relatives: evidence from the Globulin-1 gene. Genetics 150:863–872.[Abstract/Free Full Text]

    Hilton H, Kliman M, Hey J. (1994) Using hitchhiking genes to study adaptation and divergence during speciation within the Drosophila melanogaster species complex. Evolution 48:1900–1913.[CrossRef][Web of Science]

    Hudson RR. (2002) Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics 18:337–338.[Abstract/Free Full Text]

    Hudson RR and Kaplan NL. (1985) Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111:147–164.[Abstract/Free Full Text]

    Hudson RR, Kreitman M, Aguade M. (1987) A test of neutral molecular evolution based on nucleotide data. Genetics 116:153–159.[Abstract/Free Full Text]

    Ingvarsson PK. (2005) Nucleotide polymorphism and linkage disequilibrium within and among natural populations of European aspen (Populus tremula L. Salicaceae). Genetics 169:945–953.[Abstract/Free Full Text]

    Khush GS. (1997) Origin, dispersal, cultivation and variation of rice. Plant Mol Biol 35:25–34.[CrossRef][Web of Science][Medline]

    Kimura M. (1980) A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol 16:111–134.[CrossRef][Web of Science][Medline]

    Kishino H and Hasegawa M. (1989) Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in hominoidea. J Mol Evol 29:170–179.[CrossRef][Web of Science][Medline]

    Liu A and Burke JM. (2006) Patterns of nucleotide diversity in wild and cultivated sunflower. Genetics 173:321–330.[Abstract/Free Full Text]

    Londo JP, Chiang YC, Hung KH, Chiang TY, Schaal BA. (2006) Phylogeography of Asian wild rice, Oryza rufipogon, reveals multiple independent domestications of cultivated rice, Oryza sativa. Proc Natl Acad Sci USA 103:9578–9583.[Abstract/Free Full Text]

    Lu BR, Naredo MEB, Juliano AB, Jackson MT. (2000) Preliminary studies on taxonomy and biosystematics of the AA genome Oryza species (Poaceae). In Jacobs SW and Everett LJ (Eds.). Grasses: systematics and evolution (CSIRO, Melbourne (Australia)) pp. 51–58.

    Lu BR, Zheng KL, Qian HR, Zhuang JY. (2002) Genetic differentiation of wild relatives of rice as assessed by RFLP analysis. Theor Appl Genet 106:101–106.[Web of Science][Medline]

    Machado CA, Kliman RM, Markert JA, Hey J. (2002) Inferring the history of speciation from multilocus DNA sequence data: the case of Drosophila pseudoobscura and close relatives. Mol Biol Evol 19:472–488.[Abstract/Free Full Text]

    Morishima H. (2001) Evolution and domestication of rice. (International Rice Research Institute, Manila (Philippines)).

    Morishima H, Sano Y, Oka HI. (1984) Differentiation of perennial and annual types due to habitat conditions in the wild rice Oryza perennis. Plant Syst Evol 144:119–135.[CrossRef]

    Morishima H, Sano Y, Oka HI. (1992) Evolutionary studies in cultivated rice and its wild relatives. Oxf Surv Evol Biol 8:135–184.

    Morrell PL, Lundy KE, Clegg MT. (2003) Distinct geographic patterns of genetic diversity are maintained in wild barley (Hordeum vulgare ssp. spontaneum) despite migration. Proc Natl Acad Sci USA 100:10812–10817.

    Nayar NM. (1973) Origin and cytogenetics of rice. Adv Genet 17:153–292.

    Nei M. (1987) Molecular evolutionary genetics. (Columbia University Press, New York).

    Nordborg M. (2000) Linkage disequilibrium, gene trees and selfing: an ancestral recombination graph with partial self-fertilization. Genetics 154:923–929.

    Nordborg M and Innan H. (2002) Molecular population genetics. Curr Opin Plant Biol 5:69–73.

    Nordborg M and Tavare S. (2002) Linkage disequilibrium: what history has to tell us. Trends Genet 18:83–90.

    Normile D. (1997) Archaeology: Yangtze seen as earliest rice site. Science 275:309.

    Oka HI. (1988) Origin of cultivated rice. (Japan Scientific Societies Press, Tokyo (Japan)).

    Olsen KM, Caicedo AL, Polato N, McClung A, McCouch S, Purugganan MD. (2006) Selection under domestication: evidence for a sweep in the rice Waxy genomic region. Genetics 173:975–983.

    Olsen KM and Purugganan MD. (2002) Molecular evidence on the origin and evolution of glutinous rice. Genetics 162:941–950.

    Ramos-Onsins SE, Stranger BE, Mitchell-Olds T, Aguade M. (2004) Multilocus analysis of variation and speciation in the closely related species Arabidopsis halleri and A. lyrata. Genetics 166:373–388.

    Remington DL, Thornsberry JM, Matsuoka Y, Wilson LM, Whitt SR, Doebley J, Kresovich S, Goodman MM, Buckler ES 4th. (2001) Structure of linkage disequilibrium and phenotypic associations in the maize genome. Proc Natl Acad Sci USA 98:11479–11484.

    Ren F, Lu BR, Li S, Huang J, Zhu Y. (2003) A comparative study of genetic relationships among the AA-genome Oryza species using RAPD and SSR markers. Theor Appl Genet 108:113–120.

    Rozas J, Sanchez-DelBarrio JC, Messeguer X, Rozas R. (2003) DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19:2496–2497.

    Savolainen O, Langley CH, Lazzaro BP, Freville H. (2000) Contrasting patterns of nucleotide polymorphism at the alcohol dehydrogenase locus in the outcrossing Arabidopsis lyrata and the selfing Arabidopsis thaliana. Mol Biol Evol 17:645–655.

    Second G. (1982) Origin of the genic diversity of cultivated rice (Oryza spp.): study of the polymorphism scored at 40 isozyme loci. Jpn J Genet 57:25–57.

    Semon M, Nielsen R, Jones MP, McCouch SR. (2005) The population structure of African cultivated rice Oryza glaberrima (Steud.): evidence for elevated levels of linkage disequilibrium caused by admixture with O. sativa and ecological adaptation. Genetics 169:1639–1647.

    Sun CQ, Wang XK, Li ZC, Yoshimura A, Iwata N. (2001) Comparison of the genetic diversity of common wild rice (Oryza rufipogon Griff.) and cultivated rice (O. sativa L.) using RFLP markers. Theor Appl Genet 102:157–162.

    Swofford D. (2001) PAUP: phylogenetic analysis using parsimony (and other methods), version 4.0b10. (Sinauer Associates, Sunderland (MA)).

    Tajima F. (1989) Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–595.

    Tanksley SD and McCouch SR. (1997) Seed banks and molecular maps: unlocking genetic potential from the wild. Science 277:1063–1066.

    Tenaillon MI, Sawkins MC, Long AD, Gaut RL, Doebley JF, Gaut BS. (2001) Patterns of DNA sequence polymorphism along chromosome 1 of maize (Zea mays ssp. mays L.). Proc Natl Acad Sci USA 98:9161–9166.

    Tenaillon MI, U'Ren J, Tenaillon O, Gaut BS. (2004) Selection versus demography: a multilocus investigation of the domestication process in maize. Mol Biol Evol 21:1214–1225.

    Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG. (1997) The Clustal X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25:4876–4882.

    Tiffin P and Gaut BS. (2001) Sequence diversity in the tetraploid Zea perennis and the closely related diploid Z. diploperennis: insights from four nuclear loci. Genetics 158:401–412.

    Vaughan D, Morishima H, Kadowaki K. (2003) Diversity in the Oryza genus. Curr Opin Plant Biol 6:139–146.

    Vaughan DA. (1989) The genus Oryza L: current status of taxonomy, Philippines. (International Rice Research Institute, Manila (Philippines)).

    Vaughan DA and Morishima H. (2003) Biosystematics of the genus Oryza. In Smith CW (Ed.). Rice: origin, history, technology, and production (John Wiley & Sons, Inc, Hoboken, NJ) pp. 27–65.

    Voight BF, Adams AM, Frisse LA, Qian Y, Hudson RR, Di Rienzo A. (2005) Interrogating multiple aspects of variation in a full resequencing data set to infer human population size changes. Proc Natl Acad Sci USA 102:18508–18513.

    Watterson GA. (1975) On the number of segregating sites in genetical models without recombination. Theor Popul Biol 7:256–276.

    Weiss G and von Haeseler A. (1998) Inference of population history using a likelihood approach. Genetics 149:1539–1546.

    White GM, Hamblin MT, Kresovich S. (2004) Molecular evolution of the phytochrome gene family in sorghum: changing rates of synonymous and replacement evolution. Mol Biol Evol 21:716–723.

    White SE and Doebley JF. (1999) The molecular evolution of terminal ear1, a regulatory gene in the genus Zea. Genetics 153:1455–1462.

    Wright SI, Bi IV, Schroeder SG, Yamasaki M, Doebley JF, McMullen MD, Gaut BS. (2005) The effects of artificial selection on the maize genome. Science 308:1308–1310.

    Wright SI and Gaut BS. (2005) Molecular population genetics and the search for adaptive evolution in plants. Mol Biol Evol 22:506–519.

    Wright SI, Lauga B, Charlesworth D. (2003) Subdivision and haplotype structure in natural populations of Arabidopsis lyrata. Mol Ecol 12:1247–1263.

    Xia X and Xie Z. (2001) DAMBE: software package for data analysis in molecular biology and evolution. J Hered 92:371–373.

    Yamanaka S, Nakamura I, Nakai H, Sato YI. (2003) Dual origin of the cultivated rice based on molecular markers of newly collected annual and perennial strains of wild rice species, Oryza nivara and O. rufipogon. Genet Resour Crop Evol 50:529–538.

    Yoshida K and Miyashita N. (2005) Nucleotide polymorphism in the Adh2 region of the wild rice Oryza rufipogon. Theor Appl Genet 111:1215–1228.

    Yoshida K, Miyashita NT, Ishii T. (2004) Nucleotide polymorphism in the Adh1 locus region of the wild rice Oryza rufipogon. Theor Appl Genet 109:1406–1416.

    Zeder MA, Emshwiller E, Smith BD, Bradley DG. (2006) Documenting domestication: the intersection of genetics and archaeology. Trends Genet 22:139–155.

    Zhang L-B and Ge S. (2007) Multilocus analysis of nucleotide variation and speciation in Oryza officinalis and its close relatives. Mol Biol Evol doi:10.1093/molbev/msl204.

    Zhu Q and Ge S. (2005) Phylogenetic relationships among A-genome species of the genus Oryza revealed by intron sequences of four nuclear genes. New Phytol 167:249–265.

Accepted for publication January 4, 2007.

