Molecular Biology and Evolution 18:1343-1352 (2001)
© 2001 Society for Molecular Biology and Evolution
The Frequency Distribution of Nucleotide Variation in Drosophila simulans
Section of Evolution and Ecology, University of California at Davis
| Abstract |
|---|
Patterns of codon bias in Drosophila suggest that silent mutations can be classified into two types: unpreferred (slightly deleterious) and preferred (slightly beneficial). Results of previous analyses of polymorphism and divergence in Drosophila simulans were interpreted as supporting a mutation-selection-drift model in which slightly deleterious, silent mutants make significantly greater contributions to polymorphism than to divergence. Frequencies of unpreferred polymorphisms were inferred to be lower than frequencies of other silent polymorphisms. Here, I analyzed additional D. simulans data to reevaluate the support for these ideas. I found that D. simulans has fixed more unpreferred than preferred mutations, suggesting that this lineage has not been at mutation-selection-drift equilibrium at silent sites. Frequencies of polarized unpreferred polymorphisms are not skewed toward rare alleles. However, frequencies of unpolarized unpreferred codons are lower in high-bias genes than in low-bias genes. This supports the idea that unpreferred codons are borderline deleterious mutations. Purifying selection on silent sites appears to be stronger at twofold-degenerate codons than at fourfold-degenerate codons. Finally, I found that X-linked polymorphisms occur at a higher average frequency than polymorphisms on chromosome arm 3R, even though an average X-linked site is significantly less likely to be polymorphic than an average site on 3R. This result supports a previous analysis of D. simulans indicating different population genetics of X-linked versus autosomal mutations.
| Introduction |
|---|
An early result of theoretical population genetics was the expected frequency distribution of mutations under a neutral, equilibrium model of evolution (e.g., Wright 1938
Recent analyses of nucleotide polymorphism and divergence at eight genes from Drosophila simulans and its close relatives have led to three hypotheses regarding the frequency distribution of nucleotide polymorphisms (Akashi 1996, 1999; Akashi and Schaeffer 1997). The first hypothesis is that roughly equal numbers of preferred and unpreferred codons (mutations) have fixed along the D. simulans lineage. The observation is consistent with the notion that codon bias is not evolving in D. simulans (i.e., that D. simulans is at equilibrium for codon bias). The second hypothesis (Akashi and Schaeffer 1997) is that unpreferred polymorphisms segregate at significantly lower frequencies than preferred polymorphisms in D. simulans. The third hypothesis is that replacement polymorphisms are skewed toward rare alleles in D. simulans (Akashi 1996, 1999). According to this worldview, many unpreferred polymorphisms and replacement polymorphisms in D. simulans belong to a special category of "borderline" alleles. These alleles have selection coefficients such that Ns, the product of the effective population size and the selection coefficient, is close to 1. Selection on such alleles is sufficiently weak that they can reach appreciable frequencies, yet sufficiently strong that they are unlikely to reach high frequencies or fix (Kimura 1983; Ohta 1992).
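To give a sense of scale for such "borderline" alleles, the sketch below evaluates the classical diffusion approximation for the fixation probability of a new mutation relative to a neutral one, 4Ns / (1 - e^(-4Ns)). This formula is not taken from the present paper. It assumes a diploid population, additive (genic) selection, small |s|, and an effective size equal to the census size; the function name is illustrative only.

```python
import math


def relative_fixation_probability(ns: float) -> float:
    """Fixation probability of a new mutation relative to a neutral mutation.

    Assumption: diffusion approximation 4Ns / (1 - exp(-4Ns)), with `ns` the
    product Ns (negative for deleterious mutations, positive for beneficial).
    """
    scaled = 4.0 * ns
    if abs(scaled) < 1e-12:
        return 1.0  # neutral limit of the expression
    # -expm1(-x) equals 1 - exp(-x) but is more accurate for small x
    return scaled / -math.expm1(-scaled)


if __name__ == "__main__":
    for ns in (-3.0, -1.0, 0.0, 1.0):
        print(f"Ns = {ns:+.1f}: relative fixation probability = "
              f"{relative_fixation_probability(ns):.5f}")
```

Under these assumptions, a mutation with Ns = -1 fixes at roughly 7% of the neutral rate, which illustrates why such alleles may segregate as polymorphisms yet contribute little to divergence.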
A weakness of the D. simulans data, as acknowledged by Akashi (1996, 1999), is that support for a significant skew toward rare amino acid polymorphisms is based on data from only a few genes. Only three of the eight genes analyzed by Akashi (1996) harbored amino acid polymorphism. Of the nine singleton amino acid polymorphisms, five were from the period locus. We would be unwise to draw general conclusions about the frequency distribution of amino acid polymorphisms from so few data. Given that greater numbers of silent polymorphisms were observed in D. simulans, conclusions on their frequency distribution would seem to be more sound. Nevertheless, period data account for about 30% of the derived singleton unpreferred polymorphisms in the D. simulans data analyzed by Akashi and Schaeffer (1997). If period data are excluded, there is no significant skew toward rare, unpreferred alleles in D. simulans (one-tailed Mann-Whitney U test; P = 0.11). This dependence of the statistical results on period data could be indicative of locus effects or could be attributable to reduced power associated with removal of a large amount of data from the analysis.
The conclusion of roughly equal numbers of preferred and unpreferred fixations is based on observation of only 27 mutations (Akashi 1999). The observation of 14 unpreferred and 13 preferred fixations (Akashi 1999) is compatible with an equilibrium model (i.e., 50% of the fixations preferred and 50% unpreferred). However, this observation is also compatible with an underlying model with highly asymmetric fixation rates of the two mutant types. For example, the observation of 14 unpreferred and 13 preferred fixations in D. simulans is compatible with an underlying model of 65% unpreferred and 35% preferred fixations (two-tailed binomial probability; P = 0.22). That is, although the equilibrium model was not rejected with the available D. simulans data, this should not be construed as strong support for the model.
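The binomial argument above can be sketched with an exact test on the 14 unpreferred fixations out of 27. The text does not state which two-tailed convention was used; the sketch assumes the common "double the smaller tail" rule, and the function names are illustrative.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def two_tailed_binomial(k: int, n: int, p: float) -> float:
    """Two-tailed exact binomial probability.

    Assumption: the two-tailed value is twice the smaller of the two
    one-tailed probabilities (each including the observed count), capped at 1.
    """
    lower = sum(binom_pmf(i, n, p) for i in range(0, k + 1))
    upper = sum(binom_pmf(i, n, p) for i in range(k, n + 1))
    return min(1.0, 2.0 * min(lower, upper))


if __name__ == "__main__":
    unpreferred, total = 14, 27  # counts quoted in the text (Akashi 1999)
    for expected_unpreferred in (0.50, 0.65):
        p_value = two_tailed_binomial(unpreferred, total, expected_unpreferred)
        print(f"model with {expected_unpreferred:.0%} unpreferred fixations: "
              f"P = {p_value:.2f}")
```

With this convention, neither the 50:50 model nor the 65:35 model is rejected, which is the point of the paragraph: 27 fixations carry little power to distinguish them.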
In general, the previously available data from D. simulans were insufficient to draw strong conclusions on the frequency distribution. Here, I reexamine the frequency distributions of different types of mutations from a larger sample of D. simulans codons (Begun and Whitley 2000b) and use the results to make inferences on the causes of variation in D. simulans populations.
| Materials and Methods |
|---|
Drosophila simulans alleles analyzed here are those reported in Begun and Whitley (2000b)
The criteria of Sharp and Lloyd (1993) were used to assign codons to putative fitness classes, preferred and unpreferred. Following Akashi (1995, 1996), I analyzed the frequency distribution of polarized mutations. Polarized polymorphisms are those for which parsimony can be used to infer which of two alleles at a polymorphic codon is ancestral. Drosophila melanogaster and D. yakuba served as outgroups for the D. simulans data. I used a haphazardly selected allele from each of the two outgroup species for all inferences of the ancestral state in D. simulans. When each of the outgroup codons was identical to one of the segregating D. simulans codons, the outgroup codon was inferred to be the (monomorphic) codon in the hypothetical ancestral D. simulans population. Fixations along the D. simulans lineage were inferred when all D. simulans alleles had a particular base at a given site and both outgroups shared the same base, which was different from the base present in D. simulans. Changes from preferred alleles to unpreferred alleles are referred to as unpreferred mutations, while changes from unpreferred to preferred alleles are referred to as preferred (i.e., higher fitness) mutations. Codons harboring more than one mutation in the sample of three species were excluded from all analyses. Many of Akashi's analyses focused on silent mutations assigned to either of two fitness categories. Here, I also analyzed "no-change" mutations, defined as unpreferred-to-unpreferred changes, or preferred-to-preferred changes. These mutations are hypothesized to have lesser fitness effects than mutations between categories. Replacement mutations were polarized in the same way as silent mutations for the purposes of estimating frequencies, although they were not assigned to presumptive fitness categories. Polarized polymorphisms can have frequencies between 1/n and (n - 1)/n, where n is the number of sampled alleles.
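The polarization rules described above can be sketched as a small function. This is one reading of the procedure, not the author's code: it assumes both outgroup codons must agree, that the site is silent (replacement changes are not handled), and that the caller supplies the set of preferred codons. The set used in the example is a placeholder and not the Sharp and Lloyd (1993) table.

```python
from collections import Counter
from typing import Optional, Sequence, Set, Tuple


def classify_site(
    sim_codons: Sequence[str],
    mel_codon: str,
    yak_codon: str,
    preferred: Set[str],
) -> Optional[Tuple[str, float]]:
    """Return (mutation class, derived frequency) or None if not polarizable."""
    if mel_codon != yak_codon:
        return None  # outgroups disagree: ancestral state is ambiguous
    ancestral = mel_codon
    counts = Counter(sim_codons)

    if len(counts) == 1:
        (only,) = counts
        if only == ancestral:
            return None  # no change on the D. simulans lineage
        derived, freq = only, 1.0  # fixation
    elif len(counts) == 2 and ancestral in counts:
        derived = next(c for c in counts if c != ancestral)
        freq = counts[derived] / len(sim_codons)  # polymorphism
    else:
        return None  # more than two alleles, or ancestral codon not segregating

    # Exclude codons that differ at more than one nucleotide position.
    if sum(a != b for a, b in zip(ancestral, derived)) != 1:
        return None

    anc_pref, der_pref = ancestral in preferred, derived in preferred
    if anc_pref and not der_pref:
        return "unpreferred", freq
    if der_pref and not anc_pref:
        return "preferred", freq
    return "no-change", freq


if __name__ == "__main__":
    placeholder_preferred = {"CTG"}  # hypothetical, for illustration only
    sample = ["CTG"] * 6 + ["CTA"] * 2
    print(classify_site(sample, "CTG", "CTG", placeholder_preferred))
```

In the example, the outgroups carry the codon that is also the more common D. simulans allele, so the rarer codon is inferred to be derived, at frequency 2/8.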
I also analyzed unpolarized mutations. This approach has at least two advantages. First, there is no inference regarding the ancestral state, and thus no potential uncertainty or bias introduced into the analysis. Second, many more codons are available for analysis. For the purposes of this paper, most unpolarized analyses are on the frequencies of unpreferred codons. Unpolarized unpreferred polymorphisms can have frequencies ranging from 1/n to (n - 1)/n. Codons for which there were more than two alleles were excluded from the analysis. For silent versus replacement polymorphism frequencies (no parsing of silent mutations into fitness classes), the frequency of a codon was taken as the frequency of the less common allele.
For some analyses, I assessed the effect of codon bias on frequency of mutations by dividing the D. simulans genes into "higher-bias" and "lower-bias" categories. Higher-bias genes were defined as having an effective number of codons (ENC; Wright 1990) below the median ENC (43.8) of D. simulans genes in the data (appendix A); lower-bias genes had ENCs above the median. The v and nos loci had the median ENC for the data; v was haphazardly assigned to the lower-bias category, while nos was assigned to the higher-bias category (none of the results are sensitive to this assignment). A more powerful approach for assessing the effect of bias on frequencies might result if we omit genes of intermediate codon bias. Therefore, in some analyses, I included only the genes having ENC values near the tails of the ENC distribution for all the data. For analyses of polarized mutations, the following genes were assigned to the high-bias category: Tpi, Hsc70, G6pd, mir, per, Pgd, and sn; the low-bias category included ry, hyd, dec-1, and Cp190. The mean ENCs for these high- and low-bias categories were 33.6 and 55.1, respectively, compared to a mean ENC of 44.2 for the 40 D. simulans genes from Begun and Whitley (2000b). For analyses of unpolarized polymorphisms, the high-bias genes included Yp3, Yp2, and sqh in addition to the above set. Low-bias genes for unpolarized analyses included those used in the polarized data, as well as fzo, mei-9, otu, Gld, mei-218, AATS, and ovo.
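The median split on ENC can be sketched as follows. Because a lower ENC means stronger codon bias, genes below the median fall in the higher-bias category. The gene names and ENC values in the example are invented for illustration and are not the values in appendix A; genes exactly at the median are returned separately so that they can be assigned by hand, as was done for v and nos.

```python
from statistics import median
from typing import Dict, List, Tuple


def split_by_median_enc(
    enc_by_gene: Dict[str, float],
) -> Tuple[float, List[str], List[str], List[str]]:
    """Split genes into higher-bias (ENC below median) and lower-bias sets.

    Returns (median ENC, higher-bias genes, lower-bias genes, genes at median).
    """
    cutoff = median(enc_by_gene.values())
    higher_bias = sorted(g for g, enc in enc_by_gene.items() if enc < cutoff)
    lower_bias = sorted(g for g, enc in enc_by_gene.items() if enc > cutoff)
    at_median = sorted(g for g, enc in enc_by_gene.items() if enc == cutoff)
    return cutoff, higher_bias, lower_bias, at_median


if __name__ == "__main__":
    # Hypothetical ENC values, for illustration only.
    example = {"geneA": 31.0, "geneB": 40.5, "geneC": 47.2, "geneD": 56.9}
    cutoff, higher, lower, ties = split_by_median_enc(example)
    print(f"median ENC = {cutoff}")
    print("higher-bias:", higher)
    print("lower-bias:", lower)
    print("at median:", ties)
```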
| Results |
|---|
Average Frequencies of Unpolarized Silent and Replacement Polymorphisms
Table 1 shows the frequencies of silent and replacement polymorphisms in D. simulans. Overall frequencies (±SE) of silent and replacement polymorphisms are 0.262 (±0.004) and 0.260 (±0.012), respectively; there is no evidence that silent and replacement polymorphisms occur at different average frequencies in our sample. Mean frequencies (±SE) of X-linked and 3R polymorphisms (silent + replacement) are 0.273 (±0.007) and 0.256 (±0.005), respectively.
Average Frequency of Polarized Mutations
Table 2 shows the average frequencies among 350 polymorphisms from three silent categories and from the replacement category. A Kruskal-Wallis test of mutants in the four categories does not reject the null hypothesis of equal distributions (P = 0.39). Average frequencies of unpreferred versus preferred polymorphisms are not significantly different (Mann-Whitney; P = 0.16). The frequencies of unpreferred versus preferred polymorphisms are not significantly different for the X-linked genes (P = 0.16) or for the 3R genes (P = 0.54) considered separately.
When fixed mutations (frequency = 1.0) are included in the estimation of mean frequencies, the frequency of preferred mutants (n = 89, mean = 0.69) is significantly higher (Mann-Whitney; P < 0.0001) than the frequency of unpreferred mutants (n = 275, mean = 0.49). The difference between the results on mean frequencies with versus without fixations is easily understandable from the observation that the ratio of preferred to unpreferred fixations is much higher than the ratio of preferred to unpreferred polymorphisms.
Under the mutation-selection-drift model of silent-site evolution, genes under stronger selection for codon bias might be expected to show a greater skew toward rare alleles for unpreferred mutations (Akashi 1999; McVean and Charlesworth 1999). Figure 1, a scatterplot of codon bias in D. simulans genes (ENC) versus the average frequency of unpreferred polymorphisms (per gene), reveals no effect of codon bias on the average frequency of unpreferred polymorphisms. Although the mean frequency of unpreferred polymorphisms is lower in higher-bias genes (n = 114 polymorphisms, frequency = 0.307) than in lower-bias genes (n = 90 polymorphisms, frequency = 0.333), the difference is not significant (Mann-Whitney; P = 0.16). There is no difference in the average frequencies of unpreferred polymorphisms in high-bias genes (n = 59 polymorphisms) versus low-bias genes (n = 25 polymorphisms) (Mann-Whitney; P = 0.47). The ratios of unpreferred to preferred polymorphisms are not significantly different in the higher-bias (114:19) versus lower-bias (90:24) genes; neither are the ratios significantly different in the high-bias (59:8) versus low-bias (25:6) genes. Overall, there is little evidence that selection has heterogeneous effects on the mean frequencies of derived polymorphisms across categories of mutants.
Frequency Distribution of Polarized Mutations
Polarized polymorphisms and fixations from different categories were assigned to one of six frequency classes: >0.0 and ≤0.20, >0.20 and ≤0.40, >0.40 and ≤0.60, >0.60 and ≤0.80, >0.80 and <1.0, and 1.0 (table 3). The 4 x 5 contingency table of polymorphisms (X and 3R data pooled) is not significantly heterogeneous (P = 0.96). Thus, there is no support for a skew toward rare alleles for unpreferred polymorphisms, or for any difference in frequency distribution across mutational types. Table 4 shows the distributions of preferred and unpreferred polymorphisms and fixations in higher-bias versus lower-bias genes. The 4 x 5 contingency table of preferred versus unpreferred polymorphisms in higher-bias versus lower-bias genes is not significantly heterogeneous (P = 0.22). The 4 x 6 contingency table that includes fixations is significantly heterogeneous (P = 0.002). Table 5 shows the frequency distribution of preferred and unpreferred mutations for genes from the extreme bias categories, high versus low. A homogeneity test of the frequencies of unpreferred mutations in the high- versus low-bias genes is marginally significant when the fixations are included (P = 0.05) but is not significant when only the polymorphic mutations are used (P = 0.23). The ratios of derived singleton to nonsingleton polymorphisms for unpreferred (90:114) versus preferred (15:28) mutants are not significantly different (G-test; P = 0.31).
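The assignment of derived mutations to the six frequency classes can be sketched as below. The sketch assumes that each of the first four classes is closed on the right (a frequency of exactly 0.20 falls in the first class) and that fixations (frequency 1.0) form their own class. Exact fractions are used so that boundary frequencies are not misplaced by floating-point rounding; the function and label names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from typing import Iterable, Tuple

CLASS_LABELS = (
    ">0.0-0.20", ">0.20-0.40", ">0.40-0.60", ">0.60-0.80", ">0.80-<1.0", "1.0",
)


def frequency_class(derived_count: int, n_alleles: int) -> int:
    """Return the index (0-5) of the frequency class for a derived mutation."""
    if not 0 < derived_count <= n_alleles:
        raise ValueError("derived_count must be between 1 and n_alleles")
    freq = Fraction(derived_count, n_alleles)
    if freq == 1:
        return 5  # fixation
    for index, upper in enumerate((Fraction(1, 5), Fraction(2, 5),
                                   Fraction(3, 5), Fraction(4, 5))):
        if freq <= upper:
            return index
    return 4  # above 0.80 but still polymorphic


def tabulate(sites: Iterable[Tuple[int, int]]) -> Counter:
    """Count sites per frequency class; each site is (derived_count, n_alleles)."""
    return Counter(CLASS_LABELS[frequency_class(k, n)] for k, n in sites)


if __name__ == "__main__":
    # Hypothetical sites, for illustration only.
    print(tabulate([(1, 8), (2, 10), (3, 10), (7, 8), (8, 8)]))
```

One row of counts per mutation class, produced this way, gives the kind of contingency table analyzed in tables 3 through 5.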
Frequencies of Derived X-Linked Versus Autosomal Polymorphisms
For each of four mutant classes, X-linked alleles occur at higher average frequencies than alleles on chromosome arm 3R (table 2). Considering each of the 350 polarized polymorphisms as an independent observation, the difference in average frequency between chromosomes is highly significant (Mann-Whitney; P = 0.004). The higher frequency of X-linked polymorphisms is consistent across preferred, unpreferred, no-change, and replacement variants. If one calculates the average frequency of polarized polymorphisms for each gene, the average is significantly higher for X-linked genes (0.389, n = 10) than for 3R genes (0.304, n = 13) (Mann-Whitney U; P = 0.01). Tajima's (1989)
Tests of Polymorphism and Divergence
Tables 6 and 7 show the numbers of polarized polymorphisms and fixations in D. simulans. The 2 x 4 contingency tables are significantly heterogeneous with all the data (P < 0.001) and with the Relish and G6pd data excluded (P < 0.001). There is strong evidence that both G6pd and Relish have undergone adaptive protein evolution in the D. simulans lineage. Therefore, significant heterogeneity of the data in table 7 shows that large numbers of excess amino acid fixations in Relish and G6pd (Eanes et al. 1996; Begun and Whitley 2000a) do not account for the result. As one would suspect from inspection of table 7, the ratio of polymorphic to fixed mutations is not significantly heterogeneous for the unpreferred, no-change, and replacement mutations (P = 0.23). Thus, the main cause of the significant rejection of homogeneity in this table is that the ratio of preferred fixations to polymorphisms is significantly greater than the ratio observed for the other mutant classes.
A lineage/gene at equilibrium for codon bias is expected to fix equal numbers of preferred and unpreferred mutations. A binomial test of the equilibrium hypothesis that 50% of D. simulans fixations are preferred is significant for 23 genes (table 6; P = 0.021, two-tailed), and for 21 genes (table 7; Relish and G6pd excluded, P = 0.004, two-tailed). Sixteen genes deviate from the expectation of equal numbers of preferred and unpreferred fixations in D. simulans; of these, 11 deviate in the direction of more unpreferred than preferred fixations, while only five deviate in the opposite direction. These results support the notion that there has been a decline of codon bias along the D. simulans lineage, although the effect is not as pronounced as it is in D. melanogaster (Akashi 1996
Silent-site divergence was estimated by counting all silent mutants that fixed along the D. simulans lineage; polymorphism data from D. simulans, as well as outgroup data from D. melanogaster and D. yakuba, were used in the analysis. Figure 2 shows that there is no correlation between codon bias (ENC) and the silent-site divergence in the D. simulans lineage. An earlier study showing a similar result did not examine the D. simulans lineage separately (Powell and Moriyama 1997). Silent divergence for 10 X-linked genes (0.029) was slightly greater than the divergence for 3R genes (0.021); the difference was marginally significant (P = 0.04) by a Mann-Whitney test. However, there was no difference in the ratio of unpreferred to preferred fixations for X-linked (34:28) versus 3R (37:18) genes.
The data from tables 4 and 5 can be used to ask whether variation in levels of overall codon bias affects the ratio of polymorphic to fixed unpreferred mutations. The 2 x 2 contingency table of polymorphic versus fixed unpreferred mutants is not significantly heterogeneous for the higher-bias versus lower-bias genes. However, the corresponding table for the more extreme high-bias versus low-bias comparison was significantly heterogeneous (P = 0.03).
Frequency of Unpolarized Unpreferred Codons
There were 422 codons that were polymorphic for an unpreferred codon and a preferred codon (codons with allele frequencies of 0.5 were excluded). Of these, the rarer allele was unpreferred at 285 codons. Unpolarized data can be used directly in tests to determine if unpreferred codons are maintained at low frequency by natural selection. Under the mutation-selection-drift model, the degree of codon bias reflects the intensity of purifying selection at silent sites. The frequency of the unpreferred codon was calculated for each of 452 codons (n = 40 genes) in which one allele was unpreferred and one was preferred. Figure 3 shows the relationship between ENC and the frequency of unpreferred alleles per gene; the two variables are significantly correlated (Spearman correlation; P = 0.005). Furthermore, the average frequency of unpreferred codons is marginally significantly lower (Mann-Whitney; P = 0.04) in the higher-bias genes (mean = 0.213) than in the lower-bias genes (mean = 0.252). The same is true for the 10 most biased (mean frequency of unpreferred alleles per gene = 0.219) versus the 10 least biased (mean frequency of unpreferred alleles per gene = 0.286) genes among the 40 D. simulans genes (Mann-Whitney; P = 0.03). These analyses support the notion that frequencies of unpreferred codons are depressed by purifying selection. Further support for this notion comes from categorization of unpreferred polymorphisms into two categories, singletons versus nonsingletons. There are 85 singletons and 111 nonsingletons for higher-bias genes; there are 61 singletons and 195 nonsingletons for lower-bias genes. The proportion of unpreferred polymorphisms that are singletons is significantly greater in higher-bias than in lower-bias genes (G-test; P < 0.0001), as one would expect if purifying selection depresses frequencies of unpreferred codons more effectively in higher-bias genes.
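The singleton versus nonsingleton comparison can be sketched as a G-test of independence on a 2 x 2 table. The sketch applies no continuity or Williams correction, since the text does not say whether one was used; the P value comes from the chi-square distribution with one degree of freedom, computed with the complementary error function. The function name is illustrative.

```python
import math
from typing import Tuple


def g_test_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """G-test of independence for the table [[a, b], [c, d]].

    Returns (G statistic, P value); assumes no correction and 1 degree of freedom.
    """
    table = ((a, b), (c, d))
    total = a + b + c + d
    row_sums = (a + b, c + d)
    col_sums = (a + c, b + d)

    g = 0.0
    for i in range(2):
        for j in range(2):
            observed = table[i][j]
            if observed > 0:  # a zero cell contributes nothing to the sum
                expected = row_sums[i] * col_sums[j] / total
                g += 2.0 * observed * math.log(observed / expected)
    g = max(g, 0.0)  # guard against tiny negative values from rounding

    # Survival function of chi-square with 1 df: P(X > g) = erfc(sqrt(g / 2))
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value


if __name__ == "__main__":
    # Singletons and nonsingletons in higher-bias and lower-bias genes (from the text).
    g_stat, p = g_test_2x2(85, 111, 61, 195)
    print(f"G = {g_stat:.2f}, P = {p:.2g}")
```

Applied to the counts quoted above, the uncorrected test gives a P value below 0.0001, in line with the reported result.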
Twofold-Degenerate Versus Fourfold-Degenerate Codons
The previous analyses of silent mutations did not distinguish between those at twofold-degenerate versus fourfold-degenerate codons. The frequency of derived silent mutations (preferred + unpreferred + no change) is lower for the twofold (n = 90 polymorphisms, mean = 0.284) than for the fourfold (n = 195, mean = 0.343) codons; however, the difference is only marginally significant (Mann-Whitney; P = 0.04). As one might expect given this result and given that most D. simulans polymorphisms are unpreferred, the mean frequency of derived, unpreferred polymorphisms at twofold codons (n = 70, mean = 0.269) is lower than the corresponding frequency at fourfold codons (n = 115, mean = 0.339); the difference is marginally significant by a Mann-Whitney test (P = 0.04). In spite of the marginally significant difference in means, the frequency distribution of unpreferred polymorphisms in twofold versus fourfold codons (table 8) is not significantly different (P = 0.45). Table 9 shows the frequencies of unpolarized unpreferred codons in twofold-degenerate codons from genes of higher versus lower degrees of codon bias. This 2 x 5 contingency table is significantly heterogeneous (P = 0.013). Frequencies of unpolarized unpreferred codons in higher-bias genes are skewed toward rare alleles compared with the distribution for lower-bias genes, as predicted if these codons are maintained at low frequency by purifying selection.
Potential Biases
Conclusions from polarized data would be weakened if the subset of genes for which we have D. yakuba data differed in some important way from a random sample of genes (e.g., those analyzed in Begun and Whitley [2000b]
/silent divergence is not significantly different for X versus 3R genes included in this study (P = 0.09). Genes evolving more quickly are less likely to be included in the study of polarized mutations, because such genes are expected to have greater PCR failure rates in D. yakuba when PCR primers are designed from D. melanogaster sequence. This could result in a biased sample. However, under the neutral model, genes that evolve more slowly are expected to be less polymorphic. Therefore, if a bias were expected, one would imagine that the D. simulans samples for which D. yakuba data were available would tend to be less polymorphic than a random set of genes successfully amplified and sequenced from D. simulans. This is not the case. In fact, the silent-site divergences between D. melanogaster and D. simulans for genes with versus without a successfully isolated D. yakuba sequence are not significantly different (P = 0.80). It seems likely, therefore, that the difference between the X-linked genes analyzed here and the entire set of X-linked genes analyzed in Begun and Whitley (2000b)
versus the average frequency of polarized polymorphisms for each gene (the frequency of a polymorphism does not affect its contribution to
). Genes, mutations in which can be polarized, show no correlation between the variables. Therefore, there is no reason to believe that conclusions from this paper are compromised by sampling artifacts. Another potential bias in the analysis of codon bias comes from the inference of ancestral states. A codon can be used in the analysis of polarized mutations only if there is a single mutation in the history of a sample that includes three species, D. simulans, D. melanogaster and D. yakuba. Therefore, we expect such codons to be evolving more slowly than a random sample of codons. If codons evolve more slowly because they experience stronger selection for codon bias, then we expect their frequency spectrum to be more skewed toward excess rare unpreferred polymorphisms compared with average D. simulans codons (Akashi 1996
| Discussion |
|---|
Akashi (1996)
The results presented here are similar to Akashi's in that the ratio of polarized unpreferred to preferred polymorphisms is much greater than the ratio of unpreferred to preferred fixations. If one attributes this result to "too many" unpreferred polymorphisms, then a plausible explanation for such an excess is that unpreferred polymorphisms are borderline deleterious mutations (i.e., 1 < Ns < 3) (Akashi 1996). The contribution of such mutations to polymorphism is expected to be greater than their contribution to divergence (e.g., Ohta and Kimura 1971; Kimura 1983; Ohta 1992). Their frequencies in samples are expected to be lower than frequencies of neutral polymorphisms or borderline beneficial polymorphisms (Akashi 1999). Previous analyses of polymorphisms from D. simulans provided little support for skewed frequency distributions for mutants of various putative fitness classes. The results from polarized polymorphisms in D. simulans presented here also provide little support for heterogeneity of frequencies between unpreferred, preferred, or amino acid polymorphisms. On the other hand, analysis of unpolarized polymorphisms from higher-bias versus lower-bias genes provides the best evidence to date for a skew toward rare alleles for unpreferred polymorphisms. Excess numbers of unpreferred polymorphisms and the skew toward rare alleles for unpolarized unpreferred codons provide complementary support for the notion that borderline mutations make significant contributions to silent variation in D. simulans. The rather different results for the unpolarized versus polarized mutations are, however, a bit troubling. This is especially true given that one expects biases arising from analysis of polarized mutations to result in a greater likelihood of detecting skews toward rare alleles for unpreferred mutations. A possible explanation for the discrepancy is that analysis of unpolarized unpreferred mutations is more powerful because there are greater numbers of unpolarized mutations (452) than of polarized mutations (204). Given the results in table 4, it would not be surprising if larger samples of polarized unpreferred polymorphisms from higher-bias versus lower-bias genes supported a skew toward rare alleles in higher-bias genes.
The analyses presented here support the idea that selection at silent sites is stronger at twofold codons than at fourfold codons. A reasonable interpretation is that fourfold codons sometimes (or often) have more than two potential fitness classes. Assume the least fit allele at a fourfold codon is as deleterious as the less fit allele at a twofold codon. If this is true, then we expect the average unpreferred allele at a fourfold codon to be selected more weakly than the average unpreferred allele at a twofold codon. Kreitman and Antezana (2000) noted that the rank order of alternative codon frequency for most four-codon families was conserved between D. melanogaster and D. pseudoobscura. This suggests that there are more than two fitness classes and as many as four fitness classes for some codon families. Frequencies of polymorphisms for twofold and fourfold codons in D. simulans support this hypothesis. If true, the hypothesis predicts that many "no-change" mutations are very weakly deleterious (although some may also be weakly beneficial). The observation that the ratio of polymorphic to fixed no-change mutations is similar to the ratio for unpreferred mutations (tables 6 and 7) is consistent with the hypothesis that the two types of mutations have similar distributions of selection coefficients. The summary of codon use in high-bias D. melanogaster genes given in Kreitman and Antezana (2000) was used to assign a fitness ranking based on relative abundance. The number of fitness classes was equal to the size of the codon family (two-, three-, or fourfold). All D. simulans mutations previously assigned to the no-change category, along with those that had not been assigned any category based on the analysis of Sharp and Lloyd (1993), were reclassified as preferred or unpreferred based on these rankings. Among the reclassified polymorphic mutations, roughly twice as many are to "lower-fitness" codons (49) as to "higher-fitness" codons (23). Among the reclassified fixed mutations, 12 are to lower-fitness codons, while 9 are to higher-fitness codons. Although the 2 x 2 contingency table is not significantly heterogeneous, the configuration is in the same direction as for the unpreferred mutations. This, too, is consistent with the idea that categorization of silent mutations into two categories is overly simplistic.
Begun and Whitley (2000b) suggested that reduced X-linked versus autosomal polymorphism in D. simulans is best explained by stronger effects of positively selected mutants on the X chromosome. The result reported here, that conditioned on a site being polymorphic in a sample, X-linked polymorphisms occur at a higher frequency than those on 3R (table 2), is another distinguishing feature of X-linked versus autosomal variation in this species. Further theoretical research is required to determine which models of linked selection may be able to account for these data (e.g., Gillespie 1997; Fay and Wu 2000).
Sequencing surveys and microsatellite analyses of D. simulans are indicative of small but significant differentiation between populations and slightly higher levels of variation in African versus United States D. simulans (Irvin et al. 1998; Hamblin and Veuille 1999). Inferences on the dynamics of mutations in D. simulans populations reported here rely on comparisons of different mutant classes or comparison of mutations on different chromosomes. Because deviations from population equilibrium are expected to affect all weakly selected sites in the genome in a similar manner, such comparisons remain useful. Nevertheless, theoretical studies would be required to confirm that deviations from equilibrium have only minor effects on the behavior of the tests carried out for this paper.
Cargill et al. (1999) measured polymorphism in 106 human genes, with an average sample size of 114 alleles per gene. They found that replacement polymorphisms occurred at a significantly lower average frequency than silent polymorphisms, primarily because replacement polymorphisms were overrepresented among the class of very rare alleles. They attributed this observation to stronger purifying selection against replacement polymorphisms than against silent polymorphisms. Determining whether the frequency distributions of replacement polymorphism in Drosophila populations and human populations are similar would require sampling of larger numbers of D. simulans alleles.
| Acknowledgements |
|---|
I thank the anonymous reviewers and D. Rand for useful comments. This work was supported by NIH GM55298 and by the Alfred P. Sloan Foundation.
| Footnotes |
|---|
David M. Rand, Reviewing Editor
1 Keywords: Drosophila, DNA variation, population genetics, molecular evolution, natural selection
2 Address for correspondence and reprints: David Begun, Section of Evolution and Ecology, University of California, Davis, California 95616. E-mail: djbegun{at}ucdavis.edu
| References |
|---|
Akashi H., 1994. Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics 136:927-935
Akashi H., 1995. Inferring weak selection from patterns of polymorphism and divergence at silent sites in Drosophila. Genetics 139:1067-1076
Akashi H., 1996. Molecular evolution between Drosophila melanogaster and D. simulans: Reduced codon bias, faster rates of amino acid substitution, and larger proteins in D. melanogaster. Genetics 144:1297-1307
Akashi H., 1999. Inferring the fitness effects of DNA mutations from polymorphism and divergence data: statistical power to detect directional selection under stationarity and free recombination. Genetics 151:221-238
Akashi H., S. W. Schaeffer, 1997. Natural selection and the frequency distribution of "silent" DNA polymorphism in Drosophila. Genetics 146:295-307
Begun D. J., P. Whitley, 2000a. Adaptive evolution of RELISH, a Drosophila NF-κB/IκB protein. Genetics 154:1231-1238
Begun D. J., P. Whitley, 2000b. Reduced X-linked nucleotide polymorphism in Drosophila simulans. Proc. Natl. Acad. Sci. USA 97:5960-5965
Bulmer M., 1991. The selection-mutation-drift theory of synonymous codon usage. Genetics 129:897-907
Cargill M., D. Altshuler, J. Ireland, et al. (17 co-authors), 1999. Characterization of single-nucleotide polymorphisms in coding regions of human genes. Nat. Genet. 22:231-238
Charlesworth B., M. T. Morgan, D. Charlesworth, 1993. The effect of deleterious mutations on neutral molecular variation. Genetics 134:1289-1303
Eanes W. F., M. Kirchner, J. Yoon, C. H. Biermann, I. N. Wang, M. A. McCartney, B. C. Verrelli, 1996. Historical selection, amino acid polymorphism and lineage-specific divergence at the G6pd locus in Drosophila melanogaster and D. simulans. Genetics 144:1027-1041
Fay J. C., C.-I. Wu, 2000. Hitchhiking under positive Darwinian selection. Genetics 155:1405-1413
Gillespie J. H., 1997. Junk ain't what junk does: neutral alleles in a selected context. Gene 205:291-299
Hamblin M., M. Veuille, 1999. Population structure among African and derived populations of Drosophila simulans: evidence for ancient subdivision and recent admixture. Genetics 153:305-317
Irvin S. D., K. A. Wetterstrand, C. M. Hutter, C. F. Aquadro, 1998. Genetic variation and differentiation at microsatellite loci in Drosophila simulans: evidence for founder effects in new world populations. Genetics 150:777-790
Kimura M., 1983. The neutral theory of molecular evolution. Cambridge University Press, Cambridge, England
Kreitman M., M. Antezana, 2000. Population and evolutionary genetics of codon usage in Drosophila. Pp. 82-101 in R. Singh and C. Krimbas, eds. Evolutionary genetics: from molecules to morphology. Cambridge University Press, Oxford, England
Li W.-H., 1987. Models of nearly neutral mutations with particular implications for nonrandom usage of synonymous codons. J. Mol. Evol. 24:337-345
McVean G. A. T., B. Charlesworth, 1999. A population genetic model for the evolution of synonymous codon usage: patterns and predictions. Genet. Res. 74:145-158
McVean G. A. T., J. Vieira, 1999. The evolution of codon preference in Drosophila: a maximum-likelihood approach to parameter estimation and hypothesis testing. J. Mol. Evol. 49:63-75
Maruyama T., P. A. Fuerst, 1984. Population bottlenecks and nonequilibrium models in population genetics. I. Allele numbers when populations evolve from zero variability. Genetics 108:745-763
Ohta T., 1992. The nearly neutral theory of molecular evolution. Annu. Rev. Ecol. Syst. 23:263-286
Ohta T., M. Kimura, 1971. On the constancy of the evolutionary rate of cistrons. J. Mol. Evol. 1:18-25
Powell J. R., E. N. Moriyama, 1997. Evolution of codon usage bias in Drosophila. Proc. Natl. Acad. Sci. USA 94:7784-7790
Rozas J., R. Rozas, 1999. DnaSP 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15:174-175
Sharp P. M., A. T. Lloyd, 1993. Codon usage. Pp. 378-397 in G. Maroni, ed. An atlas of Drosophila genes: sequences and molecular features. Oxford University Press, Oxford, England
Sturtevant A. H., 1929. Contributions to the genetics of Drosophila simulans and Drosophila melanogaster. Publ. Carnegie Inst. 399:1-62
Tajima F., 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585-595
Takano-Shimizu T., 1999. Local recombination and mutation effects on molecular evolution in Drosophila. Genetics 153:1285-1296
True J. R., J. M. Mercer, C. C. Laurie, 1996. Differences in crossover frequency and distribution among three sibling species of Drosophila. Genetics 142:507-523
Wright S., 1938. The distribution of gene frequencies under irreversible mutation. Proc. Natl. Acad. Sci. USA 24:253-259
Wright F., 1990. The "effective number of codons" used in a gene. Gene 87:23-29