MBE Advance Access originally published online on December 13, 2007
Molecular Biology and Evolution 2008 25(2):454-467; doi:10.1093/molbev/msm275
Research Articles
Contrasting the Efficacy of Selection on the X and Autosomes in Drosophila
Department of Molecular Biology and Genetics, Cornell University
E-mail: aml69{at}cornell.edu.
| Abstract |
|---|
To investigate the relative efficacy of both positive and purifying natural selection on the X chromosome and the autosomes in Drosophila, we compared rates and patterns of molecular evolution between these chromosome sets using the newly available alignments of orthologous genes from 12 species. Parameters that may influence the relative X versus autosomal substitution rates include the relative effective population sizes, the male and female germline mutation rates, the distribution of allelic effects on fitness, and the degree of dominance of novel mutations. Our analysis reveals that codon usage bias is consistently greater for X-linked genes, suggesting that purifying selection consistently has greater efficacy on the X chromosome than on the autosomes across the Drosophila phylogeny. However, our results are less consistent with respect to the efficacy of positive selection, with only some lineages showing a higher substitution rate on the X chromosome. This suggests that either the distribution of selective effects of mutations or other relevant parameters are sufficiently variable across species to tip the balance in different ways in individual lineages. These data suggest that rates of substitution are not solely governed by adaptive evolution. This genome-wide analysis provides a clear picture that the efficacy of selection varies intragenomically and that this effect is markedly more consistent across the phylogeny in the case of purifying selection. Our results also suggest that simple models that predict systematic differences in rates of evolution between the X and the autosomes can only be made to be compatible with these Drosophila data if the relevant population genetic parameters that drive substitution rates differ among species and chromosomal contexts.
Key Words: faster-X; positive selection; purifying selection; Drosophila
| Introduction |
|---|
The efficacy of natural selection depends on a number of different parameters such as the magnitude of selection coefficients, the distribution of dominance coefficients of allelic variants, and the effective population size. When the effects of selection are relatively weak, the selective effects of linked mutations may also contribute, making factors such as recombination rate and chromosomal location important determinants of rates of fixation of beneficial and deleterious alleles. Because it is affected by a multitude of factors, the efficacy of natural selection can vary widely among species but may vary across genomic contexts within a species as well. In particular, for taxa such as Drosophila in which males are hemizygous for the X chromosome, the increased visibility of novel mutations to selective forces in males may lead to an overall increase in the efficacy of natural selection on the X chromosome relative to the autosomes. However, the reduced effective size of the X chromosome relative to the autosomes, assuming equal effective numbers of breeding males and females, implies that weakly selected variants on the X may have their dynamics mediated to a greater degree by neutral drift. This may ultimately temper the heightened efficacy of selection on the X chromosome arising from hemizygosity of the X chromosome in males.
Increased rates of substitution on the X chromosome resulting from increases in the efficacy of positive selection on this chromosome arise under a variety of conditions. Table 1 presents theoretical ratios of substitution rates of the X chromosome to the autosomes in a single-locus model with selection coefficients sf and sm in females and males, respectively, mutation rates µm and µf in males and females, respectively, and dominance parameter h (Charlesworth et al. 1987; Vicoso and Charlesworth 2006). This theory suggests that if novel mutations are on average at least partially recessive (defined here as 0 < h < 0.5), then rates of adaptive evolution on the X chromosome should exceed those on the autosomes (traditionally referred to as the "faster-X" hypothesis) (Avery 1984; Charlesworth et al. 1987); this inequality holds for both small and large coefficients of selection (Betancourt et al. 2004).
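As a rough numerical illustration of the faster-X prediction, the sketch below evaluates the X-to-autosome ratio of adaptive substitution rates for the simplest special case only: new beneficial mutations, equal selection coefficients in the two sexes (sf = sm = s), equal male and female mutation rates, equal numbers of breeding males and females, and weak selection. Under those assumptions the standard approximation reduces to (2h + 1)/(4h). The function name `faster_x_ratio` is ours, and the code is not meant to reproduce every entry of table 1.

```python
def faster_x_ratio(h: float) -> float:
    """X/autosome ratio of adaptive substitution rates for new mutations.

    Assumptions of this sketch: equal selection in both sexes, equal male and
    female mutation rates, equal numbers of breeding males and females, and
    weak selection, so that the fixation probability of a new beneficial
    mutation is about twice its average selective advantage.
    """
    if not 0 < h <= 1:
        raise ValueError("dominance coefficient h must lie in (0, 1]")
    # Autosomes: 2N copies, advantage h*s in heterozygotes
    #   -> rate ~ 2N*mu * 2*h*s = 4*N*mu*h*s
    # X: 1.5N copies, average advantage (2*h*s + s)/3 (2/3 of time in females)
    #   -> rate ~ 1.5N*mu * 2*(2h + 1)*s/3 = N*mu*(2h + 1)*s
    return (2 * h + 1) / (4 * h)


for h in (0.1, 0.25, 0.5, 1.0):
    print(f"h = {h:<4}  K_X / K_A = {faster_x_ratio(h):.3f}")
```

In this special case the ratio exceeds 1 whenever h < 0.5, equals 1 at h = 0.5, and falls to 0.75 for fully dominant mutations.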
It is important to note that the above predictions are based on the assumptions of equal numbers of breeding males and females, equal mean and variance in reproductive success for the sexes (i.e., no segregating variation in fitness apart from newly arisen mutations), and identical distributions of selection and dominance coefficients acting on novel mutations. If selection acts on mutations already segregating in the population, then rates of adaptive evolution on the autosomes will exceed those on the X chromosome (Charlesworth et al. 1987).
Increases in the efficacy of purifying selection on the X chromosome, in contrast, are predicted to have an opposing effect on rates of substitution under certain conditions (table 1). More efficient removal of deleterious alleles will reduce the substitution rate of deleterious mutations for those mutations that are at least partially recessive (Charlesworth et al. 1987), which would lead to decreased rates of substitution on the X chromosome. Moreover, because codon bias is modulated by weak selection on synonymous sites and is a consequence of selection-drift-mutation balance (Sharp and Li 1986; Bulmer 1991; Akashi 1997; McVean and Charlesworth 1999), an increase in the efficacy of purifying selection on the X also predicts that codon bias should be greater for X-linked genes than for autosomal genes.
The newly available whole-genome sequences of 12 Drosophila species afford us an unprecedented opportunity to assess differences in the relative efficacy of selection on the X chromosome and the autosomes. Not only can we explore potential variation in the efficacy of selection between the X and autosomes within each of these 12 species but we can also examine this question in a clade-specific context for the first time, at a genomic scale. Because we can generate a priori predictions regarding the effects of an increased efficacy of positive selection on rates of substitution, we examined potential differences in the efficacy of positive selection between the X and the autosomes by comparing substitution rates of X-linked and autosomal genes. The availability of genomic sequence from so many closely related species facilitates identification of genes evolving under positive selection across the phylogeny in addition to genes with lineage-specific increases in evolutionary rate. This affords us the opportunity to test for increased efficacy of positive selection on the X chromosome in subsets of rapidly evolving genes.
Our results provide strong support for an increased efficacy of purifying selection on the X chromosome across the Drosophila phylogeny. However, there does not appear to be a strong signal of an increased efficacy of positive selection on the X chromosome in these species, as rates of substitution are not systematically increased on this chromosome. The results are sensitive to the metric of substitution employed in the comparison and vary considerably among species. We suggest that whereas positive selection may be more efficacious on the X chromosome, adaptive evolution from novel mutations is not sufficiently pervasive to systematically inflate substitution rates on the X chromosome in Drosophila.
| Materials and Methods |
|---|
Coding Sequence Alignments
Two sets of alignments were used for this analysis, both of which are based on the masked alignments as described elsewhere (Drosophila 12 Genomes Consortium 2007). The first set comprises the 8510 alignments from the melanogaster group and was used for estimates of ω, dN, and dS in the melanogaster group (described below). The second set includes 6698 genes with a single ortholog in all 12 fully sequenced Drosophila genomes. This set was used for inferences of amino acid divergence and codon bias across the Drosophila phylogeny (described below).
Evolutionary Analysis
Estimates of ω (the ratio dN/dS), dN, and dS were obtained for each of the 8510 alignments in the melanogaster group from branch models run in PAML (version 3.1) (Yang 1998). The codon substitution model used here makes a number of assumptions whose validity we discuss throughout the paper. An assumption in this model is that there is no heterogeneity among sites in selection pressure (i.e., ω is constant across sites). These models allowed us to obtain branch-specific estimates of evolutionary rate parameters. Five branch tests were used, where for each test, one terminal lineage of the melanogaster subgroup was allowed to have a different ω than the rest of the melanogaster group phylogeny. Note that Drosophila ananassae was included in the phylogeny, but given the near saturation at synonymous sites, we did not use branch models for this lineage.
We used 2 methods to estimate rates of adaptation on a particular branch of the phylogeny. Our first method was to assess the rate of evolution of each gene on a terminal branch relative to the rest of the phylogeny and identify genes that show a significant relative acceleration in evolutionary rate on that branch using the branch-specific codon substitution models described above. To test for significant differences in ω between each terminal lineage and the rest of the tree, we performed likelihood ratio tests (LRT) assuming that the LRT statistic follows a χ² distribution. Inspection of the distribution of the LRT statistic revealed that this assumption is appropriate, as it conforms well to the χ² distribution. A significant P value for this test coupled with the observation that the terminal branch ω exceeds the estimate of ω for the rest of the phylogeny indicates a branch-specific acceleration for a gene. Our second method was to test for positive selection across the entire phylogeny. This test compares models that allow ω to vary among sites (M7 and M8) and identifies genes that show support for a class of codons within the gene with ω > 1; P values were obtained from a null distribution of LRT statistics generated by simulation (for a full description of methods, see Drosophila 12 Genomes Consortium 2007 or Larracuente et al., forthcoming). Any genes that had a significant deceleration on a particular lineage from the branch models were removed from that species' analysis. All PAML results (including P values) are available for download at FlyBase (ftp://ftp.flybase.net/12_species_analysis).
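The branch-acceleration criterion described above can be sketched as follows. This is a minimal illustration, not the pipeline used for the paper: it assumes that the two nested models differ by a single free parameter (1 degree of freedom), that the log likelihoods and ω estimates have already been read from PAML output, and that significance is judged at a threshold `alpha` of our choosing. The function names are hypothetical.

```python
import math


def lrt_pvalue_1df(lnl_null: float, lnl_alt: float) -> float:
    """P value for a likelihood ratio test with 1 degree of freedom.

    lnl_null: log likelihood of the simpler model (one omega for the tree).
    lnl_alt:  log likelihood of the richer model (separate omega on one branch).
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    if stat <= 0:
        return 1.0  # no improvement in fit from the extra parameter
    # For a chi-square variate with 1 df, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def is_accelerated(lnl_null, lnl_alt, omega_branch, omega_rest, alpha=0.05):
    """Branch-specific acceleration: significant LRT and a higher branch omega."""
    significant = lrt_pvalue_1df(lnl_null, lnl_alt) < alpha
    return significant and omega_branch > omega_rest


# Hypothetical values for one gene on one terminal branch.
print(lrt_pvalue_1df(-1000.0, -998.0))
print(is_accelerated(-1000.0, -998.0, omega_branch=0.30, omega_rest=0.10))
```

A significant test with a lower terminal-branch ω would instead flag a deceleration, which is the case excluded from the per-species analyses.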
We also estimated amino acid divergence for orthologous sequences in the set of 6698 alignments for all 12 sequenced Drosophila genomes. We used a model implemented in the CODEML package in PAML that translates codons to amino acids to estimate amino acid divergence for each branch of the phylogeny. We report the terminal branch lengths for individual species with 2 exceptions. Because of the short terminal branch lengths for Drosophila persimilis and Drosophila pseudoobscura, the amino acid divergence on the shared branch immediately preceding the split of these 2 lineages was added to the terminal amino acid divergence of each of these 2 species. Thus, the intragenomic comparisons of rates of evolution are expected to be similar for D. persimilis and D. pseudoobscura given our methodology.
Estimates of amino acid divergence for clades were obtained by summing relevant internal and external branches for only those genes whose Muller element locations had been conserved. For instance, for the melanogaster complex clade, we included each of the 3 terminal lineages in addition to the shared sechellia/simulans lineage. Only the 1878 genes for which the tree topology with Drosophila yakuba and Drosophila erecta as sister species had highest support were included in these clade-specific analyses. This was out of necessity, as these species do not form a clade in the other tree topologies. This approach does limit the impact of phylogenetic incongruence across the genome, as it focuses the analysis on those genes that do not appear to bear strong signatures of lineage sorting. However, one issue we cannot account for is phylogenetic incongruence within a given gene. Although this can pose challenges for phylogenetic inference (Wong et al. 2007), we believe that this issue will introduce noise rather than a systematic bias in our results. For the analyses of pairs of orthologs in which the gene is X-linked in one species and autosomal in the other, we compared the relative amino acid divergence. The relative amino acid divergence was calculated for each gene as the amino acid divergence for the clade-specific branch for that gene normalized by the mean amino acid divergence across genes for that clade.
To estimate divergence at 4-fold degenerate sites, we extracted the 4-fold degenerate sites from the alignments of coding sequences in the melanogaster group and used BASEML with an unrooted tree to estimate terminal branch lengths. We restricted ourselves to those genes with at least 100 four-fold degenerate sites; in total, we have divergence estimates for 4712 genes. For clade-specific analyses on divergence at 4-fold degenerate sites, only the 3068 genes for which the tree topology with D. yakuba and D. erecta as sister species had highest support were included, and genes whose Muller element locations were not conserved across species were not considered.
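The extraction of 4-fold degenerate sites can be sketched as below. The exact inclusion rule is not spelled out in the text, so this sketch makes an assumption: a codon column is kept only if every aligned sequence carries the same ungapped two-base prefix from one of the eight 4-fold degenerate codon families of the standard genetic code. The function names are hypothetical, and the resulting third-position alignment would then be passed to BASEML.

```python
# Codon prefixes whose third position is 4-fold degenerate in the standard code
# (Leu CTN, Val GTN, Ser TCN, Pro CCN, Thr ACN, Ala GCN, Arg CGN, Gly GGN).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def fourfold_sites(aligned_cds):
    """Return, per sequence, the third-position bases at 4-fold degenerate codons.

    aligned_cds: list of equal-length, in-frame, codon-aligned coding sequences.
    A codon column is kept only if all sequences share the same 4-fold prefix
    and have an unambiguous third base (an assumption of this sketch).
    """
    length = len(aligned_cds[0])
    if any(len(s) != length for s in aligned_cds) or length % 3:
        raise ValueError("sequences must be codon-aligned and of equal length")
    kept = ["" for _ in aligned_cds]
    for i in range(0, length, 3):
        codons = [s[i:i + 3].upper() for s in aligned_cds]
        prefixes = {c[:2] for c in codons}
        shared_fourfold = len(prefixes) == 1 and prefixes <= FOURFOLD_PREFIXES
        if shared_fourfold and all(c[2] in "ACGT" for c in codons):
            kept = [k + c[2] for k, c in zip(kept, codons)]
    return kept


def enough_sites(aligned_cds, minimum=100):
    """Apply the filter of at least 100 four-fold degenerate sites per gene."""
    return len(fourfold_sites(aligned_cds)[0]) >= minimum


print(fourfold_sites(["ATGCTAGGA", "ATGCTGGGC"]))
```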
Lineage-specific evolutionary parameter estimates were based on the tree topology with the highest likelihood. For the clade-specific analysis, we restricted the analysis to the 4925 genes for which the tree topology with D. yakuba and D. erecta as sister species had highest support, and we further restricted ourselves to the subset of genes whose Muller element locations have been conserved across species. For each gene, the clade-specific branch lengths were obtained by summing relevant internal and external branches of the phylogeny (see above).
Estimates of ω for the 4-species comparisons were taken from PAML model M0 based on 2-species alignments of orthologous sequences in Drosophila melanogaster/Drosophila simulans and D. pseudoobscura/D. persimilis species pairs. We generated those alignments by extracting the appropriate sequences from the multispecies alignments in the data set of 6698 genes with single orthologs in all 12 genomes. We limited this analysis to the subset of genes for which Muller element locations had been conserved in all 4 species.
Statistics and Multiple Test Correction
All reported P values are based on 2-tailed tests unless specifically stated otherwise. Given the number of statistical comparisons performed, we employed several different corrections for multiple testing. For the M7 versus M8 comparison, we controlled the false discovery rate (FDR) by estimating q values (Storey and Tibshirani 2003) using the qvalue package in R (described in Drosophila 12 Genomes Consortium 2007; Larracuente et al., forthcoming). Unless otherwise stated, we used an FDR threshold of 0.1, implying that the set of genes satisfying this criterion is expected to include 10% false positives. For the other statistical comparisons requiring correction for multiple tests, we used Holm's method for sequential Bonferroni correction (Holm 1979); these P values are referred to as "adjusted" P values throughout the text.
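For readers unfamiliar with Holm's sequential Bonferroni procedure, a minimal sketch of how adjusted P values are obtained is given below. It is a generic implementation of the step-down method, not the code used for the paper, and the function name `holm_adjust` is ours.

```python
def holm_adjust(pvalues):
    """Holm (1979) step-down adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest P value (k = rank + 1) is multiplied by m - k + 1.
        candidate = min(1.0, (m - rank) * pvalues[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Three hypothetical raw P values from one family of comparisons.
print(holm_adjust([0.01, 0.04, 0.03]))
```

A comparison is then declared significant when its adjusted P value falls below the chosen threshold.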
Genic Features
We estimated the degree of codon bias for all genes in the set of alignments based on orthologous sequences in all 12 genomes. We used a stand-alone implementation of codonW (downloaded from http://codonw.sourceforge.net) and used the codons defined as preferred in D. melanogaster to estimate the frequency of optimal codons (FOP) for each gene in each species. The application of preferred codon definitions from D. melanogaster to the remaining species in the genus is not likely to adversely affect our results, as codon preferences appear highly conserved across the phylogeny (Vicario et al. 2007; Drosophila 12 Genomes Consortium 2007).
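To make the FOP statistic concrete, the sketch below computes it as the number of preferred codons divided by the number of codons that have synonymous alternatives. It is an illustration only: codonW's own bookkeeping may differ in detail, the function name is hypothetical, and the preferred-codon set shown is a toy set, not the D. melanogaster table used in the analysis.

```python
# Codons with no synonymous alternative (Met, Trp) and stop codons are ignored.
NON_INFORMATIVE = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def frequency_of_optimal_codons(cds, optimal_codons):
    """FOP = optimal codons / all codons that have synonymous alternatives.

    cds: in-frame coding sequence (DNA).
    optimal_codons: set of codons treated as preferred, supplied by the caller.
    Codons containing ambiguous or masked bases are skipped.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    informative = [
        c for c in codons
        if c not in NON_INFORMATIVE and set(c) <= set("ACGT")
    ]
    if not informative:
        raise ValueError("no informative codons in sequence")
    return sum(c in optimal_codons for c in informative) / len(informative)


# Toy preferred set for illustration only; NOT the real D. melanogaster table.
toy_optimal = {"CTG", "GCC", "AAG"}
print(frequency_of_optimal_codons("ATGCTGGCCAAAGCTTAA", toy_optimal))
```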
We obtained tissue-specific expression data for 7 adult tissues (brain, midgut, hindgut, Malpighian tubule, testis, ovary, accessory gland) from FlyAtlas (www.flyatlas.org) (see Wang et al. 2004; Chintapalli et al. 2007). Expression had been assayed on Affymetrix Dros2 microarrays with 4 independent replicates for each tissue. With the exception of the testis, ovary, and accessory gland, these expression estimates are from tissues dissected from equal numbers of males and females. Specificity of expression was measured by

$$\tau = \frac{\sum_{j=1}^{n} \left( 1 - \frac{\log S_j}{\log S_{\max}} \right)}{n - 1},$$

with S representing the signal intensity (S_max being the highest signal across tissues) and n representing the number of tissues (Yanai et al. 2005). The log(Sj) was set to 0 for any gene detected on 0 or 1 out of 4 arrays for a given tissue. To limit our analysis to genes with no evidence of male-specific expression patterns, genes with τ > 0.9 and expressed in the testes or accessory glands were removed.
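The tissue-specificity index can be sketched as follows, assuming the Yanai et al. (2005) form τ = Σ_j (1 − log S_j / log S_max) / (n − 1), with log S_j set to 0 for tissues in which the gene was detected on 0 or 1 of the 4 arrays. The function name and the `detected` argument are our own conventions for illustration, not part of FlyAtlas.

```python
import math


def tissue_specificity(signals, detected):
    """Tau index: sum_j (1 - log S_j / log S_max) / (n - 1).

    signals:  signal intensity S_j for each of the n tissues.
    detected: booleans; False marks tissues where the gene was detected on
              0 or 1 of the 4 arrays, for which log(S_j) is set to 0.
    """
    n = len(signals)
    if n < 2 or n != len(detected):
        raise ValueError("need matching signal/detection lists for >= 2 tissues")
    logs = [math.log(s) if d else 0.0 for s, d in zip(signals, detected)]
    log_max = max(logs)
    if log_max <= 0:
        raise ValueError("tau is undefined when no tissue has log signal > 0")
    return sum(1.0 - lg / log_max for lg in logs) / (n - 1)


# Hypothetical gene detected only in the third of three tissues.
print(tissue_specificity([50.0, 80.0, 100.0], [False, False, True]))
```

Values near 0 indicate broad expression and values near 1 indicate expression confined to a single tissue, which is why the τ > 0.9 cutoff is combined with testis or accessory gland expression to flag male-specific genes.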
Chromosomal locations of every gene in each species were based on scaffold-to-Muller-element maps kindly provided by A. J. Bhutkar based on methodology described elsewhere (Bhutkar AV, Schaeffer SW, Russo S, Xu M, Smith TF, Gelbart WM, personal communication; Bhutkar et al. 2006). Only genes whose locations could be unambiguously mapped to a particular Muller element in the species under study were included in this analysis. We will refer to Muller element A as the "X" chromosome in all 12 species, and in Drosophila willistoni, D. persimilis, and D. pseudoobscura, Muller element D is referred to as the "neo-X."
| Results and Discussion |
|---|
Because of the hemizygosity of the X chromosome in Drosophila males, the efficacy of natural selection is expected to be greater than that on autosomes provided the adaptive mutations are at least partially recessive (table 1). This should manifest as a systematic molecular evolutionary difference in substitution rate between the X and the autosomes. A great deal of theoretical attention has been devoted to exploring potential differences in rates and patterns of evolution on the sex chromosomes and the autosomes, which has led to several testable hypotheses and explicit empirical tests (for review, see Vicoso and Charlesworth 2006).
Neutral Evolution
Predictions regarding the effects of increased efficacy of natural selection on rates of substitution are sensitive to underlying assumptions regarding mutation rates, life-history characteristics, as well as demography. In particular, the X chromosome should experience higher rates of adaptive substitution than the autosomes assuming equal effective numbers of breeding males and females, that adaptive mutations are new, an adaptive mutation rate on the X chromosome at least equaling that of the autosomes, and at least partial recessivity (Avery 1984; Charlesworth et al. 1987; Betancourt et al. 2004). To assess the validity of these assumptions, we investigated rates of substitution at potentially neutral sites, as several nonselective factors can lead to differences in the neutral substitution rate between the X chromosome and the autosomes. For instance, increased mutation rates in males or females should result in increased rates of substitution at these potentially neutral sites on the autosomes or X chromosome, respectively (table 1). In mammals, the elevated number of mitotic divisions in the male germline results in a higher male mutation rate (Drost and Lee 1995), which reduces X-linked divergence at neutral sites relative to that on the autosomes. In Drosophila, the numbers of mitotic divisions appear to be comparable in the male and female germline (Drost and Lee 1995), though the possibility remains that there are sex-specific differences in underlying mutation rates in this system.
Coupled with differences in mutation rates, differences in the effective number of breeding females and breeding males (mediated by a sex ratio unequal to one and/or differences in the variance of breeding success between males and females) will also influence the neutral substitution rates on the X chromosome and the autosomes. In particular, as the ratio of effective numbers of breeding females to breeding males increases, the ratio of effective sizes of the X and autosomes increases from the expected 3/4 ratio under an assumption of equal numbers of effective males and females and can approach or even exceed unity (Hartl and Clark 2007). Inferences of the sex ratio in natural populations based on polymorphism data are consistent with deviations from the expected 3/4 in several populations of D. melanogaster (Hutter et al. 2007; Singh et al. 2007), though the direction and magnitude of the deviations vary across populations. Following the assumptions of the nearly neutral model (Kimura 1983), increases in effective size decrease the fixation probability of novel nearly neutral (slightly deleterious) mutations and therefore lead to decreases in substitution rates. Finally, due to differences in effective size between the X chromosome and the autosomes, these chromosomes are differentially affected by demographic events such as population bottlenecks: under a model with equal numbers of effective males and females, a population bottleneck is effectively more severe on the X chromosome than on the autosomes (Pool and Nielsen 2007; Wall et al. 2002). Thus, neutral rates of substitution on the X chromosome may be less than, equal, or even exceed rates of substitution on the autosomes given sex-specific mutation rates, demography, and differences in the effective sizes of these chromosomes resulting from breeding dynamics.
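The dependence of the X-to-autosome effective size ratio on the breeding sex ratio can be illustrated with the classical expressions N_eA = 4·Nm·Nf/(Nm + Nf) and N_eX = 9·Nm·Nf/(4·Nm + 2·Nf), where Nf and Nm are the effective numbers of breeding females and males. The sketch below simply evaluates their ratio. It assumes these textbook formulas apply (in particular, no additional variance in reproductive success beyond what Nf and Nm already capture), and the function name is ours.

```python
def xa_effective_size_ratio(n_females: float, n_males: float) -> float:
    """Ratio N_eX / N_eA for given effective numbers of breeding females and males.

    Uses the classical expressions
        N_eA = 4 * Nm * Nf / (Nm + Nf)
        N_eX = 9 * Nm * Nf / (4 * Nm + 2 * Nf)
    whose ratio simplifies to 9 * (Nm + Nf) / (8 * (2 * Nm + Nf)).
    """
    if n_females <= 0 or n_males <= 0:
        raise ValueError("both breeding numbers must be positive")
    return 9 * (n_males + n_females) / (8 * (2 * n_males + n_females))


for nf, nm in ((1, 1), (2, 1), (7, 1), (100, 1), (1, 2)):
    print(f"Nf:Nm = {nf}:{nm}  N_eX / N_eA = {xa_effective_size_ratio(nf, nm):.3f}")
```

Under these formulas the ratio is 3/4 with equal numbers of breeding males and females, reaches unity when breeding females outnumber breeding males 7 to 1, and approaches 9/8 as the excess of females grows.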
To assess potential sex-specific mutational biases as well as the impact of demography and life-history characteristics on neutral substitutional patterns, we investigated rates of evolution at 4-fold degenerate synonymous sites in the 5 species in the melanogaster subgroup for which saturation at synonymous sites has not been reached. These results are presented in figure 1A and table 2 (see also supplementary fig. 1, Supplementary Material online). It is important to note that these synonymous sites are likely subject to selection on codon bias and are thus not truly "neutral" (e.g., Singh et al. 2007; Akashi 1994; Powell and Moriyama 1997). However, within the context of protein-coding sequences, synonymous sites are less constrained than nonsynonymous sites, and because selection on codon bias is weak, we do expect that divergence at 4-fold degenerate sites can shed light onto putative mutation differences between the sexes. Although we acknowledge that selection may shape patterns of divergence at 4-fold degenerate sites to some degree, it is also clear that drift plays a role in influencing the dynamics of frequency changes at these sites. Consequently, we will refer to these 4-fold degenerate synonymous sites as neutral throughout the remainder of this discussion, consistent with the terminology of the nearly neutral model.
Previous analyses of silent site and intron divergence between D. melanogaster and D. simulans at 18 loci were suggestive of comparable neutral substitution rates between the X chromosome and the autosomes (Bauer DuMont and Aquadro 1997).
The pattern of reduced X divergence at neutral sites in D. sechellia and D. simulans is consistent with elevated male mutation rates, although it is unexpected to find this pattern restricted to these 2 lineages. Reduced divergence at 4-fold degenerate synonymous sites on the X chromosome may similarly reflect to some degree greater selective constraint at synonymous sites given the increased codon bias of X-linked genes in Drosophila (Comeron et al. 1999; Hambuch and Parsch 2005; Singh et al. 2005), though again it is unclear why this pattern would be restricted to these lineages in particular. It is important to note that because we are only using a single sequence for each of the studied species, we are conflating polymorphism and divergence. This may be particularly problematic for closely related species pairs such as simulans/sechellia and persimilis/pseudoobscura where a larger proportion of presumed divergent sites are actually polymorphic in one or both species. This is likely to upwardly bias divergence estimates in autosomal genes more so than X-linked genes, as diversity appears reduced on the D. simulans X chromosome (Begun and Whitley 2000), and may thus contribute to the decreased X-linked divergence in D. simulans and D. sechellia. The extent to which these differences between species in relative rates of divergence at these sites are due to variation among species in mutational patterns, breeding structure, demographic history, levels of X-linked and autosomal polymorphism, or patterns of weak selection is unclear, and this is a topic that merits further investigation.
Adaptive Evolution
The efficacy of positive selection can be increased on the X chromosome relative to the autosomes under certain population genetic conditions, leading to increased rates of adaptive substitution on the X chromosome. Specifically, if beneficial alleles are at least partially recessive on average and natural selection acts primarily on novel mutations, then the X chromosome is expected to undergo an increased rate of adaptive evolution relative to the autosomes, assuming equal numbers of effective males and females (Avery 1984; Charlesworth et al. 1987; Betancourt et al. 2004). It appears as though deleterious mutations are partially recessive in Drosophila (for review, see Garcia-Dorado et al. 2004). However, there is little empirical evidence on the distribution of dominance effects of adaptive mutations. Most inferences of the recessivity of beneficial mutations are based on comparing patterns of polymorphism and divergence between X-linked and autosomal genes (Begun and Whitley 2000; Schoefl and Schloetterer 2004; Lu and Wu 2005). For the purposes of this paper, we will present results in the context of expectations given recessivity of beneficial mutations. However, we note that this choice was not informed by knowledge of the distribution of dominance effects of adaptive mutations but instead was guided by the example set by previous investigations on this topic.
Given that recent studies suggest that a substantial fraction of the Drosophila genome is subject to positive selection (Sawyer et al. 2003, 2007; Bierne and Eyre-Walker 2004; Welch 2006; Drosophila 12 Genomes Consortium 2007), we expect to see the effects of any increased efficacy of positive selection on the X chromosome in this system. To date, there have been several empirical tests of this hypothesis in Drosophila, which have yielded inconsistent results (Betancourt et al. 2002; Thornton and Long 2002; Counterman et al. 2004; Musters et al. 2006; Thornton et al. 2006). In general, there have been 2 approaches to investigating an increased efficacy of positive selection in Drosophila: comparing rates of evolution between X-linked and autosomal genes within a genome or using paired comparisons among orthologs between species or paralogs of duplicate genes within species in which one gene is autosomal and one gene is X-linked. With respect to the former, there has been evidence in support of higher substitution rates of X-linked genes (Musters et al. 2006), as well as evidence suggesting comparable rates of evolution on the X and the autosomes (Betancourt et al. 2002). Paired comparisons also yield contradictory results, with some studies revealing faster evolutionary rates for X-linked paralogs/orthologs (Thornton and Long 2002; Counterman et al. 2004) and another showing no such effect (Thornton et al. 2006). The reasons underlying the discrepancies among these paired approaches are unclear, although statistical power and sampling may play roles given the different scales of these studies.
To investigate potential intragenomic differences in the efficacy of positive selection at a large scale, we compared rates of evolution between the X chromosome and the autosomes using orthologous protein-coding sequences aligned in all 12 Drosophila genomes, as well as those genes that were aligned in the 6 species of the melanogaster group. We employ several measures of rates of protein evolution, including amino acid divergence for all 12 species and the ratio of nonsynonymous to synonymous substitution rates (ω) for the melanogaster group. We use both paired and unpaired comparisons to contrast the efficacy of positive selection on the X chromosome and autosomes. Importantly, one potential confounding factor in our analysis is ascertainment bias; we have by necessity limited ourselves to those genes with identifiable orthologs in either the entire phylogeny or within the melanogaster group. In the case of the former, only approximately 1/2 of the genes annotated in D. melanogaster are contained within this data set, and this fraction is certainly biased toward those genes that are evolving sufficiently slowly that orthologs can be readily identified. It is difficult to assess the magnitude of this bias, but assuming that the genes not captured by our analysis evolve similarly to the genes included in our analysis, this ascertainment bias makes our analysis regarding positive selection conservative.
Amino Acid Divergence
We compared rates of amino acid divergence of X-linked and autosomal genes within each of the 12 Drosophila genomes. For most species, the X chromosome and the autosomes evolve at similar rates (fig. 1B, table 3, supplementary fig. 2, Supplementary Material online). However, in D. persimilis when all the autosomal genes are pooled, the X chromosome shows significantly increased rates of amino acid divergence (median amino acid divergence is 0.0797 and 0.0886 for the autosomes and X chromosome, respectively; adjusted P = 0.0008, Mann–Whitney U test). In addition, in D. pseudoobscura and D. persimilis, rates of amino acid divergence on the X chromosome exceed those rates on the neo-X chromosome (D. pseudoobscura: 0.0758 and 0.0819 for the neo-X and X, respectively; adjusted P = 0.044, Mann–Whitney U test; D. persimilis: 0.0833 and 0.0886 for the neo-X and X, respectively; adjusted P = 0.03, Mann–Whitney U test). In D. yakuba and D. simulans, however, autosomal genes show significantly increased rates of amino acid divergence (D. yakuba: 0.0127 and 0.0116 for the autosomes and X, respectively; P = 0.039, Mann–Whitney U test; D. simulans: 0.00219 and 0.00144 for the autosomes and X, respectively; P = 0.029, Mann–Whitney U test). This may in part be due to nonselective effects, as both D. simulans and D. yakuba show increased autosomal divergence at 4-fold degenerate synonymous sites. Notably, although we have reduced confidence in the genome sequences of D. persimilis and D. simulans given their low sequence depth and mosaic assembly, respectively, that the patterns observed in these species are largely echoed in species sequenced to high coverage suggests that these patterns are not artifactual.
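The X versus autosome contrasts reported here rest on the Mann–Whitney U test. A minimal, dependency-free sketch of such a comparison is shown below. It uses the normal approximation without tie or continuity corrections, so its P values are approximate and will not exactly match those from a statistics package. The function name and the divergence values in the usage lines are hypothetical and are not data from this study.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test using the normal approximation.

    Returns (U for sample x, approximate P value). Tied values receive
    average ranks; no tie or continuity correction is applied.
    """
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        mean_rank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        n_from_x = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_x += mean_rank * n_from_x
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    return u, math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical per-gene amino acid divergences (not values from this study).
x_linked = [0.091, 0.085, 0.102, 0.088, 0.095]
autosomal = [0.079, 0.081, 0.090, 0.075, 0.083, 0.077]
u_stat, p_value = mann_whitney_u(x_linked, autosomal)
print(u_stat, p_value)
```

The raw P values from a family of such comparisons would then be passed through Holm's correction to obtain the adjusted P values quoted in the text.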
It is of note that amino acid divergence on the dot chromosome (Muller element F) is significantly elevated relative to the remaining autosomes in D. melanogaster, D. yakuba, D. erecta, D. ananassae, and D. mojavensis (P < 0.048, all comparisons, Mann–Whitney U test), which is consistent with expectation. Given the putative lack of recombination on this chromosome, genes on the dot chromosome should experience increased rates of substitution due to the fixation of deleterious alleles. Because the inclusion of these genes may have made our comparisons of X-linked and autosomal substitution rates overly conservative, we repeated the comparison of amino acid divergence between X-linked and autosomal genes after removing F-linked genes; the results from this analysis are qualitatively identical to those presented above, which reflects in part the small number of F-linked genes (table 3).
Importantly, differences in the gene complements of the X chromosome and the autosomes can also lead to differences in rates of evolution between the 2 chromosome sets. In D. melanogaster, for instance, genes with sex-specific biases in expression pattern appear to be distributed differently throughout the genome. Accessory gland proteins, which play key roles in male reproduction, have a distribution significantly biased toward autosomes (Swanson et al. 2001), and other proteins with male-biased expression patterns are also depleted on the X chromosome (Parisi et al. 2003; Ranz et al. 2003). These genes with sex-biased expression patterns may evolve more rapidly than other types of genes, particularly if they are involved in reproduction, as reproductive genes in Drosophila do appear to evolve rapidly (for review, see Haerty et al. 2007; Swanson and Vacquier 2002; Panhuis et al. 2006). Because these differences in the gene complements of the X and the autosomes may confound our comparisons of evolutionary rates, we repeated our analyses removing genes with male-specific expression patterns (see Materials and Methods). The removal of these genes does not dramatically alter the amino acid divergence results, as D. persimilis still shows evidence in support of an increased substitution rate on the X chromosome whereas D. simulans shows the opposite pattern.
For species that show different rates of amino acid divergence between the X and the pooled autosomes, we can more closely examine heterogeneity in evolutionary rate among Muller elements. In D. persimilis, although median amino acid divergence is higher on the X than on every other individual chromosome arm, the increase is only significant in comparison with Muller elements E and F (adjusted P < 0.036, both comparisons, Mann–Whitney U test). Similarly, in D. simulans, although median amino acid divergence is reduced on the X in relation to all other chromosomes, this decrease is not statistically significant in any single arm comparison (adjusted P > 0.27, all comparisons, Mann–Whitney U test). Likewise, in D. yakuba, rates of evolution are lower on the X than on every autosomal chromosome arm but only significantly so in the comparison with Muller elements B and F (adjusted P < 0.015, both comparisons, Mann–Whitney U test). Thus, the X-associated decreases and increases in rates of amino acid divergence evident in these species appear to be minor at best, as they only achieve significance when all autosomal genes are pooled.
We can also use evolutionary parameter estimates from internal branches of the phylogeny to investigate clade-specific trends in comparative rates of evolution between the X and the autosomes. In the shared obscura group lineage, which includes both the branch leading to D. pseudoobscura and D. persimilis as well as the terminal branches (see Materials and Methods), rates of amino acid divergence are significantly increased on the X chromosome relative to both the autosomes and the neo-X chromosome (median divergence is 0.0827, 0.0854, and 0.0924 for the autosomes, neo-X, and X, respectively; adjusted P < 0.014, both comparisons, Mann–Whitney U test). Interestingly, amino acid divergence on the X chromosome also appears to be elevated in the melanogaster group, as well as in the subgenus Sophophora. Although this result may reflect an increased power to detect subtle differences in X-linked versus autosomal evolutionary rate by pooling information across internal branches, it remains possible that this is not characteristic of the genome because this clade-specific analysis was limited to 1878 genes.
Although there does appear to be interchromosomal variation in the rates of amino acid divergence, the pattern is not consistent across taxa. Whereas the obscura group, melanogaster group, and Sophophora subgenus in general and D. persimilis in particular do provide tentative support for an increased efficacy of positive selection on the ancestral X chromosome relative to the autosomes and the neo-X chromosome, D. yakuba and D. simulans show precisely the opposite, although the magnitude of these effects appears to be quite small. However, there are several other confounding factors that may also contribute to the observed patterns. Recent demographic events such as population bottlenecks are likely to play a role, though little is known about the demographic histories of most of the species studied here. Furthermore, with respect to D. persimilis, there are inversions on both the X and the neo-X chromosomes that may have been fixed by positive selection (Machado et al. 2007); the inversions themselves as well as recent selective pressures on these inversions may also contribute to the pattern of increased rates of evolution on the D. persimilis X and neo-X chromosomes. The extent to which the patterns of amino acid divergence in these species are driven by differences in underlying neutral substitution rate between the X and the autosomes, demographic history, inversions, or selective effects unfortunately remains unclear.
Divergence Estimated by ω
Because amino acid divergence does not contain information regarding the substitution process at synonymous sites, we also compared rates of molecular evolution in the context of the melanogaster subgroup, in which saturation at synonymous sites has not as yet been reached. This is particularly important given that rates of substitution at 4-fold degenerate synonymous sites are consistently lower on the X than on the autosomes in 2 of the 5 species. Specifically, we examined ω, or the ratio of nonsynonymous to synonymous substitution rates per site, for all 5 species in this subgroup (fig. 1C, table 2, supplementary fig. 3, Supplementary Material online). Whereas there are no clade-specific increases in estimates of ω within the melanogaster subgroup, D. sechellia and D. simulans show significant increases in ω for X-linked genes as compared with the pooled autosomal genes (D. sechellia: median ω is 0.0876 and 0.102 for the autosomes and the X, respectively; P = 0.0002, Mann–Whitney U test; D. simulans: 0.0589 and 0.0757 for the autosomes and X, respectively; P = 0.0003, Mann–Whitney U test). This is likely due at least in part to the decrease in neutral substitution rate on the X chromosome in these species, as there are no significant differences in substitution rates at nonsynonymous sites between the X and autosomes in these species (data not shown).
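The last point above is simple arithmetic: because the ratio of nonsynonymous to synonymous substitution rates (dN/dS, written ω) has the synonymous rate in its denominator, a lower synonymous rate raises ω even when the nonsynonymous rate is unchanged. A minimal sketch follows; the function name and the rate values are hypothetical and chosen only for illustration, not taken from the study.

```python
def omega(dn, ds):
    """Ratio of nonsynonymous (dN) to synonymous (dS) substitution rates per site."""
    if ds <= 0:
        raise ValueError("dS must be positive for omega to be defined")
    return dn / ds


# Hypothetical rates: identical dN on both chromosome sets, lower dS on the X.
dn_shared = 0.010
ds_autosomes = 0.100
ds_x = 0.080

omega_autosomes = omega(dn_shared, ds_autosomes)
omega_x = omega(dn_shared, ds_x)
print(omega_autosomes, omega_x, omega_x > omega_autosomes)
```

An elevated ω on the X therefore cannot, on its own, distinguish faster protein evolution from slower synonymous evolution.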
As was the case with amino acid divergence, the inclusion of F-linked genes may have made our analyses overly conservative. We expect estimates of ω to be closer to one on the dot chromosome given the reduced efficacy of purifying selection due to the negligible rate of recombination and small effective size of this chromosome. When we removed F-linked genes and repeated this analysis, the results were found to be entirely consistent with those presented above; this is likely due to the small number of F-linked genes (table 2). Similarly, these results are largely recapitulated when genes with male-specific expression patterns are removed.
Comparing chromosome arms individually indicates that the patterns in these species are not driven by 1 or 2 outlying autosomal chromosome arms. In D. sechellia, estimates of ω are higher on the X than on each of the 4 major autosomal arms and significantly higher than on elements B, C, and E (adjusted P < 0.032, all comparisons, Mann–Whitney U test). Estimates of ω are also significantly higher on the X than on each of the 4 major autosomes in D. simulans (adjusted P < 0.035, all comparisons, Mann–Whitney U test).
These comparisons of rates of molecular evolution at synonymous and nonsynonymous sites within the melanogaster subgroup indicate that when controlling for variation in mutation rate and other factors that contribute to rates of evolution at synonymous sites, there are some species that appear to be evolving in a manner consistent with an increased efficacy of positive selection for X-linked genes. In particular, D. simulans and D. sechellia show higher estimates of ω for genes on the X relative to genes on the autosomes. It does remain possible that the increased estimates of ω in these species are driven at least in part by a reduced synonymous substitution rate arising from increased constraint on synonymous sites particularly in these species or by segregating polymorphisms that are erroneously inferred to be fixed differences in this recently diverged species pair.
Paired Comparisons
A particularly robust test for an increased efficacy of positive selection on the X chromosome is the paired comparison of rates of evolution between pairs of orthologs in which one pair of orthologs is X-linked and the other pair is autosomal (Thornton et al. 2006). To apply this test, we can exploit the autosome–X translocation in D. persimilis and D. pseudoobscura, in which the ancestral Muller element D was fused to Muller element A, resulting in a neo-X chromosome. Thus, genes located on the D element are X-linked in these species, whereas they are autosomal in related species such as D. melanogaster and D. simulans. Under a model in which positive selection is more efficacious on the X chromosome, rates of evolution of D-linked genes should be higher in D. pseudoobscura/D. persimilis than in D. melanogaster/D. simulans. To estimate rates of evolution, we chose to use ω as it takes synonymous substitution rates into account, which is especially important given the underlying differences in neutral substitution rate between the X and the autosomes observed in several species. Using only genes that map unambiguously to Muller element D in all 4 species, we estimated ω in the D. melanogaster/D. simulans paired alignments as well as in D. pseudoobscura/D. persimilis paired alignments. We then compared the number of genes for which estimates of ω are higher in the D. pseudoobscura/D. persimilis comparison versus the D. melanogaster/D. simulans comparison to a baseline standard. Although others have opted to use the ancestral X chromosome (Muller element A) as the baseline (Thornton et al. 2006), we elected to use each major Muller element individually to establish an expectation and perform all possible comparisons. For each major Muller element, the numbers of genes that had higher estimates of ω in the D. melanogaster/D. simulans comparison or in the D. pseudoobscura/D. persimilis comparison are presented in table 4. Consistent with a model of increased efficacy of positive selection on the X chromosome, of the genes that have higher estimates of ω in D. pseudoobscura/D. persimilis than D. melanogaster/D. simulans, a significant excess of these are D-linked when Muller elements B, C, or E are used as the null (P < 0.03, Fisher's exact test, all comparisons).
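The paired design reduces to a 2 × 2 table: rows are Muller element D and a baseline element, and columns count genes whose rate estimate is higher in the D. pseudoobscura/D. persimilis pair versus the D. melanogaster/D. simulans pair. A minimal sketch of a two-sided Fisher's exact test on such a table is given below. The function name and the counts are hypothetical (they are not the values in table 4), and the two-sided rule used, summing all tables no more probable than the observed one, is one common convention rather than necessarily the one the authors applied.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums hypergeometric probabilities of all tables with the same margins
    that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    total = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts (illustration only).
# Rows: element D, baseline element B.
# Columns: genes with higher estimate in pse/per pair, in mel/sim pair.
d_higher_obscura, d_higher_mel = 120, 80
b_higher_obscura, b_higher_mel = 95, 105
p = fisher_exact_two_sided(d_higher_obscura, d_higher_mel,
                           b_higher_obscura, b_higher_mel)
print(round(p, 4))
```

Repeating the test with each autosomal element as the baseline row mirrors the choice described above of using every major Muller element in turn to set the expectation.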
Given that nonsynonymous divergence in this pairwise comparison between D. pseudoobscura and D. persimilis is quite low (mean dN = 0.008; median dN = 0.003), which may limit our power to detect differences in evolutionary rate among chromosomes, we employed a similar approach using clade-specific relative rates of amino acid divergence (see Materials and Methods) to test for an increased efficacy of positive selection on the X chromosome. For each gene, we estimated relative amino acid divergence in the melanogaster subgroup, the obscura group, as well as in D. willistoni, where Muller element D has also independently become X-linked. For each Muller element, we counted the number of genes for which (relative) amino acid divergence was higher in the melanogaster subgroup than in the obscura group and vice versa; we obtained similar counts for the comparison of (relative) amino acid divergence between the melanogaster subgroup and D. willistoni. These counts are presented in tables 5 and 6. In the comparison between the melanogaster subgroup and the obscura group as well as the comparison between the melanogaster subgroup and D. willistoni, there is a significant increase in the number of D-linked genes with higher estimates of (relative) amino acid divergence in the obscura group or in D. willistoni than in the melanogaster subgroup when elements B or E are used as the baseline (P < 0.017, Fisher's exact test, all comparisons).
These paired comparisons of orthologs in which one pair of orthologs is X-linked and the other pair is autosomal are particularly appropriate for testing for a greater efficacy of positive selection on the X chromosome, as the assumption is that the only difference between the pairs of genes is their chromosomal location. Thus, many potentially confounding factors such as gene function and degree of constraint are controlled for. The results from these paired comparisons are suggestive of an increased efficacy of positive selection on the X chromosome. However, the observation that the results are somewhat sensitive to the particular element used to generate the null indicates that rates of evolution are heterogeneous among chromosomes even beyond the X-autosome comparison, which may merit further investigation.
These results are largely consistent with our analysis of amino acid divergence in the obscura group (above), which provided evidence in support of a heightened efficacy of positive selection in both D. pseudoobscura and D. persimilis as well as in the obscura clade. In contrast, the weak support for faster-X in D. willistoni based on paired 4-species comparisons coupled with the lack of significant difference in rates of amino acid divergence of X-linked and autosomal genes indicate that a model of more efficacious positive selection on the X chromosome clearly is insufficient to explain the pattern of divergence in this species.
Rapidly Evolving Genes
Because rates of substitution on the X should exceed those rates on the autosomes for new advantageous alleles, we expect that an X-associated increase in rate of molecular evolution should be especially pronounced for those genes that appear to be evolving under positive selection. To test this hypothesis, we compared estimates of ω for X-linked and autosomal genes in the subset of genes with evidence for positive selection based on the M7 versus M8 PAML comparison (see Materials and Methods). For the putative positively selected genes, when all of the autosomal genes are pooled together, estimates of ω are significantly higher for X-linked genes than for autosomal genes in D. simulans (median ω is 0.108 and 0.160 for the autosomes and X, respectively; P = 0.022, Mann–Whitney U test). However, this result is not recapitulated in arm-by-arm comparisons, as the increase in ω associated with X-linkage in this species is not statistically significant in comparison to any individual chromosome arm (adjusted P > 0.096, all comparisons, Mann–Whitney U test), which may reflect lack of power due to the small number of genes in the positively selected gene subset.
Because the M7 versus M8 comparison is rather stringent and identifies genes that are evolving under positive selection across the entire phylogeny, we also examined the subset of genes that show lineage-specific accelerations in evolutionary rate (see Materials and Methods). When all autosomal genes are pooled together, estimates of ω are significantly increased for X-linked versus autosomal genes in the subset of genes with lineage-specific increases in evolutionary rate for D. melanogaster and D. sechellia (D. melanogaster: median ω is 0.257 and 0.350 for the autosomes and X, respectively; P = 0.039, Mann–Whitney U test; D. sechellia: median ω is 0.399 and 0.468 for the autosomes and X, respectively; P = 0.043, Mann–Whitney U test). Clade-specific analysis shows increased estimates of ω on the X in the D. melanogaster complex as well (median ω is 0.165 and 0.193 for the autosomes and X, respectively; P = 0.039, Mann–Whitney U test). Because of the small sample size in this gene subset, we cannot test each autosomal element against the X chromosome in each species.
The lack of consistent signal could be because many adaptively evolving genes only show evidence for positive selection at a small fraction of their sites (Drosophila 12 Genomes Consortium 2007). It could also be due to a lack of power: because we restricted the analysis to the set of genes with a single ortholog in the melanogaster group, there are fewer genes that show evidence of positive selection and/or have significantly accelerated rates of evolution. We therefore compared the proportion of X-linked and autosomal genes in each of these 2 gene subsets (i.e., positively selected gene subset and subset of genes with lineage-specific accelerations in evolutionary rate) to assess whether the X chromosome was particularly enriched for genes evolving rapidly. In D. sechellia, D. simulans, and D. erecta, there is a marginally significant overabundance of X-linked genes among those genes that show evidence of positive selection across the entire phylogeny (P < 0.089, all comparisons, Fisher's exact test). In addition, within the D. melanogaster species complex, there is a clade-specific, marginally significant overrepresentation of X-linked genes in the positively selected gene subset (P = 0.055, Fisher's exact test). Similarly, the X chromosome appears to be enriched for genes with lineage-specific accelerations in rate of evolution specifically in both D. sechellia and D. simulans lineages (P < 0.038, both comparisons, Fisher's exact test); we see a similar clade-specific trend in the melanogaster species complex (P = 0.0026, Fisher's exact test). In D. yakuba, however, there is a dearth of X-linked genes with lineage-specific accelerations in evolutionary rate (P = 0.023, Fisher's exact test).
Thus, when our sample is enriched for genes evolving in a manner that is consistent with either positive selection across the phylogeny or lineage-specific increases in evolutionary rate, we do see a weak signal of increased efficacy of positive selection on the X chromosome, particularly in the D. melanogaster species complex. This is evidenced not only by differences in the distributions of ω between the X and the autosomes but also by the genomic distribution of rapidly and/or adaptively evolving genes.
Purifying Selection
The hemizygosity of the X chromosome in males can lead to increased efficacy of selection against deleterious alleles, or more efficient purifying selection, for novel mutations that are at least partially recessive. Theoretical studies suggest that under a broad range of conditions, the rate of fixation of deleterious alleles should be reduced on the X chromosome (Charlesworth et al. 1987) (table 1).
Increased Efficacy of Purifying Selection on the X
We examined potential differences in the efficacy of purifying selection among chromosomes by comparing levels of codon bias among orthologous genes aligned in all 12 Drosophila species whose genomes have been fully sequenced. Codon bias is the unequal usage of synonymous codons in protein-coding sequences, and it is thought to be a consequence of selection-mutation-drift balance (Sharp and Li 1986; Bulmer 1991; Akashi 1997; McVean and Charlesworth 1999). Assuming that there is an optimal or preferred codon for each amino acid, corresponding to the most abundant tRNA, or the least error-prone amino-acyl tRNA charging reaction, then mutations away from these preferred codons might be weakly deleterious.
There are several reasons to believe that codon bias is maintained by purifying selection. First, codon preferences appear almost entirely conserved across the Drosophila phylogeny (Vicario, Moriyama, and Powell 2007; Drosophila 12 Genomes Consortium 2007), suggesting that codon bias levels are likely near or at equilibrium in natural populations. Consistent with this idea, correlations between species in FOP, one commonly employed metric of codon bias, are > 0.91 within the D. melanogaster subgroup (data not shown). Secondly, given that all preferred codons in Drosophila are G or C ending (Akashi 1995) coupled with the inferred mutational pressure toward A and T in this system (Petrov and Hartl 1999; Singh et al. 2006), most novel mutations at synonymous sites will be away from preferred codons. We thus believe that it is likely that codon bias is maintained by purifying selection rather than being driven by positive selection and suggest that codon bias may serve as an appropriate proxy for the efficacy of purifying natural selection.
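FOP, the frequency of optimal codons, is the fraction of a gene's codons that are the preferred codon for their amino acid, counted only over amino acids that have synonymous alternatives (methionine, tryptophan, and stop codons are excluded). The sketch below shows one way to compute it under the standard genetic code. The function name is hypothetical, and the small set of "optimal" codons is a placeholder for illustration, not the preferred-codon set used for Drosophila in this study.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
# Amino acids encoded by more than one codon (excludes Met, Trp, and stops).
DEGENERATE_AAS = {
    aa for aa in set(AMINO_ACIDS) if aa != "*" and AMINO_ACIDS.count(aa) > 1
}


def fop(cds, optimal_codons):
    """Frequency of optimal codons in an in-frame coding sequence."""
    cds = cds.upper()
    n_optimal = 0
    n_counted = 0
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        aa = CODON_TABLE.get(codon)  # None for codons with ambiguous bases
        if aa is None or aa not in DEGENERATE_AAS:
            continue
        n_counted += 1
        if codon in optimal_codons:
            n_optimal += 1
    return n_optimal / n_counted if n_counted else float("nan")


# Placeholder optimal codons (assumption for illustration only).
toy_optimal = {"CTG", "GCC", "AAG"}
toy_cds = "ATGCTGCTAGCCAAGTGGTAA"  # Met Leu Leu Ala Lys Trp stop
print(fop(toy_cds, toy_optimal))
```

Per-gene FOP values computed this way can then be compared between X-linked and autosomal genes with a rank-based test.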
To assess whether codon bias was an appropriate metric for indicating the efficacy of purifying selection, we examined codon bias of genes on the dot chromosome. This chromosome is thought to undergo little if any recombination and should experience markedly less effective purifying selection, which may be manifested as reduced codon bias. With the exception of D. willistoni, genes on the dot chromosome have significantly lower codon bias than genes on every other chromosome arm (adjusted P << 0.0001, all comparisons, Mann–Whitney U test). In D. willistoni, Muller elements E and F have fused, and as a consequence, the F element in this species may behave differently from the dot chromosome of other species. However, even in D. willistoni, codon bias of genes on Muller element F is significantly lower than codon bias of genes on all other elements (adjusted P < 0.001, all comparisons, Mann–Whitney U test) but element B. These data thus suggest that codon bias is a sensitive metric for evaluating the efficacy of purifying selection.
To compare the efficacy of purifying selection on the X and the autosomes, we compared levels of codon bias of X-linked and autosomal genes. Previous reports suggest that codon bias of X-linked genes is significantly higher than codon bias of autosomal genes in D. melanogaster (Comeron et al. 1999; Hambuch and Parsch 2005; Singh et al. 2005) and D. pseudoobscura (Singh et al. 2005), which is consistent with a greater efficacy of purifying selection on the X. Given the newly available comparative genomic data for several additional Drosophila species, we tested whether the increase in codon bias associated with X-linkage was evident across the Drosophila phylogeny.
For all 12 species, estimates of FOP are significantly higher for genes on the X chromosome than for genes on the pooled autosomal chromosomes (P << 0.0001, all comparisons, Mann–Whitney U test) (fig. 1D, table 3, supplementary fig. 1, Supplementary Material online). In addition, FOP in genes on the neo-X chromosome is significantly higher than FOP in genes on the autosomes for D. pseudoobscura and D. persimilis (P < 0.0002, both comparisons, Mann–Whitney U test). Moreover, FOP for genes on the ancestral X chromosome is significantly higher than FOP in genes on the neo-X chromosome in D. willistoni, D. persimilis, and D. pseudoobscura (P < 0.0001, all comparisons, Mann–Whitney U test).
Comparing FOP of X-linked genes versus genes on individual Muller elements yields similar results. In all species but D. willistoni, codon bias of ancestrally X-linked genes is significantly higher than codon bias of genes on every individual autosomal chromosome arm (adjusted P < 0.0002, all comparisons, Mann–Whitney U test). In D. willistoni, codon bias on the ancestral X chromosome is significantly higher than codon bias of genes on all other elements (adjusted P < 0.0005, all comparisons, Mann–Whitney U test) except for Muller element E. Therefore, it appears as though the elevated codon bias associated with X-linkage is evident in all Drosophila species sequenced to date; this may reflect an increased efficacy of purifying selection on the X chromosome.
However, because of the indirectness of the evidence, it is important to explore other possible forces that may generate the elevated codon bias on the X. If codon bias were driven by positive selection, the increase in codon bias associated with X-linkage could reflect the increased efficacy of positive selection on the X chromosome. We find this explanation to be unlikely, given the contrast between the consistency of the codon bias pattern across the phylogeny and the marked inconsistency in X versus autosomal rates of protein evolution. Alternatively, the increased codon bias of X-linked genes could result from altered selection pressures on X-linked genes given the dosage problem, as has been suggested previously (Singh et al. 2005). If the dosage problem can be remediated in part at the level of translation in addition to the level of transcription, then this too could contribute to the observed pattern in codon bias in Drosophila.
| Conclusions and Future Directions |
|---|
The complete sequencing of 12 Drosophila genomes facilitates the investigation of potential differences in the efficacy of natural selection using rates of evolution between the X and the autosomes in ways that were not possible before. Beyond expanding the scale of previous analyses by testing for interchromosomal heterogeneities in the efficacy of selection in a larger number of species, these data also facilitate investigating patterns of evolution that are unique to subsets of the Drosophila phylogeny. In addition, we have combined both paired and unpaired approaches to test specifically for greater efficacy of positive selection on the X chromosome. Finally, the richness of the phylogeny allows for the identification of genes evolving adaptively across species, as well as genes with lineage-specific accelerations in evolutionary rate, which facilitates examining the efficacy of positive selection within rapidly evolving gene subsets.
Our results suggest a consistent elevation in the efficacy of purifying selection on the X chromosome compared with the autosomes. In contrast, an elevated efficacy of positive selection on the X chromosome is detected in only some species. Although some species do show evidence in support of such a model, the results are highly sensitive to the metric employed. The lack of a consistently detectable effect across the Drosophila phylogeny may indicate that adaptive evolution from new mutations is not the dominant force that modulates evolutionary rate in these species.
There are 2 main hypotheses to explain this observation. One hypothesis is that sequence evolution may be governed primarily by selective constraint, which would reduce rates of evolution on the X chromosome; the balance between the relative rates and strengths of positive versus purifying selection is likely to ultimately determine overall rates of coding sequence evolution between the X and the autosomes. It is not obvious that this balance will necessarily be the same among species, which could explain why some species show evidence in support of an increased efficacy of selection on the X chromosome and other species do not. Alternatively, it may be that positive selection does indeed play a large role in the evolution of protein-coding sequences in Drosophila, as has been suggested previously (Sawyer et al. 2003, 2007; Bierne and Eyre-Walker 2004; Welch 2006; Drosophila 12 Genomes Consortium 2007). If this is the case, then the absence of a consistently measurable increase in the efficacy of positive selection on the X chromosome may reflect a violation of one or more of the underlying assumptions of this model. Namely, if on average positive selection does not act on novel mutations, if the selective effects of beneficial mutations are different in males versus females, or if these mutations are not at least partially recessive, then the theoretical predictions for increased rates of adaptive evolution on the X chromosome break down.
The observed differences in X-linked and autosomal substitution rates among species may also reflect the varied evolutionary histories across taxa, given interspecific differences in demography and life-history characteristics. Certainly, the effective population sizes are likely to vary across the phylogeny given that this genus includes both island endemics and cosmopolitan species. Little is known about the effective population sizes of most of the 12 species, but protein polymorphism data suggest that D. sechellia has a lower effective population size than D. simulans and D. melanogaster, and nucleotide polymorphism data suggest that D. melanogaster has a smaller effective population size than D. simulans (e.g., Moriyama and Powell 1996). In addition, these results may be affected by the types of genes residing on the X chromosome versus the autosomes, and the variation among species with respect to relative rates of evolution of X-linked and autosomal genes may reflect interspecific differences in X versus autosomal gene complements. Although the rate of interchromosomal movement does appear to be quite low in Drosophila (Bhutkar et al. 2007; Ranz et al. 2001; Richards et al. 2005), the evolutionary depth of the phylogeny under study may be sufficiently great that species could differ in their distributions of different functional classifications of genes. Moreover, expression patterns can diverge rapidly between species as well, particularly for male-biased genes (Meiklejohn et al. 2003), which could also alter the gene complements of the X and the autosomes in a lineage-specific manner. Finally, recombination rates appear comparatively labile across the Drosophila phylogeny. Crossover frequencies differ within the D. melanogaster species complex (True et al. 1996), and there are suggestions that at least some regions in D. pseudoobscura and D. simulans may differ in their recombinational landscape relative to D. melanogaster (Hamblin and Aquadro 1996, 1999), possibly associated with inversion polymorphism. Moreover, D. pseudoobscura appears to have higher rates of recombination than its sister species D. persimilis, though both of these species appear to have higher recombination rates than D. melanogaster. This raises the possibility that there may exist lineage-specific changes in the recombination environment, which may further contribute to interspecific differences in rates and patterns of evolution between the X and the autosomes.
Based on our results, we can make several inferences regarding the molecular evolutionary process in Drosophila. First, given that the efficacy of purifying selection on synonymous sites appears to be significantly higher on the X chromosome than on the autosomes across species, this suggests that deleterious synonymous mutations are partially recessive on average in Drosophila and also indicates that purifying selection at synonymous sites acts predominantly on novel mutations. Despite the lack of strong signal of efficacious positive selection on the X chromosome, the analysis of pairs of orthologs and the analysis of the genomic distribution of putative positively selected genes hints at the possibility that at least some fraction of beneficial mutations is recessive and that positive selection can operate on novel allelic variants.
On balance, we believe, these results are consistent with an increased efficacy of positive selection on the X chromosome but are indicative of the great complexity of the molecular evolutionary process. We suggest that although positive selection does contribute to rates and patterns of evolution, rates of adaptive evolution from novel mutations are not sufficiently high to overwhelm all of the potentially contributing factors and systematically inflate rates of evolution on the X chromosome.
This analysis has identified outstanding questions that we hope to investigate further in the future. Namely, the observation of reduced divergence at 4-fold degenerate synonymous sites on the X chromosome of some species but not others may be suggestive of interspecific variation in sex-specific mutation rates and/or differences among species in life-history characteristics. Because our inferences of selection are confounded by these underlying processes, more sophisticated models are required to fully explain the lack of a consistent increase in the efficacy of positive selection on the X chromosome in Drosophila. Further work will be required to understand how the degree of variation in substitution rates between the X and the autosomes among species is affected by differences in lineage-specific patterns of positive selection, mutation rates, and demography.
| Supplementary Material |
|---|
|
|
|---|
Supplementary figures are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
|
| Acknowledgements |
|---|
|
|
|---|
The authors thank C. F. Aquadro for helpful comments on this manuscript and are grateful to the Associate Editor and 3 anonymous reviewers for thoughtful suggestions. This work was supported in part by an NIH NRSA (grant number 1F32GM080944-01 to N.D.S., C.F.A., and A.G.C.).
The authors also thank T. Sackton for running the branch models in PAML.
| Footnotes |
|---|
1 These authors contributed equally to this work.
Hope Hollocher, Associate Editor
| References |
|---|
|
|
|---|
Akashi H. Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics (1994) 136:927–935.[Abstract]
Akashi H. Inferring weak selection from patterns of polymorphism and divergence at "silent" sites in Drosophila DNA. Genetics (1995) 139:1067–1076.[Abstract]
Akashi H. Codon bias evolution in Drosophila. Population genetics of mutation-selection drift. Gene (Amsterdam) (1997) 205:269–278.
Avery PJ. The population genetics of haplo-diploids and X-linked genes. Genet Res (1984) 44:321–341.[Web of Science]
Bauer DuMont V, Aquadro CF. Rates of DNA sequence evolution are not sex-biased in Drosophila melanogaster and D. simulans. Mol Biol Evol (1997) 14:1252–1257.[Abstract]
Begun DJ, Whitley P. Reduced X-linked nucleotide polymorphism in Drosophila simulans. Proc Natl Acad Sci USA (2000) 97:5960–5965.
Betancourt AJ, Kim Y, Orr HA. A pseudohitchhiking model of X vs. autosomal diversity. Genetics (2004) 168:2261–2269.
Betancourt AJ, Presgraves DC, Swanson WJ. A test for faster X evolution in Drosophila. Mol Biol Evol (2002) 19:1816–1819.
Bhutkar A, Russo S, Smith TF, Gelbart WM. Techniques for multi-genome synteny analysis to overcome assembly limitations. Genome Inform (2006) 17:152–161.
Bhutkar AV, Russo S, Smith TF, Gelbart WM. Genome-scale analysis of positionally relocated genes. Genome Res (2007) 17:1880–1889.
Bierne N, Eyre Walker AC. The genomic rate of adaptive amino acid substitution in Drosophila. Mol Biol Evol (2004) 21:1350–1360.
Bulmer M. The selection-mutation-drift theory of synonymous codon usage. Genetics (1991) 129:897–908.[Abstract]
Charlesworth B, Coyne JA, Barton NH. The relative rates of evolution of sex chromosomes and autosomes. Am Nat (1987) 130:113–146.[CrossRef][Web of Science]
Chintapalli VR, Wang J, Dow JAT. Using FlyAtlas to identify better Drosophila melanogaster models of human disease. Nat Genet (2007) 39:715–720.[CrossRef][Web of Science][Medline]
Comeron JM, Kreitman M, Aguade M. Natural selection on synonymous sites is correlated with gene length and recombination in Drosophila. Genetics (1999) 151:239–249.
Counterman BA, Ortiz-Barrientos C, Noor MAF. Using comparative genomic data to test for fast-X evolution. Evolution (2004) 58:656–660.[CrossRef][Web of Science][Medline]
Drosophila 12 Genomes Consortium et al. 128 co-authors. Evolution of genes and genomes on the Drosophila phylogeny. Nature (2007) 450:203–218.[CrossRef][Medline]
Drost JB, Lee WR. Biological basis of germline mutation: comparisons of spontaneous germline mutation rates among Drosophila, mouse and human. Environ Mol Mutag (1995) 25:48–64.[CrossRef][Web of Science][Medline]
Garcia-Dorado A, Lopez-Fanjul C, Caballero A. Rates and effects of deleterious mutations and their evolutionary consequences. In: Evolution of molecules and ecosystems—Moya A, Font E, eds. (2004) Oxford: Oxford University Press. 20–32.
Haerty W, Jagadeeshan WS, Kulathinal RJ, et al. Evolution in the fast lane: rapidly evolving sex-and reproduction-related genes in Drosophila species. Genetics (2007) 177:1321–1335.
Hamblin MT, Aquadro CF. High nucleotide sequence variation in a region of low recombination in Drosophila simulans is consistent with the background selection model. Mol Biol Evol (1996) 13:1133–1140.[Abstract]
Hamblin MT, Aquadro CF. DNA sequence variation and the recombinational landscape in Drosophila pseudoobscura: a study of the second chromosome. Genetics (1999) 153:859–869.
Hambuch TM, Parsch J. Patterns of synonymous codon usage in Drosophila melanogaster genes with sex-biased expression. Genetics (2005) 170:1691–1700.
Hartl DL, Clark AG. Principles of population genetics (2007) 4th ed. Sunderland (MA): Sinauer Associates, Inc.
Holm S. A simple sequentially rejective Bonferroni test procedure. Scandinavian J Stat (1979) 6:65–70.
Hutter S, Li H, Beisswanger S, De Lorenzo D, Stephan W. Distinctly different sex ratios in African and European populations of Drosophila melanogaster inferred from chromosomewide single nucleotide polymorphism data. Genetics (2007) 177:469–480.
Kimura M. The neutral theory of molecular evolution (1983) Cambridge: Cambridge University Press.
Larracuente AM, Sackton TB, Greenberg A, Wong A, Singh ND, Sturgill D, Zhang Y, Oliver B, Clark AG. Protein-coding gene evolution in Drosophila. Trends Genet. Forthcoming.
Lu J, Wu C-I. Weak selection revealed by the whole-genome comparison of the X chromosome and autosomes of human and chimpanzee. Proc Natl Acad Sci USA (2005) 102:4063–4067.
Machado CA, Haselkorn TS, Noor MAF. Evaluation of the genomic extent of effects of fixed inversion differences on intraspecific variation and interspecific gene flow in Drosophila pseudoobscura and D. persimilis. Genetics (2007) 175:1289–1306.
McVean GAT, Charlesworth B. A population genetic model for the evolution of synonymous codon usage: patterns and predictions. Genet Res (1999) 74:145–158.[CrossRef][Web of Science]
Meiklejohn CD, Parsch J, Ranz JM, Hartl DL. Rapid evolution of male-biased gene expression in Drosophila. Proc Natl Acad Sci USA (2003) 100:9894–9899.
Moriyama EN, Powell JR. Intraspecific nuclear DNA variation in Drosophila. Mol Biol Evol (1996) 13:261–277.[Abstract]
Musters H, Huntley MA, Singh RS. A genomic comparison of faster-sex, faster-X, and faster-male evolution between Drosophila melanogaster and Drosophila pseudoobscura. J Mol Evol (2006) 62:693–700.[CrossRef][Web of Science][Medline]
Orr HA, Betancourt AJ. Haldane's sieve and adaptation from the standing genetic variation. Genetics (2001) 157:875–884.
Panhuis TM, Clark NL, Swanson WJ. Rapid evolution of reproductive proteins in abalone and Drosophila. Philos Trans R Soc Lond Ser B Biol Sci (2006) 361:261–268.
Parisi M, Nuttall R, Naiman D, Bouffard G, Malley J, Andrews J, Eastman S, Oliver B. Paucity of genes on the Drosophila X chromosome showing male-biased expression. Science (Washington DC) (2003) 299:697–700.
Petrov DA, Hartl DL. Patterns of nucleotide substitution in Drosophila and mammalian genomes. Proc Natl Acad Sci USA (1999) 96:1475–1479.
Pool JE, Nielsen R. Population size changes reshape relative levels of X chromosome and autosome diversity. Int J Org Evolution (2007) 61:3001–3006.
Powell JR, Moriyama EN. Evolution of codon usage bias in Drosophila. Proc Natl Acad Sci USA (1997) 94:7784–7790.
Ranz JM, Casals F, Ruiz A. How malleable is the eukaryotic genome? Extreme rate of chromosomal rearrangement in the genus Drosophila. Genome Res (2001) 11:230–239.
Ranz JM, Castillo-Davis CI, Meiklejohn CD, Hartl DL. Sex-dependent gene expression and evolution of the Drosophila transcriptome. Science (2003) 300:1742–1745.
Rice WR. Sex chromosomes and the evolution of sex dimorphism. Evolution (1984) 38:735–742.[CrossRef][Web of Science]
Richards S, Liu Y, Bettencourt BR, et al. (52 co-authors). Comparative genome sequencing of Drosophila pseudoobscura: chromosomal, gene, and cis-element evolution. Genome Res (2005) 15:1–18.
Sawyer SA, Kulathinal RJ, Bustamante CD, Hartl DL. Bayesian analysis suggests that most amino acid replacements in Drosophila are driven by positive selection. J Mol Evol (2003) 57:S154–S164.
Sawyer SA, Parsch J, Zhang Z, Hartl DL. Prevalence of positive selection among nearly neutral amino acid replacements in Drosophila. Proc Natl Acad Sci USA (2007) 104:6504–6510.
Schoefl G, Schloetterer C. Patterns of microsatellite variability among X chromosomes and autosomes indicate a high frequency of beneficial mutations in non-African D. simulans. Mol Biol Evol (2004) 21:1384–1390.
Sharp PM, Li WH. An evolutionary perspective on synonymous codon usage in unicellular organisms. J Mol Evol (1986) 24:28–38.[CrossRef][Web of Science][Medline]
Singh ND, Arndt PF, Petrov DA. Minor shift in background substitutional patterns in the Drosophila saltans and willistoni lineages is insufficient to explain GC content of coding sequences. BMC Biol (2006) 4. doi:10.1186/1741-7007-1184-1137.
Singh ND, Bauer DuMont VL, Hubisz MJ, Nielsen R, Aquadro CF. Patterns of mutation and selection at synonymous sites in Drosophila. Mol Biol Evol (2007) 24:2687–2697.
Singh ND, Davis JC, Petrov DA. X-linked genes evolve higher codon bias in Drosophila and Caenorhabditis. Genetics (2005) 171:145–155.
Singh ND, Macpherson JM, Jensen JD, Petrov DA. Similar levels of X-linked and autosomal nucleotide polymorphism in African and non-African strains of Drosophila melanogaster. BMC Evol Biol (2007) 7.
Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proc Natl Acad Sci USA (2003) 100:9440–9445.[CrossRef]
Swanson WJ, Clark AG, Waldrip-Dail HM, Wolfner MF, Aquadro CF. Evolutionary EST analysis identifies rapidly evolving male reproductive proteins in Drosophila. Proc Natl Acad Sci USA (2001) 98:7375–7379.
Swanson WJ, Vacquier VD. The rapid evolution of reproductive proteins. Nat Rev Genet (2002) 3:137–140.[Web of Science][Medline]
Thornton K, Bachtrog D, Andolfatto P. X chromosomes and autosomes evolve at similar rates in Drosophila: no evidence for faster-X protein evolution. Genome Res (2006) 16:498–504.
Thornton K, Long M. Rapid divergence of gene duplicates on the Drosophila melanogaster X chromosome. Mol Biol Evol (2002) 19:918–925.
True JR, Mercer JM, Laurie CC. Differences in crossover frequency and distribution among three sibling species of Drosophila. Genetics (1996) 142:507–523.[Abstract]
Vicario S, Moriyama EN, Powell JR. Codon usage in twelve species of Drosophila. BMC Evol Biol (2007) 7:226.[CrossRef][Medline]
Vicoso B, Charlesworth B. Evolution on the X chromosome: unusual patterns and processes. Nat Rev Genet (2006) 7:645–653.[CrossRef][Web of Science][Medline]
Wall JD, Andolfatto P, Przeworski M. Testing models of selection and demography in Drosophila simulans. Genetics (2002) 162:203–216.
Wang K, Kean L, Yang J, Allan AK, Davies SA, Herzyk P, Dow JAT. Function-informed transcriptome analysis of Drosophila renal tubule. Genome Biol (2004) 5:R69.[CrossRef][Medline]
Welch JJ. Estimating the genomewide rate of adaptive protein evolution in Drosophila. Genetics (2006) 173:821–837.
Wong A, Jensen JD, Pool JE, Aquadro CF. Phylogenetic incongruence in the Drosophila melanogaster species group. Mol Phylogen Evol (2007) 43:1138–1150.[CrossRef][Web of Science][Medline]
Yanai I, Benjamin H, Shmoish M, et al. Genome-wide midrange transcription profiles reveal expression level relationships in human tissue specification. Bioinformatics (2005) 21:650–659.
Yang Z. Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol Biol Evol (1998) 15:568–573.[Abstract]