MBE Advance Access originally published online on October 13, 2006
Molecular Biology and Evolution 2007 24(1):228-235; doi:10.1093/molbev/msl146
Research Articles |
Maximum Likelihood Estimation of Ancestral Codon Usage Bias Parameters in Drosophila




* Institute of Biology and Centre for Bioinformatics, University of Copenhagen, Copenhagen, Denmark
Department of Biological Statistics and Computational Biology
Department of Molecular Biology and Genetics, Cornell University
E-mail: rasmus{at}binf.ku.dk.
| Abstract |
|---|
We present a likelihood method for estimating codon usage bias parameters along the lineages of a phylogeny. The method is an extension of the classical codon-based models used for estimating dN/dS ratios along the lineages of a phylogeny. However, we add one extra parameter for each lineage: the selection coefficient for optimal codon usage (S), allowing joint maximum likelihood estimation of S and the dN/dS ratio. We apply the method to previously published data from Drosophila melanogaster, Drosophila simulans, and Drosophila yakuba and show, in accordance with previous results, that the D. melanogaster lineage has experienced a reduction in the selection for optimal codon usage. However, the D. melanogaster lineage has also experienced a change in the biological mutation rates relative to D. simulans, in particular, a relative reduction in the mutation rate from A to G and an increase in the mutation rate from C to T. However, neither a reduction in the strength of selection nor a change in the mutational pattern can alone explain all of the data observed in the D. melanogaster lineage. For example, we also confirm previous results showing that the Notch locus has experienced positive selection for previously classified unpreferred mutations.
Key Words: codon usage bias; codon usage; maximum likelihood; codon-based models; selection
| Introduction |
|---|
The existence of codon bias, the nonrandom use of synonymous codons, is well documented in Drosophila (e.g., Shields et al. 1988
Although codon bias appears to some degree in a number of Drosophila species, a difference in the magnitude of the bias has been observed at the level of individual loci and across genomes (e.g., Munté et al. 1997, 2001; Kliman 1999). For instance, a genome-wide reduction of codon bias has been observed in D. melanogaster relative to Drosophila simulans (Akashi 1996). Many more unpreferred mutations (i.e., those from a preferred codon to an unpreferred codon) have fixed along the D. melanogaster lineage. Sixty to seventy percent of synonymous changes between these species that differ between unpreferred and preferred codons have the unpreferred codon in D. melanogaster. In addition, when rooted with an outgroup, many loci show a significant relative rate test at synonymous sites due in large part to the fixation of many unpreferred synonymous codons along the D. melanogaster lineage (Akashi 1996; Bauer DuMont et al. 2004). Relaxation of constraint, presumably due to a reduction in population size in D. melanogaster, has been the favored explanation for these observations.
However, although relaxation of constraint can explain most of the differences between D. melanogaster and D. simulans in synonymous codon usage, Bauer DuMont et al. (2004) found evidence for the role of positive selection in the fixation of unpreferred mutations at the Notch locus in D. melanogaster. This was done by devising a counting method similar to the common dN/dS estimation methods (e.g., Nei and Gojobori 1986). If the change in codon usage between these species at Notch was completely governed by relaxation of constraint, then the number of preferred fixations per site and the number of unpreferred fixations per site should be equal. However, they observed significantly more unpreferred fixations per site than preferred fixations per site in D. melanogaster. One objective of this paper is to formalize the inference procedure of Bauer DuMont et al. (2004) in a codon-based likelihood framework.
The first direct estimators of selection coefficients affecting synonymous sites were obtained by McVean and Vieira (2001). They compared DNA data from D. melanogaster and D. simulans and from D. melanogaster and Drosophila virilis using a Markov chain Monte Carlo method to estimate the strength of codon usage bias in these organisms. This study was a vast improvement over previous studies in that it allowed direct estimation of parameters relating to both mutation and selection. McVean and Vieira (2001) concluded that D. melanogaster shows no evidence of positive selection, whereas D. simulans experiences only half the selection pressure for codon usage of their common ancestor.
The analysis by McVean and Vieira (2001) was important in establishing appropriate models for statistical inferences of patterns and evolution of codon usage bias. However, by only using pairs of species, very little power is retained to make inferences regarding the pattern and evolution of codon usage bias in each of the 2 ancestral phylogenetic lineages. Conventional wisdom in the field would argue that an outgroup is needed to make lineage-specific inferences. From a statistical standpoint, it may be argued that lineage-specific inferences in the absence of an outgroup are not desirable because they may suffer from low power or be very model dependent, as most inferences will rely on the nonreversible aspects of the model. In addition, McVean and Vieira (2001) used a model that allowed selection, but not mutation, to vary among lineages.
Here, we present codon-based likelihood models akin to the models of McVean and Vieira (2001) but applicable to more than 2 species. To reduce the number of parameters, our models assume that the same strength of codon bias is acting in all amino acids. However, in contrast to McVean and Vieira (2001), we allow for different mutational processes among evolutionary lineages, and we also use a more complex mutation model. Our implementation allows for standard numerical optimization methods (McVean and Vieira [2001] used a stochastic optimization algorithm), and it takes the full complexities of the genetic code into account, avoiding the possible confounding effects of nonsynonymous substitutions. We apply the method to previously published data from Drosophila yakuba, D. simulans, and D. melanogaster.
| Materials and Methods |
|---|
Likelihood Models
The model we will develop is an extension of the codon-based likelihood models by Goldman and Yang (1994). The model is specified in terms of transition rates from state i to state j, qij, where the states i and j are codons. Any existing Markov model with rates qij can be modified to incorporate codon usage bias, forming a new Markov chain with rates q*ij, by a consideration of the underlying population genetics.
If the selection coefficient acting on a mutation from an unpreferred to a preferred codon is s, the probability of fixation of such a mutant is (Kimura 1962)
[Equation 1]
[Equation 2]
[Equation 3]
If the process defined by qij is time reversible with stationary distribution πi, then the process defined by q*ij is time reversible with stationary distribution
[Equation 4]
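The construction described above, in which a baseline rate is rescaled by the relative fixation probability of the mutation, can be illustrated numerically. The sketch below assumes the standard weak-selection form of Kimura's result: a baseline rate is multiplied by S/(1 - exp(-S)) for unpreferred-to-preferred changes and by -S/(1 - exp(S)) for the reverse, where S is a population-scaled selection coefficient. The exact scaling of S, the helper names, and the two-codon toy chain are assumptions made for illustration and are not taken from the text.

```python
import math

def fixation_factor(S):
    """Fixation rate of a mutation with scaled selection coefficient S,
    relative to a neutral mutation: S / (1 - exp(-S)); tends to 1 as S -> 0."""
    if abs(S) < 1e-9:
        return 1.0
    return S / (1.0 - math.exp(-S))

def modified_rate(q_ij, i_preferred, j_preferred, S):
    """Rescale a baseline rate q_ij according to the direction of the change."""
    if j_preferred and not i_preferred:   # unpreferred -> preferred (favoured)
        return q_ij * fixation_factor(S)
    if i_preferred and not j_preferred:   # preferred -> unpreferred (disfavoured)
        return q_ij * fixation_factor(-S)
    return q_ij                           # no change in preference class

# Toy chain on two synonymous codons: U (unpreferred) and P (preferred).
q_UP, q_PU = 2.0, 1.0   # hypothetical baseline rates
S = 1.17                # illustrative value (the Ss estimate quoted in the Results)

q_star_UP = modified_rate(q_UP, False, True, S)
q_star_PU = modified_rate(q_PU, True, False, S)

# In a two-state chain the stationary odds P:U equal (rate into P) / (rate out of P).
baseline_odds = q_UP / q_PU
modified_odds = q_star_UP / q_star_PU
print(modified_odds / baseline_odds, math.exp(S))  # the two numbers agree
```

Under these assumptions the stationary odds in favor of the preferred codon are multiplied by exp(S), so a reversible baseline stays reversible with each preferred codon's frequency reweighted by that factor before renormalization.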
When time reversibility is not assumed, the stationary codon frequencies can be obtained using standard Markov chain methods, for example, by solving a system of 61 linear equations or as a byproduct of the eigenvector decomposition often used in the calculation of transition probabilities. In the following, we will assume that qij is given by the nucleotide mutation process, modified to take selection at the amino acid level into account. Notice that selection for optimal codon usage will also act on amino acid changes. We represent codon i as a triplet i1i2i3 and codon j as j1j2j3 (i1, i2, i3, j1, j2, j3 ∈ {T, C, A, G}). If codons i and j differ by exactly one nucleotide substitution in position k, then

$$
q_{ij} =
\begin{cases}
\mu_{i_k j_k} & \text{if the substitution is synonymous,} \\
\omega\,\mu_{i_k j_k} & \text{if the substitution is nonsynonymous,}
\end{cases}
\qquad (5)
$$

where μxy is the mutation rate from nucleotide x to nucleotide y (x, y ∈ {T, C, A, G}), and ω is the rate ratio of nonsynonymous to synonymous substitutions. The parameters of the model are then ω, S, and μ = {μAT, μAC, μAG, μTA, ..., μGA}, in addition to parameters related to the phylogenetic tree such as the relative branch lengths. Additionally, it is possible to allow ω, S, or μ to vary among branches in the phylogenetic tree. The full model (FM) is here defined as a model in which ω, S, and μ vary independently among branches, with the exception of the 2 branches around the root in the (binary) tree. In these 2 branches, the values of ω, S, and μ are assumed to be identical. The total number of parameters of the FM is then 42 for 3 species.
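As a rough illustration of how a baseline codon rate of this kind can be assembled from nucleotide mutation rates and a nonsynonymous/synonymous rate ratio, the sketch below builds the standard genetic code and evaluates the rate for a pair of codons. The function name `codon_rate`, the dictionary layout of the mutation rates, the numerical values, and the choice to return zero for codon pairs that differ at more than one position are assumptions of this sketch, not details given in the text.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated with T, C, A, G at each position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}
SENSE_CODONS = [codon for codon, aa in CODE.items() if aa != "*"]

def codon_rate(i, j, mu, omega):
    """Baseline rate between sense codons i and j.

    mu maps (from_base, to_base) to a mutation rate; omega scales
    nonsynonymous changes. Stop codons and multi-nucleotide changes get
    rate 0, which is an assumption of this sketch.
    """
    if CODE[i] == "*" or CODE[j] == "*":
        return 0.0
    diffs = [k for k in range(3) if i[k] != j[k]]
    if len(diffs) != 1:
        return 0.0
    k = diffs[0]
    rate = mu[(i[k], j[k])]
    return rate if CODE[i] == CODE[j] else omega * rate

# Hypothetical mutation rates: all equal except T -> C.
mu = {(x, y): 1.0 for x in BASES for y in BASES if x != y}
mu[("T", "C")] = 2.0
omega = 0.15  # illustrative value (the D. melanogaster estimate quoted in the Results)

print(len(SENSE_CODONS))                    # number of sense codons
print(codon_rate("CTT", "CTC", mu, omega))  # Leu -> Leu, synonymous
print(codon_rate("CTT", "CCT", mu, omega))  # Leu -> Pro, nonsynonymous
```

The count of sense codons matches the 61 states mentioned above, and the 12 entries of `mu` correspond to the 12 directed nucleotide changes.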
Superimposing this stochastic process on a phylogenetic tree, sampling probabilities of the data can be calculated, and parameters can be estimated using maximum likelihood and numerical optimization (see, e.g., Felsenstein 1981; Goldman and Yang 1994). In this way, it is possible to estimate lineage-specific parameters of the model, such as parameters pertaining to codon usage bias. Because the most general model we use is not time reversible, estimation is performed on a rooted tree in which D. yakuba is treated as an outgroup. Although the placement of the root along the D. yakuba lineage in principle could be estimated from the data, we fixed it using a molecular clock assumption between D. yakuba and D. simulans, and we assumed that the mutational process was identical on the 2 D. yakuba lineages to reduce the number of free parameters. All the parameter estimates are very robust to the placement of the root, except for the mutational matrix in the D. yakuba lineage itself (under assumptions of non–time reversibility). For example, when D. yakuba is (erroneously) assumed to be the direct ancestor of D. melanogaster and D. simulans, parameter estimates of the mutation rates, ω, and S on the lineages leading to D. melanogaster and D. simulans differ from the estimates based on the molecular clock assumption by less than 1%. The estimates of selection coefficients and ω on the D. yakuba lineage itself change by only 1–2%.
In addition to analyses where μ is considered a parameter, we also perform analyses using the previously published estimates of μ by Petrov and Hartl (1999). Petrov and Hartl (1999) used observed substitutions in "dead on arrival" transposable elements (considered to be pseudogenes) that were located throughout the genome to infer the underlying mutational pattern in Drosophila. We also analyze a symmetric model without strand biases, that is, where the mutation rates between complementary nucleotides are identical.
We concatenated the sequences from all the loci and estimated parameters under the full model. In a subsequent gene-by-gene analysis, we used mutation parameters from the concatenated data to reduce the number of parameters. Branch lengths and values of ω and S were then estimated for each of the 3 lineages.
Notice that we have so far assumed that the strength of the codon usage bias is the same in all amino acids. Obviously, this may not be a realistic assumption and can be relaxed if sufficiently large data sets are available to allow the estimation of additional parameters.
Data
The method was applied to loci for which polymorphism data were available in both D. melanogaster and D. simulans, and polymorphic codons were removed from the analysis, thereby reducing the chance of confounding polymorphisms with fixed substitutions. The maximum likelihood estimates are then only based on (apparently) fixed differences, except where otherwise noted. Eighteen loci were identified from the literature to fit our criteria. A list of the loci and the GenBank accession numbers used can be found in Appendix 1. For 9 of these loci, polymorphism data from intronic regions were also available. We followed Akashi (1996) in our classification of which codons are optimal. However, the analysis was also repeated assuming only one codon can be optimal for each amino acid. We also present results from the analyses of intronic sequences from 9 of the loci.
Population Genetic Analyses
To further elucidate the role of selection, we construct a test based on comparing variability within and between species at different types of mutations, akin to the test of McDonald and Kreitman (1991). We construct a 2-by-2 contingency table of unpreferred/preferred polymorphisms and fixed mutations. A test of homogeneity is then performed using a G-test.
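A G-test of homogeneity on a 2-by-2 table can be sketched as follows. The function name and the counts are hypothetical and serve only to show the arithmetic. The sketch applies no continuity correction and obtains the P value from the chi-square distribution with 1 degree of freedom, which may differ in detail from the procedure actually used.

```python
import math

def g_test_2x2(table):
    """G-test of homogeneity for a 2x2 table of counts; returns (G, p).

    G = 2 * sum(observed * ln(observed / expected)), and p is the upper
    tail of the chi-square distribution with 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    g = 0.0
    for r, row in enumerate(table):
        for col, observed in enumerate(row):
            expected = row_totals[r] * col_totals[col] / n
            if observed > 0:  # a zero cell contributes nothing to the sum
                g += 2.0 * observed * math.log(observed / expected)
    p = math.erfc(math.sqrt(max(g, 0.0) / 2.0))  # chi-square (1 df) upper tail
    return g, p

# Hypothetical counts: rows = unpreferred, preferred; columns = polymorphic, fixed.
g, p = g_test_2x2([[40, 20], [10, 30]])
print(round(g, 2), p)
```

A table whose rows have the same polymorphic-to-fixed ratio gives G near zero and a P value near 1, which is the pattern expected under homogeneity.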
| Results |
|---|
Estimation of ω and ancestral codon usage bias
We first estimated parameters for the 3 species by concatenating the sequences, assuming that D. yakuba is the outgroup and that the ancestor of these species was at mutation–selection–drift equilibrium at the time of speciation. The results of this analysis are given in table 1.
Estimates of ω are highest in D. simulans (0.19) and smaller in D. melanogaster (0.15) and D. yakuba (0.14). In contrast, the estimates of S are similar in D. yakuba (Sy = 1.11) and D. simulans (Ss = 1.17) but close to zero (Sm = 0.13) for D. melanogaster. Assuming a constant mutation rate, the relatively small value of ω on the D. melanogaster lineage appears primarily to be caused by an increased synonymous substitution rate on this lineage. The expected number of synonymous substitutions per codon on the D. melanogaster lineage is 35% larger than on the D. simulans lineage. For all 3 species, we tested the hypothesis of no selection for optimal codon usage using a likelihood ratio test (table 2). In the case of D. simulans and D. yakuba, we rejected the null hypothesis of S = 0 with strong statistical significance. There is much more power to reject this hypothesis in D. yakuba than in the other lineages, predominantly because the root is located on this lineage. In the case of D. melanogaster, we could not reject the hypothesis of S = 0. This conclusion also holds up under other assumptions regarding the mutational model, for example, if we assume that the mutation process is identical in D. melanogaster and D. yakuba or if we assume that the mutation rates are identical in D. simulans and D. yakuba. Likewise, this result also holds up if we assume that only one codon is optimal for each amino acid (not shown).
McVean and Vieira (2001)
When the analysis is performed including polymorphisms, the estimate of Ss is reduced from 1.17 to 0.82. This is expected if mutations from preferred to unpreferred codons are slightly deleterious because the polymorphism data will contain an excess of such mutations, and they will erroneously be considered fixed differences. This further confirms the existence of codon usage bias in the D. simulans lineage and illustrates the confounding effects of including segregating polymorphism when estimating selection parameters. However, the estimate of Sm changes only from 0.13 to 0.15, again suggesting that the D. melanogaster lineage is either largely unaffected by selection for optimal codon usage or affected by oppositely directed selection in different genes (as will be detailed below).
Estimation of mutation parameters
The estimates of mutation rate parameters are quite different between the D. melanogaster and the D. simulans lineages (table 3). In particular, there appears to be a reduction in the A to G and T to C mutation rates and an increase in the C to T mutation rate in D. melanogaster compared with D. simulans. We do not list results for the D. yakuba lineage because these results are sensitive to the placement of the root in a model that is not time reversible. A likelihood ratio test rejects the hypothesis of equal mutation rates in D. yakuba and D. melanogaster with strong statistical significance (table 2). It also rejects the hypothesis of equal mutation rates in D. yakuba and D. simulans. The Akaike information criterion (AIC) scores for the models assuming μm = μy and μs = μy are –35840.3 and –35826.5, respectively, suggesting that the primary cause of the difference between the mutational processes on the 2 lineages is a change in the mutational process on the D. melanogaster lineage.
Assuming strand symmetry, we would expect that the rate of mutation from nucleotide i to j, μij, equals the rate of mutation from v to k, μvk, if i and v, and j and k form Watson–Crick pairs. We tested this hypothesis using a likelihood ratio test (table 2) and could reject the hypothesis of strand symmetry with strong statistical confidence. This is in agreement with recent studies by Singh et al. (2005)
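The strand-symmetry constraint can be made concrete by listing which of the 12 directed nucleotide changes are forced to share a rate. The sketch below is illustrative only; the function name and data layout are assumptions.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def strand_symmetric_classes():
    """Group the 12 directed nucleotide changes into pairs that share one
    rate when x -> y is constrained to equal complement(x) -> complement(y)."""
    classes = []
    seen = set()
    for x in "TCAG":
        for y in "TCAG":
            if x == y or (x, y) in seen:
                continue
            partner = (COMPLEMENT[x], COMPLEMENT[y])
            seen.update({(x, y), partner})
            classes.append(((x, y), partner))
    return classes

classes = strand_symmetric_classes()
for first, second in classes:
    print("%s->%s shares a rate with %s->%s" % (first + second))
```

Under this constraint the 12 rates collapse to 6 classes. For example, A to G is tied to T to C, the two rates that the text reports as reduced in D. melanogaster.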
We also tested if the mutation model of Petrov and Hartl (1999; PH) fits the data, and we could reject the hypothesis of this mutational model (μ = μPH) with strong statistical confidence. However, it is interesting to note that our estimate of the mutation matrix in D. melanogaster is very similar to the PH model, whereas the estimates in D. simulans differ much more from the PH model. This suggests that the PH estimates of the mutation rate are relatively accurate but do not fit the data well simply because the mutation matrices are different in the 3 species. We tested this by imposing the PH model on the D. melanogaster lineage only. Using a likelihood ratio test, we could not reject this model against the general model (LR = 0.7; P = 0.17), demonstrating that the PH model provides a good approximation to the mutational process on this lineage. The results presented here would, therefore, be qualitatively similar if we imposed the PH mutation matrix on the D. melanogaster lineage.
We tested if a model involving only a change in the mutation process but no change in Sm could explain the shift in codon usage in D. melanogaster (table 1). Again, the hypothesis of no difference between Sm and Ss can be rejected with strong statistical significance (LR = 20.4, P < 0.0001), suggesting that mutation alone is not driving the difference in codon usage between these species.
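For a test with a single constrained parameter, the P value can be checked with a few lines of code. The sketch below assumes that the reported LR denotes twice the difference in log-likelihood between the nested models and that the test has 1 degree of freedom (one constraint, Sm = Ss). Both points are assumptions, as the text does not spell them out.

```python
import math

def lrt_pvalue_1df(lr_statistic):
    """Upper-tail probability of the chi-square distribution with
    1 degree of freedom, evaluated at the likelihood ratio statistic."""
    return math.erfc(math.sqrt(lr_statistic / 2.0))

p = lrt_pvalue_1df(20.4)   # LR value quoted in the text
print(p < 0.0001)
```

Under these assumptions the result is consistent with the reported P < 0.0001. Tests that constrain several parameters at once would need the chi-square distribution with the corresponding degrees of freedom.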
Among the models examined here, the best performing model, according to the AIC, is a model with no selection for optimal codon usage in the D. melanogaster lineage but with different mutation matrices among all 3 lineages and selection on the D. yakuba and D. simulans lineages (Sm = 0). The second best model is the FM, and the third best model is a model with equal mutation matrices on the D. yakuba and D. simulans lineages and no selection on the D. melanogaster lineage (μs = μy, Sm = 0). Changing the assumptions so that only one codon is optimal for each amino acid gives an Akaike score of 71619.4 for the full model, 71618.1 for the model with Sm = 0, and 71617.8 for the model with μs = μy and Sm = 0, suggesting that the last model is preferable. In general, the conclusions of no (or very weak) selection in the D. melanogaster lineage and a shift in the mutation process along the D. melanogaster lineage seem to be relatively robust to assumptions regarding the set of optimal codons. Also, the much higher AIC value for a model assuming μm = μy than for a model assuming μs = μy (table 1) suggests that the major cause for the difference in mutation process between D. melanogaster and D. simulans is a shift in D. melanogaster.
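The model ranking for the one-optimal-codon analysis can be reproduced from the three scores quoted above. The sketch below uses the standard definition AIC = 2k - 2 ln L, under which lower scores are better. The helper name `aic` and the model labels are illustrative, and whether every score in the text follows exactly this sign convention is an assumption.

```python
def aic(log_likelihood, n_params):
    """Standard Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

# Scores quoted in the text for the analysis with one optimal codon per amino acid.
scores = {
    "full model": 71619.4,
    "Sm = 0": 71618.1,
    "mu_s = mu_y, Sm = 0": 71617.8,
}
best = min(scores, key=scores.get)
delta = {name: round(value - scores[best], 1) for name, value in scores.items()}
print(best, delta)
```

The differences between the three models are small (1.6 and 0.3 units relative to the best score), so the ranking among them rests on narrow margins.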
We estimated mutation matrices for intronic regions from 9 of the loci analyzed to evaluate if the hypothesis of a change in the mutational process is also supported by this data (table 3). The results are roughly compatible with the results obtained from the exonic regions but show an even stronger mutational shift between D. melanogaster and D. simulans in the C to T and A to G mutation rates.
Gene-by-Gene Analysis
Parameters were estimated for each gene based on the species-specific mutation matrices obtained from the concatenated data. This analysis reveals evidence for codon usage bias in the ancestral D. yakuba lineage (table 4), where a likelihood ratio test is significant at the 5% level (before correction for multiple tests) for all genes but anon1G5. In D. simulans, many genes show evidence for selection for optimal codon usage, and for 3 genes, we can reject the null hypothesis of Ss = 0 at the 5% level using a likelihood ratio test. In all cases where the null hypothesis of Ss = 0 or Sy = 0 can be rejected, the estimates of Ss and Sy are positive (selection for the optimal codon).
In the D. melanogaster lineage, we can reject the hypothesis of Sm = 0 for only one gene: N3', the 3' end of the Notch locus, with strong statistical confidence (P = 0.0003). Curiously, the estimate of Sm for this gene is negative (–1.9), meaning that unpreferred mutations are being favored on the D. melanogaster lineage. Clearly, the evolution of this gene is rather different from the other genes. Notch 3' also has the largest likelihood ratio against the hypothesis of Ss = 0 (30.3) but with a positive estimate of Ss (3.6). Thus, there seems to be selection favoring different sets of codons in this gene region in the D. simulans and the D. melanogaster lineages, as was also inferred by Bauer DuMont et al. (2004).
Population Genetic Analysis
The contingency tables for unpreferred/preferred polymorphisms and fixations are shown in table 5. Homogeneity cannot be rejected in D. melanogaster (P = 0.66) but can be rejected in D. simulans (P < 0.0001). This illustrates that the evolutionary processes are different in these 2 species, and the result is consistent with the hypothesis of a relaxation of selection in the D. melanogaster lineage.
At equilibrium, there should be equally many fixed preferred and unpreferred mutations. This is observed in D. simulans but not in D. melanogaster (table 5), again indicating that the process is at equilibrium in D. simulans but not in D. melanogaster. The fact that D. melanogaster has equal ratios of fixed to polymorphic unpreferred and preferred mutations, but an almost 20-fold excess of unpreferred fixations over preferred fixations, demonstrates that D. melanogaster is not at equilibrium but has experienced a change in either mutation bias or intensity of selection. In contrast, D. simulans has an equal ratio of unpreferred and preferred fixed mutations, indicating that the difference between the 2 species cannot be explained by nonequilibrium conditions in D. simulans. These results are in agreement with Kern and Begun (2005), who reported nonequilibrium evolution at both synonymous and intron sites in D. melanogaster but equilibrium in D. simulans when comparing GC to AT versus AT to GC fixations. Combining this with the evidence from maximum likelihood analysis and the test of homogeneity, we conclude that D. melanogaster has experienced reduced selection for optimal codon usage.
| Discussion |
|---|
A change in codon usage can be due to shifts in selection pressures or a change in the mutation process. Both types of changes have been documented in Drosophila either at the level of individual loci or genome wide (e.g., Munté et al. 1997
Bauer DuMont et al. (2004) concluded that there has been an acceleration of unpreferred changes in the Notch locus along the D. melanogaster lineage. Our analysis confirms this observation, with a strongly significant likelihood ratio test indicating positive selection for apparent unpreferred synonymous mutations in this locus. Simple relaxation of constraint and the apparent change in mutation bias along this lineage cannot explain the data observed for the 3' end of the Notch locus. Also, the difference in the likelihood ratio from the 3' end of the Notch locus compared with all other loci is so extreme that the Notch locus must be considered a clear outlier. It is also interesting to notice that the selection on the Notch 3' end is also extremely strong in D. simulans but is favoring apparent preferred codons. This raises the possibility that lineage-specific changes in the expression and/or function of the Notch locus are being modulated by codon usage in this locus.
In the other loci, a clear picture is emerging from this likelihood analysis. Selection for optimal codon usage is affecting D. yakuba and D. simulans but is only weakly affecting D. melanogaster, if at all. However, there appears to have been a change in the mutational process in the D. melanogaster lineage toward lower rates of T to C and A to G mutations. This change is observed in both exonic and intronic regions and, therefore, cannot be explained by possible model inadequacies. On the other hand, a model involving only a change in the mutation process also cannot explain the data. The reduction in selection intensity in the D. melanogaster lineage inferred from the phylogenetic analysis is corroborated by the population genetic data, both using the number of preferred to unpreferred changes within and between species and by considering the frequency spectrum of unpreferred and preferred polymorphisms.
The conventional explanation for the apparent reduction in codon bias along the D. melanogaster lineage has been a decrease in population size in this species. However, inferring effective population size for a species is notoriously difficult, and while differences in genomic patterns of variation between D. melanogaster and D. simulans suggest a smaller effective population size in the former species, there are caveats to this interpretation (as discussed in Capy and Gibert 2004; Morton et al. 2004). It should also be emphasized that the gene-by-gene analyses reveal that a model assuming no selection on the D. melanogaster lineage cannot explain all our observations, as Sm = 0 can be rejected for the Notch locus. The fact that D. melanogaster, presumably, has experienced both a change in the mutation process and a reduction of the intensity of selection may explain why the pattern of codon usage in D. melanogaster has remained such an enigma in studies of molecular evolution. The possibility that selection is working to fix apparently unpreferred mutations in some loci may be a further contributing factor.
The methods in this paper are readily applicable to the large genomic data sets currently being generated in various Drosophila species. Applications to these large data sets will help further elucidate the evolution of codon usage bias in Drosophila. One note of caution, however, arises from the current study. It is clear that in lineages affected by selection for optimal codon bias, the strength of selection will be underestimated because polymorphisms and interspecific substitutions will be confounded. The collection of appropriate polymorphism data will allow this key discrimination to be made.
A program performing the analyses discussed in this paper will be available from http://www.binf.ku.dk/
rasmus/webpage/programs.html.
| Appendix 1 GenBank Accession Numbers of Sequences Analyzed |
|---|
Listed below are the genes used in this study and their GenBank accession numbers (D. melanogaster (mel), D. simulans (sim), D. yakuba (yak)): Adh (mel: M17827, M17837, M19547, M17828, M17830, M17831, M17832, M17833, M17834, M17835, M17836; sim: M19263, X57364, X57363, X57362, X57361; yak: X57366), anon1A3 (mel: AF161745, AF161746, AF161727, AF161735, AF161736, AF161737, AF161743, AF161744, AF161729, AF161731, AF161738, AF161739; sim: AF161749, AF161759, AF161750, AF161752, AF161753, AF161754, AF161756, AF161758, AF161760, AF161755, AF161757, AF161751; yak: AF005844), anon1E9 (mel: AF161764, AF161765, AF161767, AF161770, AF161771, AF161772, AF161773, AF161766; sim: AF161776, AF161777, AF161778, AF161779, AF161780, AF161781, AF161782, AF161783; yak: AF005848), anon1G5 (mel: AF005865, AF005879, AF005880, AF161786, AF005866, AF005867, AF005868, AF005869, AF005870, AF005871, AF005873, AF005878; sim: AF005874, AF005875, AF005876, AF161787, AF161788, AF161789, AF161790; yak: AF005852), Hex-A (mel: AF257523, AF257532, AF257533, AF257534, AF257524, AF257525, AF257526, AF257527, AF257528, AF257529, AF257530, AF257531; sim: AF257609, AF257618, AF257619, AF257620, AF257610, AF257611, AF257612, AF257613, AF257614, AF257615, AF257616, AF257617; yak: AF257650), Hex-C (mel: AF257540, AF257549, AF257550, AF257551, AF257541, AF257542, AF257543, AF257544, AF257545, AF257546, AF257547, AF257548; sim: AF257623, AF257632, AF257633, AF257634, AF257635, AF257363, AF257624, AF257625, AF257626, AF257627, AF257628, AF257629, AF257630. 
AF257631; yak: AF257651), Hex-T1 and Hex-T2 (mel: AF257590, AF257599, AF257600, AF257601, AF257591, AF257592, AF257593, AF257594, AF257595, AF257596, AF257597, AF257598, AF257602; sim: AF257637, AF257642, AF257646, AF257647, AF257648, AF257649, AF257638, AF257640, AF257641, AF257643, AF257645, AF257639, AF257644; yak: AF257652), mth (mel: AF280552, AF280561, AF280563, AF280553, AF280554, AF280555, AF280556, AF280557, AF280558, AF280559, AF280560; sim: AF280602, AF280593, AF280592, AF280591, AF280601, AF280600, AF280599, AF280598, AF280597, AF280596, AF280595, AF280594; yak: AF280583), N3' (mel: AF360583, AF360581, AF360582, AF360584, AF360585, AF360586, AF360587, AF360588, AF360589, AF360590, AF360591, AF360592, AF360594, AF360595; sim: AY191373, AY191369, AY191370, AY191371, AY191372, AY191374, AY191375, AY191376, AY191377, AY191378, AY191379, AY191380; yak: AY191414), N5' (mel: AF361407, AF361408, AF361409, AF361410, AF361411, AF361412, AF361413, AF361414, AF361415, AF361416, AF361417, AF361418, AF361419, AF361420, AF361421; sim: AY191395, AY191391, AY191392, AY191393, AY191394, AY191395, AY191396, AY191397, AY191398, AY191399, AY191400, AY191401, AY191402; yak: AY191413), per (mel: L07817, L07818, L07819, L07821, L07823, L07825; sim: L07826, L07832, L07828, L07829, L07830, L07831; yak: X61127), Pgi (mel: L27539, L27554, L27555, U20566, U20567, U20568, U20569, U20570, U20571, U20572, U20573, L27540, U20574, U20575, L27541, L27542, L27543, L27544, L27545, L27546, L27553; sim: L27547, U20559, U20560, U20561, U20564, U20565, L27548, L27549, L27550, L27551, L27552, U20556, U20557, U20558; yak: L27673), Pgm (mel: AF290313, AF290328, AF290330, AF290331, AF290315, AF290316, AF290317, AF290323, AF290324, AF290325, AF290326, AF290327; sim: AF290366, AF290367, AF290368, AF290369, AF290358, AF290359, AF290360, AF290361, AF290362, AF290363, AF290364, AF290365, AF290357; yak: AF290370), Rel (mel: AF204284, AF204286, AF204287, AF204285, AF204288, AF204289; sim: AF204277, AF204278, AF204279, AF204280, AF204281, AF204282, AF204283; yak: AF204290), Tpi (mel: U60836, U60845, U60846, U60847, U60837, U60838, U60839, U60840, U60841, U60842, U60843, U60844, U60851, U60853, U60854; sim: U60861, U60862, U60863, U60864, U60865, U60866, U60867, U60868, U60869; yak: U60870), z (mel: L13045, L13043, L13044, L13046, L13047, L13048; sim: L13050, L13049, L13051, L13052, L13053, L13055; yak: AF255327), Zw (mel: U42738, U42747, U42748, U42749, U43165, U43166, U43167, U44721, U45985, U42739, U42740, U42741, U42742, U42743, U42744, U42745, U42746; sim: L13891, L13892, L13893, L13894, L13876, L13877, L13878, L13879, L13881, L13882, L13883, L13884; yak: U42750).
| Acknowledgements |
|---|
This work was supported by National Science Foundation/National Institutes of Health (NIH) grant DMS/NIGMS—0201037 to R. Durrett, R.N., and C.F.A, a grant from the Danish National Science Foundation to R.N., and by NIH grant GM36431 to C.F.A.
| Footnotes |
|---|
Arndt von Haeseler, Associate Editor
| References |
|---|
Akashi H. (1995) Inferring weak selection from patterns of polymorphism and divergence at "silent" sites in Drosophila DNA. Genetics 139:1067–1076.
Akashi H. (1996) Molecular evolution between Drosophila melanogaster and D. simulans: reduced codon bias, faster rates of amino acid substitution, and larger proteins in D. melanogaster. Genetics 144:1297–1307.
Akashi H, Ko W-Y, Piao S, John A, Goel P, Lin C-F, Vitins AP. (2006) Molecular evolution in the Drosophila melanogaster species subgroup: frequent parameter fluctuations on the time-scale of molecular divergence. Genetics 172:1711–1726.
Bartolomé C, Maside X, Yi S, Grant AL, Charlesworth B. (2005) Patterns of selection on synonymous and non-synonymous variants in Drosophila miranda. Genetics 169:1495–1507.
Bauer DuMont V, Fay JC, Calabrese PP, Aquadro CF. (2004) DNA variability and divergence at the Notch locus region of Drosophila melanogaster and D. simulans: a case of accelerated synonymous site divergence. Genetics 167:171–185.
Capy P and Gibert P. (2004) Drosophila melanogaster, Drosophila simulans: so similar yet so different. Genetica 120:5–16.
Comeron JM, Kreitman M, Aguadé M. (1999) Natural selection on synonymous sites is correlated with gene length and recombination in Drosophila. Genetics 151:239–249.
De Cock JGR, Klink EC, Ferro W, Lohman PHM, Eeken JCJ. (1992) Neither enhanced removal of cyclobutane pyrimidine dimers nor strand-specific repair is found after transcription induction of the beta-3-tubulin gene in a Drosophila embryonic cell line Kc. Mutat Res 293:11–20.
Duret L. (2002) Evolution of synonymous codon usage in metazoans. Curr Opin Genet Dev 12:640–649.
Felsenstein J. (1981) Evolutionary trees from DNA sequences: a maximum likelihood approach. J Mol Evol 17:368–376.
Goldman N, Thorne JL, Jones DT. (1998) Assessing the impact of secondary structure and solvent accessibility on protein evolution. Genetics 149:445–458.
Goldman N and Yang Z. (1994) A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol Biol Evol 11:725–736.
Kern AD, Jones CD, Begun DJ. (2002) Genomic effects of nucleotide substitutions in Drosophila simulans. Genetics 162:1753–1761.
Kimura M. (1962) On the probability of fixation of mutant genes in a population. Genetics 47:713–719.
Kliman RM. (1999) Recent selection on synonymous codon usage in Drosophila. J Mol Evol 49:343–351.
Kliman RM and Hey J. (1993) Reduced natural selection associated with low recombination in Drosophila melanogaster. Mol Biol Evol 10:1239–1258.
Maside X, Lee AW, Charlesworth B. (2004) Selection on codon usage in Drosophila americana. Curr Biol 14:150–154.
McDonald JH and Kreitman M. (1991) Adaptive protein evolution at the Adh locus in Drosophila. Nature 351:652–654.
McVean GAT and Vieira J. (2001) Inferring parameters of mutation, selection, and demography from patterns of synonymous site evolution in Drosophila. Genetics 157:245–257.
Morton RA, Choudhary M, Cariou M-L, Singh RS. (2004) A reanalysis of protein polymorphism in Drosophila melanogaster, D. simulans, D. sechellia and D. mauritiana: effects of population size and selection. Genetica 120:101–114.
Munté A, Aguadé M, Segarra C. (1997) Divergence of the yellow gene between Drosophila melanogaster and D. subobscura: recombination rate, codon bias and synonymous substitution. Genetics 147:165–175.
Munté A, Aguadé M, Segarra C. (2001) Changes in the recombinational environment affect divergence in the yellow gene of Drosophila. Mol Biol Evol 18:1045–1056.
Muse SV and Gaut BS. (1994) A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome. Mol Biol Evol 11:715–724.
Nei M and Gojobori T. (1986) Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol 3:418–426.
Petrov DA and Hartl DL. (1999) Patterns of nucleotide substitution in Drosophila and mammalian genomes. Proc Natl Acad Sci USA 96:1475–1479.
Powell JR and Moriyama EN. (1997) Evolution of codon usage bias in Drosophila. Proc Natl Acad Sci USA 94:7784–7790.
Rodríguez-Trelles F, Tarrío R, Ayala FJ. (1999) Switch in codon bias and increased rates of amino acid substitution in the Drosophila saltans species group. Genetics 153:339–350.
Rodríguez-Trelles F, Tarrío R, Ayala FJ. (2000) Fluctuating mutation bias and the evolution of base composition in Drosophila. J Mol Evol 50:1–10.
Shields DC, Sharp PM, Higgins DG, Wright F. (1988) "Silent" sites in Drosophila genes are not neutral: evidence of selection among synonymous codons. Mol Biol Evol 5:704–716.
Singh ND, Arndt PF, Petrov DA. (2005) Genomic heterogeneity of background substitutional patterns in Drosophila melanogaster. Genetics 169:709–722.
Takano-Shimizu T. (1999) Local recombination and mutation effects on molecular evolution in Drosophila. Genetics 153:1285–1296.
Takano-Shimizu T. (2001) Local changes in GC/AT substitution biases and in crossover frequencies on Drosophila chromosomes. Mol Biol Evol 18:606–619.
Van Der Helm PJL, Klink EC, Lohman PHM, Eeken JCJ. (1997) The repair of UV-induced cyclobutane pyrimidine dimers in the individual genes Gart, Notch and white from isolated brain tissue of Drosophila melanogaster. Mutat Res 383:113–124.
Yang Z, Goldman N, Friday AE. (1994) Comparison of models for nucleotide substitution used in maximum likelihood phylogenetic estimation. Mol Biol Evol 11:316–324.
Yang Z, Nielsen R, Goldman N, Pedersen A-MK. (2000) Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155:431–449.