Molecular Biology and Evolution 19:1464-1473 (2002)
© 2002 Society for Molecular Biology and Evolution
Patterns of Nucleotide Substitution Among Simultaneously Duplicated Gene Pairs in Arabidopsis thaliana

*Department of Ecology and Evolutionary Biology, University of California;
Department of Biology, University of North Carolina at Chapel Hill
| Abstract |
|---|
|
|
|---|
We characterized rates and patterns of synonymous and nonsynonymous substitution in 242 duplicated gene pairs on chromosomes 2 and 4 of Arabidopsis thaliana. Based on their collinear order along the two chromosomes, the gene pairs were likely duplicated contemporaneously, and therefore comparison of genetic distances among gene pairs provides insights into the distribution of nucleotide substitution rates among plant nuclear genes. Rates of synonymous substitution varied 13.8-fold among the duplicated gene pairs, but 90% of gene pairs differed by less than 2.6-fold. Average nonsynonymous rates were
fivefold lower than average synonymous rates; this rate difference is lower than that of previously studied nonplant lineages. The coefficient of variation of rates among genes was 0.65 for nonsynonymous rates and 0.44 for synonymous rates, indicating that synonymous and nonsynonymous rates vary among genes to roughly the same extent. The causes underlying rate variation were explored. Our analyses tentatively suggest an effect of physical location on synonymous substitution rates but no similar effect on nonsynonymous rates. Nonsynonymous substitution rates were negatively correlated with GC content at synonymous third codon positions, and synonymous substitution rates were negatively correlated with codon bias, as observed in other systems. Finally, the 242 gene pairs permitted investigation of the processes underlying divergence between paralogs. We found no evidence of positive selection, little evidence that paralogs evolve at different rates, and no evidence of differential codon usage or third position GC content. | Introduction |
|---|
|
|
|---|
Synonymous and nonsynonymous substitution rates vary among genes, but the evolutionary forces underlying rate variation remain unclear. For nonsynonymous substitutions, substitution rates vary in part because proteins differ in their degree of selective constraint. For example, histone genes probably evolve slowly because most amino acid changes disrupt gene function and are therefore selectively deleterious (Nei 1987
It is even more difficult to identify causes of synonymous rate variation among genes, but at least three molecular characteristics correlate with synonymous substitution rates. The first characteristic is GC content (Ticher and Graur 1989
), which affects synonymous substitution rates because CpG dinucleotides are subject to high rates of mutation (Tsunoyama, Bellgard, and Gojobori 2001
). The second characteristic is codon bias. Highly biased genes evolve more slowly, on average (Sharp and Li 1987
; Akashi 1994a
, 2001
; Eyre-Walker and Bulmer 1995
), presumably because selection for codon use limits the number of acceptable synonymous nucleotide substitutions in highly biased genes. Finally, mutation rates can differ among genomic regions, contributing to variation in synonymous substitution rates among genes (Wolfe, Sharp, and Li 1989a
; Matassi, Sharp, and Gautier 1999
). However, none of these three factors alone consistently explains variation in synonymous substitution rates (Ticher and Graur 1989
; Wolfe, Sharp, and Li 1989a
; Akashi 1994b, 1997
; Matassi, Sharp, and Gautier 1999
), and the interdependence between these factors is not always clear.
An important prerequisite for understanding causes of rate variation is to characterize the distribution of rates among genes. To date, most studies of the distribution of synonymous and nonsynonymous substitution rates have been restricted to animals. In plants, several studies have examined rate variation across evolutionary lineages of mitochondrial and chloroplast genes (e.g., Gaut et al. 1992
; dePamphilis, Young, and Wolfe 1997
; Laroche, Maggia, and Bousquet 1997
), but few studies have either characterized rate variation among plant nuclear genes or discerned the factors contributing to rate variation (Wolfe, Sharp, and Li 1989b
; Alvarez-Valin et al. 1999
). One possible reason for the dearth of plant studies is that most plant nuclear genes are members of multigene families. Copy number within multigene families fluctuates (Clegg, Cummings, and Durbin 1997
), and as a result it is often difficult to identify orthologs between species and hence to compare substitution rates among genes.
The recently sequenced Arabidopsis thaliana genome offers a unique opportunity to study substitution rate variation across plant nuclear genes. Many large arabidopsis chromosomal segments are duplicated (Mayer et al. 1999
; AGI 2000
), and these segments contain genes that were likely duplicated contemporaneously. Genetic distances can be compared among pairs of duplicated sequences (or gene pairs). For this special case in which the time of duplication is equivalent for all gene pairs, comparing genetic distances among gene pairs is equivalent to comparing nucleotide substitution rates among gene pairs.
In addition to providing insight into rate variation among genes, sequence data from duplicated chromosomal regions also facilitate characterization of patterns of sequence divergence between paralogs. Divergence patterns between paralogs have been of interest since Ohno (1970)
suggested that most adaptive changes occur after gene duplication (see also Ohta and Kimura 1973
). At the molecular level, adaptive changes can be detected by measuring the ratio of nonsynonymous (Ka) to synonymous substitution (Ks). A Ka/Ks ratio >1 is strong evidence for positive selection having acted during sequence divergence, and a Ka/Ks < 1 is consistent with purifying selection, although it does not rule out the possibility that positive selection acted. Although the sensitivity of Ka/Ks to detect positive selection can be low (Hughes 1999
), the distribution of Ka/Ks among genes can be useful for characterizing the relative strengths of evolutionary forces acting on individual genes (Charlesworth, Charlesworth, and McVean 2001)
.
Sequence divergence can have other measurable effects. For example, shifts in selective constraint after duplication can lead to changes in rates of nucleotide substitution in one or both of the duplicated sequences (Goodman, Moore, and Matsuda 1975
; Li and Gojobori 1983
; Gonzalez and Jordan 2000
). Similarly, paralogs often diverge in synonymous codon usage (Gaut et al. 1999
; Zhang, Kosakovsky-Pond, and Gaut 2001b
), either as a function of changes in mutation biases or as a function of shifts in gene expression (Fennoy and Bailey-Serres 1993
; Duret and Mouchiroud 1999
).
The large number of duplicated genes in A. thaliana permits unprecedented examination of rate variation among plant nuclear genes and also facilitates study of patterns of evolutionary divergence between paralogs. Here we characterize patterns of molecular evolution in a duplicated region between arabidopsis chromosomes 2 and 4. With data from 242 gene pairs, we address the following questions: (1) What is the distribution of synonymous and nonsynonymous substitution rates among gene pairs? (2) Are synonymous and nonsynonymous substitution rates a function of physical location, GC content, or codon usage? (3) Is there evidence for positive selection acting during the divergence of paralogs? (4) Is sequence divergence between paralogs accompanied by divergence in codon usage or evolutionary rate?
| Materials and Methods |
|---|
|
|
|---|
Data set
A duplicated region of 5.6 Mb on chromosome 2 and 4.6 Mb on chromosome 4 was initially identified by Terryn et al. (1999)
100 MYA (Vision, Brown, and Tanksley 2000
The chromosome 24 duplication was further partitioned into five blocks (numbered 45, 49, 52, 54, and 56 by Vision, Brown, and Tanksley 2000
) that differ in their order and orientation relative to one another on the two chromosomes but show only minor within-block rearrangement (fig. 1
). A minimum of four inversions is necessary to convert the order and orientation of blocks on one chromosome into that of the other. It is thought that these blocks arose from a single duplication event, both because of their suggestive spatial arrangement and based on a genome-wide analysis of the distribution of amino acid distances between duplicated genes (Vision, Brown, and Tanksley 2000
). It should be emphasized, however, that our analyses do not assume that all blocks were duplicated at the same time, but we do assume that gene pairs within a single block were duplicated contemporaneously when they were in collinear order.
|
Our initial task was to identify gene pairs for further study. In the data set compiled by Vision, Brown, and Tanksley (2000)
The accession numbers of the 484 sequences used in the analyses are available at bgbox.bio.uci.edu/data/lq2acn.html.
Alignment
Amino acid alignments were obtained with ClustalW (Thompson, Higgins, and Gibson 1994
), using default parameters. Visual inspection revealed that the alignments around gaps were sometimes ambiguous, and we therefore analyzed two alignment data sets. In the first data set, the residues around gaps were included. In the second data set, five amino acids were removed on either side of each gap. The two sets of amino acid alignments were then back translated using the known coding sequences to produce DNA sequence alignments. Gap treatment did not qualitatively affect our results, and we therefore report only the results for the data set with the original alignments. The DNA sequence alignments are available at bgbox.bio.uci.edu/.
Sequence Analyses
We calculated the percent identity of DNA sequences and protein sequences for all gene pairs. The expected relationship between DNA and amino acid identity under strictly neutral evolution was obtained by simulation, using Evolver (Yang 1997
). We simulated DNA sequences with distances ranging from 0.1 to 2.0 along distance intervals of 0.05 units, corresponding to the range of distances observed between duplicated genes in our data set. For each distance, 20 pairs of DNA sequences of 500 codons were simulated. For all simulated data, percent identity was calculated from DNA sequences and translated protein sequences.
Ks and Ka between duplicated sequences were estimated by the maximum likelihood (ML) method implemented in PAML (Yang 1997
), using the codon model of Goldman and Yang (1994)
. We used a likelihood ratio (LR) comparison to test for a Ka/Ks ratio different from 1.0. To do this, two models were applied to the data: model 0 constrains the Ka/Ks ratio to 1.0, and model 1 estimates the Ka/Ks ratio as a free parameter. The LR of model 0 and model 1 was compared with the
2 distribution with one degree of freedom, as detailed by Yang (1998)
.
Relative rate tests were performed with HYPHY (http://peppercat.statgen.ncsu.edu/
hyphy/), for both protein sequences and DNA sequences. In order to apply the relative rate test, we obtained outgroup sequences for a large number of the gene pairs. Ku et al. (2000)
proposed that the duplication between chromosomes 2 and 4 occurred after the split of the tomato and A. thaliana lineages. If this is true, then plants as divergent, or more divergent, from A. thaliana than tomato can be used as the source of outgroup sequences. The Rosidae are the largest phylogenetic grouping in the NCBI taxonomy that contains A. thaliana and excludes tomato. Accordingly, we searched for candidate outgroup sequences among all GenBank records derived from angiosperms, excluding the Rosidae. TBLASTX (Altschul et al. 1997
) was used to find matches between the 242 genes and the DNA sequences in the database. We required that both arabidopsis genes showed a strong match (expected value <1 x 10-10) to at least one of the sequences in the database for which a predicted or experimentally determined protein translation was available. In cases where multiple GenBank records met these criteria, the outgroup was selected from among the following NCBI-derived taxonomic groupings, in descending order of preference (and increasing taxonomic breadth): Asteridae (which includes tomato), core eudicots, eudicotyledons, and Magnoliophyta.
Each relative rate test was based on one gene pair and its outgroup. For DNA sequences, synonymous and nonsynonymous substitution rates were tested separately using the codon substitution model of Muse and Gaut (1994)
. For amino acid sequences, we used the substitution model of Dayhoff, Schwartz, and Orcutt (1978)
. The relative rate test uses an LR statistic to test the null hypothesis that two sequences evolve at equal rates (Muse and Weir 1992
).
The effective number of codons (ENC) and GC content at synonymous third codon positions (GCS) were calculated using codon W (http://www.molbiol.ox.ac.uk/cu/). We used the ENC as a measure of codon usage because it is not biased by gene length, given a certain minimum length, or by amino acid composition (Wright 1990
; Comeron and Aguade 1998
). ENC values range from 20 to 61; a value of 20 represents extreme bias, and a value of 61 indicates random codon use. We tested for homogeneity of ENC and GCS between duplicated sequences by a permutation procedure described previously (Zhang, Kosakovsky-Pond, and Gaut 2001b
), with the slight modification that the test statistic was the difference in ENC (or GCS) between sequences rather than the variance in ENC (or GCS) among sequences. Test statistics were based on 1000 permutations.
Spatial correlation of Ka and Ks
To examine the relationship between evolutionary distance and physical position, we performed a spatial correlation analysis. The following statistic (Chatfield 1999
, p. 20) was calculated for a range of distances between gene pairs, where distance was defined as the number of intervening genes between gene pairs:
|
|
and s2x are the mean and sample variance of Ka (or Ks), and n is the number of possible comparisons. This statistic tends toward zero when the Ka (or Ks) values of gene pairs are uncorrelated at distance d, approaches 1.0 when they are tightly correlated, and approaches negative 1.0 when they are inversely correlated. To test for significant spatial correlations, the observed values were compared with the tail values of Cd calculated on 1000 permuted data sets generated by assigning the original Ka (or Ks) values to random positions on the chromosome. The analyses were performed separately for each block and for all blocks simultaneously. | Results |
|---|
|
|
|---|
The Distribution of Substitution Rates Among Gene Pairs
We collected 242 gene pairs duplicated between chromosome 2 and 4 that were in compact collinear order and not duplicated in tandem. Their chromosomal locations are shown in figure 1 . Among all gene pairs, DNA sequence identity ranged from 46.8% to 92.8%, and protein sequence identity ranged from 33.2% to 100%. Over the entire region, Ka had a range of 0.00050.78 nonsynonymous substitutions per nonsynonymous site and a mean of 0.17. The range of Ks was 0.314.27 synonymous substitutions per synonymous site, with a mean of 0.82 (table 1 and fig. 2 ). Despite the larger range of Ks, coefficients of variation (CV) of Ka and Ks were similar, with a 1.5-fold difference between them (table 1 ). Furthermore, Ka and Ks were positively and significantly correlated among genes within each block, just as they were correlated over the entire five-block region (Kendall's rank correlation
= 0.70, P < 0.001; table 1
).
|
|
Gene pairs within individual blocks exhibited a similar range of DNA and protein identities to the entire region, but there were apparent differences among blocks. For example, block 45 had a relatively high synonymous CV, and blocks 52 and 54 had a relatively narrow range of Ka and K s (table 1 ). These observations could reflect sampling phenomena because some blocks contained small numbers of duplicated genes. To test for among-block rate variation, we performed an ANOVA. The ANOVA indicated that Ka and Ks are homogeneous among blocks (Ka: P = 0.45, Ks: P = 0.09), further suggesting that the blocks may have been duplicated contemporaneously.
Over the whole region, 58% (141/242) of gene pairs had higher DNA than protein identity, and DNA sequence identity was always higher than protein sequence identity when amino acid identity was greater than
80% (fig. 3
). To determine the relationship between DNA and protein identities relative to a model of strictly neutral evolution, we simulated pairs of gene sequences under neutrality. For simulations, Ka/Ks was set to 1.0;
, the transition and transversion parameter was set to 3.0; and the codon frequency matrix was based on tabulated frequencies from 14,647,315 A. thaliana codons (http://www.kazusa.or.jp/codon/). The simulations indicated that DNA sequence identity was always higher than protein sequence identity under the neutral model (fig. 3
), and the result was qualitatively similar with different transition:transversion ratios (
= 2, 3, or 4; data not shown). In our arabidopsis data set, amino acid sequence identity was always higher than that for simulated gene pairs with equivalent DNA sequence identity (fig. 3
). Thus, selective constraint appears to be slowing the pace of nonsynonymous nucleotide substitution for all the gene pairs in our data set.
|
Examining Factors That Contribute to Rate Variation Among Gene Pairs
Four factors are commonly cited as contributing to rate variation among genes: (1) physical location within the genome, (2) GC content, (3) codon usage bias, and (4) differing levels of selective constraint on amino acid substitutions. In this section, we examine the relationship between rate variation and the first three factors.
We tested the relationship between substitution rate and physical distance with two methods. First, we applied a spatial autocorrelation test to both Ka and Ks. There were no significant results with Ka, but autocorrelation for Ks was greater than expected at the 5% significance level for some distancesfor example, genes separated by 10 genes are more highly correlated than expected at random (fig. 4
). We should note, however, that Ks autocorrelation statistics do not demonstrate significantly high autocorrelation in neighboring genes, as expected under the hypothesis that physical location is a contributing factor to variation in synonymous substitution rates among genes. Thus, the Ks results are difficult to interpret but suggest only weakly that physical location contributes to synonymous rate variation among genes. Second, we applied the permutation method of Williams and Hurst (2000)
to test the null hypothesis that rates are random with respect to physical location. This test also failed to reject the null hypothesis for Ka (P = 0.66), but the test was borderline significant for Ks (P = 0.067). Thus, both tests provide suggestive, but inconclusive, evidence that there could be a relationship between Ks and physical distance.
|
GC content and codon bias have also been shown to correlate with evolutionary rate (Ticher and Graur 1989
|
The Distribution of Ka/Ks Across Gene Pairs
To examine whether positive selection commonly contributed to sequence divergence between paralogs, we estimated Ka/Ks for all gene pairs and tested whether Ka/Ks exceeded 1.0. Estimated Ka/Ks values ranged from 0.00097 to 0.677, with a mean of 0.02 (table 1 and fig. 2 ). For all but one of the gene pairs, a model in which Ka/Ks was estimated from the data fit significantly better at the P < 0.05 level than did a model that constrained Ka/Ks to 1.0. Most (236 of 242) tests remained significant after Bonferroni correction. Altogether Ka/Ks estimates indicated that there was neither positive selection (Ka/Ks > 1.0) nor strictly neutral evolution (Ka/Ks = 1.0) governing sequence divergence in the vast majority (236/241 = 98%) of gene pairs. These conclusions are consistent with the analysis of protein and DNA sequence identity (fig. 3 ).
Divergence in Evolutionary Rate and Codon Usage Between Paralogs
We performed relative rate tests to examine whether sequences within a gene pair evolve at different evolutionary rates. Relative rate tests require an outgroup for each gene pair. Using the search strategy described in Materials and Methods, we identified putative outgroup sequences for 105 gene pairs. To examine the validity of these outgroup assignments, we examined the three pairwise genetic distances for each gene pair and its outgroup. In all 105 cases, paralogs had a smaller distance (as measured by both Ka and Ks) to one another than to the outgroup sequence (data not shown), suggesting that our outgroup assignments were reasonable.
Few paralogs demonstrated deviation from clock-like evolution after divergence. Relative rate tests using protein sequences resulted in 14 significant tests at P < 0.05; three remained significant after Bonferroni correction for an experiment-wide error of
= 0.05. The relative rate tests for nonsynonymous substitution rates provided similar results, with 24 significant gene pairs; three remained significant after Bonferroni correction. Ten of the 14 gene pairs that were significant for protein sequences were also significant for Ka-based tests, showing that results were reasonably consistent between methods. For Ks, six of 105 gene pairs were significant at P < 0.05, but only one remained significant after Bonferroni correction.
To further characterize patterns of sequence divergence between duplicated genes, we measured both GCS and ENC for all gene pairs. Table 3
provides average ENC and GCS for both chromosomes and all blocks. We applied paired t-tests to determine whether the two chromosomes differ significantly for either measure; no significant difference was detected between chromosomes for either ENC (t = 0.478, P = 0.633) or GCS (t = -1.591, P = 0.112). This result did not preclude the possibility that any two paralagous sequences had diverged significantly in codon usage. To test this possibility, we applied a permutation test of homogeneous ENC (or GCS) to each of the 242 gene pairs (Zhang, Kosakovsky-Pond, and Gaut 2001b
). No paralogs exhibited significant difference in either measure after Bonferroni correction (data not shown), and thus there has been no detectable divergence in codon usage or GC content after duplication.
|
| Discussion |
|---|
|
|
|---|
In this study, we examined sequence divergence among gene pairs located in the duplicated region between chromosomes 2 and 4 of A. thaliana. Gene pairs were studied under the assumption that they were duplicated contemporaneously. We took three precautions to ensure the accuracy of this assumption. First, the chromosome 24 duplication consists of five rearranged blocks that presumably owe their duplication to the same event, but we studied each block separately as well as in combination. Second, we did not consider arrayed genes in tandem because it is difficult to determine the relative timing of tandem and chromosomal duplication events. Finally, we studied only genes that occur in compact collinear arrangement between the two chromosomes; collinearity reinforces the notion that the gene pairs originated during a single duplication event.
Our study differs from previous studies of rate variation among genes in that we used paralogous, rather than orthologous, sequences to estimate substitution rates. One potential complication is that gene conversion can occur between paralogs. If gene conversion has occurred in some gene pairs but not others, the net effect is to increase the range and variance of Ka and Ks among gene pairs. We tested for the presence of gene conversion in all gene pairs with Sawyer's (1989)
method but found no evidence for conversion between paralogs (data not shown). We should note, however, that most tests for gene conversion detect events that affect a portion of a gene rather than a complete gene.
Despite the tendency of gene conversion to inflate the range of substitution rates, we found Ks ranges similar to previous studies. For example, Ks ranged 15-fold in a study of 24 Drosophila orthologs (Zeng et al. 1998
) and 20-fold in a study of 363 rat-mouse orthologs (Wolfe and Sharp 1993
). In our study, synonymous substitution rates varied up to 10.4-fold within an individual block and 13.8-fold over the entire data set (table 1
and fig. 2 ).
The distribution of Ka and Ks from arabidopsis gene pairs differs from previous studies of non-plant taxa in two noteworthy ways. First, the ratio of mean synonymous to mean nonsynonymous substitution rates is relatively low. The ratio is
5 for our arabidopsis data (table 1
), but it is as high as 24 for bacterial genes (Sharp 1991
) and
7 for rat-mouse comparisons (Wolfe and Sharp 1993
), based on comparisons between orthologous sequences. Differences in this ratio may reflect different population sizes and life histories (Sharp 1991
), but may also reflect differences between ortholog and paralog comparisons. For example, Kondrashov et al. (2002)
recently found that the ratio of synonymous to nonsynonymous substitution is
twofold lower between paralogs than orthologs. Thus, the forces affecting this ratio remain unclear at present. Second, the ratio of CVs between nonsynonymous and synonymous rates is lower for this study than previous studies. In rat-mouse comparisons, for example, the CV for nonsynonymous rates was fourfold higher than the CV for synonymous rates (Wolfe and Sharp 1993
). Among arabidopsis gene pairs, there is only a 1.5-fold difference between nonsynonymous and synonymous CVs (table 1
), indicating that synonymous and nonsynonymous rates vary among genes to roughly the same extent.
Our analyses represent the most extensive comparison of rates among plant nuclear genes to date. Some previous studies have been based on a much smaller number of sequence comparisons. For example, Wolfe, Sharp, and Li (1989b)
and Gaut (1998)
found that synonymous rates vary up to 2.5-fold among genes, but these studies were based on only 11 and nine nuclear genes, respectively. A more recent study made 212 orthologous sequence comparisons between arabidopsis and Brassica rapis (Tiffin and Hahn 2002
), but relied on EST data, which could bias results depending on the accuracy of EST data and the gene regions sequenced. Nonetheless, Tiffin and Hahn (2002)
described two rate characteristics that are similar to our results: (1) an
24-fold range of synonymous rate variation among genes and (2) a twofold difference in CV between nonsynonymous and synonymous rates. Together these two plant studies indicate that synonymous rates vary at least an order of magnitude among plant nuclear genes and also that variation in nonsynonymous and synonymous rates are relatively similar among genes, based on the CV. It is clear, however, that a general picture of plant nuclear gene evolution requires additional studies, particularly given the fact that different plant lineages evolve with different mutational patterns (Tiffin and Hahn 2002
) and different rates (e.g., Eyre-Walker and Gaut 1997
).
With the increasing availability of genome sequences, several studies have used genetic distances (either Ka or Ks) to make genome-wide inferences about the timing and frequency of gene and genome duplication (Gaut and Doebley 1997
; Lynch and Conery 2000
; Vision, Brown, and Tanksley 2000
; Friedman and Hughes 2001
). For example, Lynch and Conery (2000)
used Ks values between duplicated arabidopsis sequences as a proxy for divergence time; in essence they assumed that substitution rates were homogeneous among gene pairs. Their use of Ks has come under criticism (Long and Thornton 2001
; Zhang, Gaut, and Vision 2001a
), but this study provides quantitative insights about the degree to which the assumption can be misleading. In a worst-case scenario, Ks provides time estimates that differ up to 14-fold (table 1 ). However, the average effect of the homogeneous rate assumption is not nearly as dramatic because
90% of the gene pairs in this study fall within a Ks range of 0.4621.188 (fig. 2
), and thus most genes (
90%) vary less than 2.57-fold in rate. Overall, however, our study indicates that Ks variation among genes is generally higher than previously reported for plants (Wolfe, Sharp, and Li 1989b
; Gaut 1998
), and thus one should be cautious when equating Ks with time.
Factors Contributing to Rate Variation Among Gene Pairs
What are the forces that contribute to variation in evolutionary rates among gene pairs? The first possibility, discussed earlier, is gene conversion. Although we did not detect gene conversion in any of our gene pairs, the possibility of gene conversion should not be ignored. However, the fact that the range of variation found in our 242 gene pairs is similar to that of orthologous gene pairs from other studies (rat-mouse, drosophilids and Arabidopsis-Brassica; see earlier) suggests that the effects of gene conversion have been negligible. In addition, a similar study of yeast paralogs found no evidence of gene conversion, suggesting gene conversion may not be widespread (Pál, Papp, and Hurst 2001
).
The origin of duplicated chromosomes can also affect the variance in rates across gene pairs. If chromosomal duplication originated via a polyploid event, the original gene pairs contained varied levels of residual standing variation (polymorphism) at the onset of duplication. Standing variation contributes to variation in genetic distances among gene pairs (discussed in Gaut and Doebley 1997
). However, for highly diverged gene pairs like those we have studied here, the level of standing variation at the time of origin should be very low relative to the total divergence between paralogs. Hence the origin effect probably contributes little to the observed variance in substitution rates among the 242 arabidopsis gene pairs.
Physical location is a third potential contributor to rate variation among genes. For example, Williams and Hurst (2000)
found that nonsynonymous substitution rates between mouse and rat vary as a function of genome location. Similarly, Matassi, Sharp, and Gautier (1999)
documented location effects on synonymous substitution rates in human-rodent comparisons. Both of these effects likely reflect variable mutation rates along chromosomes (Casane et al. 1997
; Lercher, Williams, and Hurst 2001
). We examined the relationship between physical location and nucleotide substitution rates with two methods, and neither analysis provided evidence that physical location affects nonsynonymous substitution rates. In contrast, there is a hint that physical location and synonymous substitution rates are correlated, but the effect, if present, is weak. It should be noted that previous studies focused on whole genomes, with neighboring genes up to 5 cM apart (Matassi, Sharp, and Gautier 1999
; Williams and Hurst 2000
; Lercher, Williams, and Hurst 2001
), which is roughly 10 Mb in humans (Williams and Hurst 2000
). In contrast, our entire study focused on a region of 5 Mb, with neighboring genes separated by only a few kb. If the physical scale that affects substitution rates is large, our analyses could miss location effects.
Variation in selective constraint among proteins also contributes to rate variation among gene pairs. Deviation from a molecular clock is one potential measure of variation in selective constraint between paralogs, but on the whole our analyses uncovered little deviation from clock-like evolution. With some caveats, Ka/Ks can also be used for characterizing the relative strength of evolutionary forces acting on individual gene pairs. For the 242 gene pairs in our study, Ka/Ks ranged from 0.0 to 0.70, with a mean of 0.20, suggesting extensive variation in selective constraint among gene pairs. However, Ka/Ks never exceeded 1.0 (table 1 and fig. 3 ), and there is thus no evidence that positive selection has driven divergence between any of the paralogs on the duplicated regions of chromosomes 2 and 4. Altogether, variation in selective constraint, but not positive selection, likely contributes to variation in substitution rates among arabidopsis gene pairs.
Both GC content and codon bias have been shown to correlate with nucleotide substitution rates (Moriyama and Gojobori 1992
; Alvarez-Valin et al. 1999
), and hence GC content and codon bias may also contribute to nucleotide substitution rate variation among gene pairs. We detected no correlations between GCS and Ks in arabidopsis gene pairs, but did find a negative correlation between GCS and Ka. The phenomena underlying the negative correlation are unclear, but the lack of correlation between GCS and Ks is not surprising given the relatively homogeneous base composition of the A. thaliana genome (Barakat, Matassi, and Bernardi 1998
; AGI 2000
). We also found a correlation between codon bias and Ks that is similar to that documented in other species (Sharp and Li 1987
; Moriyama and Hartl 1993
, pp. 847858; Akashi 1994b
). This correlation is consistent with strong codon bias (low ENC) limiting acceptable synonymous changes and thus retarding substitution rates.
One interesting feature of codon bias is that it often also correlates with gene expression leveli.e., highly expressed genes are more biased (Gouy and Gautier 1982
; Duret and Mouchiroud 1999
; Coghlan and Wolfe 2000
). Surprisingly, highly expressed genes also evolve with relatively low Ka (Duret and Mouchiroud 2000
). Both of these observations suggest that yet another factorgene expressioncontributes to variation in Ka and perhaps Ks among genes. To briefly explore this possibility with the arabidopsis gene pairs, we employed the database procedures of Duret and Mouchiroud (2000)
and counted the number of BLAST hits for each gene pair to an arabidopsis EST database (http://www.kazusa.or.jp/en/plant/arabi/EST/). We then regarded the number of hits as a rough approximation of the expression level of a gene pair (this approximation is particularly rough given the diverse origin and preparation methods of cDNA libraries used for EST sequencing), and compared this expression level with evolutionary rates. Based on pairwise correlations among Ka, Ks and the number of EST hits, there is a significantly negative partial correlation between expression and Ka (r = -0.172, P = 0.007) but no significant correlation between expression and Ks (r = -0.037, P = 0.57). More detailed analyses require better expression data, but this significant negative correlation is consistent with recent ideas that evolutionary rates may be either proximally or secondarily a function of gene expression (reviewed in Akashi 2001
).
| Acknowledgements |
|---|
|
|
|---|
We thank A. McLysaght, M. Tenaillon, M. Mondragon, M. Clegg, and P. Tiffin for valuable comments, and S. Muse provided valuable advice. We also thank S. Kosakovsky-Pond and Z. Yang for assistance with Hyphy and PAML, respectively. This work was funded by NSF grants DBI-9872631 and DEB-0104796. T.J.V. was supported by the USDA-ARS Center for Agricultural Bioinformatics and the Cornell Theory Center.
| Footnotes |
|---|
Brian Golding, Reviewing Editor
Address for correspondence and reprints: Brandon S. Gaut, Department of Ecology and Evolutionary Biology, 321 Steinhaus Hall, University of California, Irvine, California 92697-2525. E-mail: bgaut{at}uci.edu ![]()
Keywords: nucleotide substitution rates
positive selection
codon usage
regional mutation ![]()
| References |
|---|
|
|
|---|
AGI (Arabidopsis Genome Initiative). 2000 Analysis of the genome sequence of the flowering plant Arabidopsis thaliana Nature 408:796-815[Medline]
Akashi H., 1994a. Inferring weak selection from patterns of polymorphism and divergence at "silent" sites in Drosophila DNA Genetics 139:1067-1076[Abstract]
. 1994b. Synonymous codon usage in Droshophila melanogaster: natural selection and translational accuracy Genetics 136:927-935[Abstract]
. 1997 Codon bias evolution in Drosophila Population genetics of mutation-selection drift. Gene 205:269-278
. 2001 Gene expression and molecular evolution Curr. Opin. Genet. Dev 11:660-666[Web of Science][Medline]
Altschul S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, D. J. Lipman, 1997 Gapped BLAST and PSI-BLAST: a new generation of protein database search programs Nucleic Acids Res 25:3389-3402
Alvarez-Valin F., K. Jabbari, N. Carels, G. Bernardi, 1999 Synonymous and nonsynonymous substitutions in genes from Gramineae: intragenic correlations J. Mol. Evol 49:330-342[Web of Science][Medline]
Barakat A., G. Matassi, G. Bernardi, 1998 Distribution of genes in the genome of Arabidopsis thaliana and its implications for the genome organization of plants Proc. Natl. Acad. Sci. USA 95:10044-10049
Bird A. P., 1980 DNA methylation and the frequency of CpG in animal DNA Nucleic Acids Res 8:1499-1504
Casane D., S. Boissinot, B. H. J. Chang, L. C. Shimmin, W. H. Li, 1997 Mutation pattern variation among regions of the primate genome J. Mol. Evol 45:216-226[Web of Science][Medline]
Charlesworth D., B. Charlesworth, G. McVean, 2001 Genome sequences and evolutionary biology, a two-way interaction TREE 16:235-242
Chatfield C., 1999 The analysis of time series: an introduction Chapman and Hall, London
Clegg M. T., M. P. Cummings, M. L. Durbin, 1997 The evolution of plant nuclear genes Proc. Natl. Acad. Sci. USA 94:7791-7798
Coghlan A., K. H. Wolfe, 2000 Relationship of codon bias to mRNA concentration and protein length in Saccharomyces cerevisiae Yeast 16:1131-1145[Web of Science][Medline]
Comeron J. M., M. Aguade, 1998 An evaluation of measures of synonymous codon usage bias J. Mol. Evol 47:268-274[Web of Science][Medline]
Dayhoff M. O., R. M. Schwartz, B. C. Orcutt, 1978 A model for evolutionary change in proteins Pp. 345352 in M. O. Dayhoff, ed. Atlas of protein sequence and structure. National Biochemical Research Foundation, Washington, D.C.
dePamphilis C. W., N. D. Young, A. D. Wolfe, 1997 Evolution of plastid gene rps2 in a lineage of hemiparasitic and holoparasitic plants: many losses of photosynthesis and complex patterns of rate variation Proc. Natl. Acad. Sci. USA 94:7367-7372
Duret L., D. Mouchiroud, 1999 Expression pattern and, surprisingly, gene length shape codon usage in Caenorhabditis, Drosophila and Arabidopsis Proc. Natl. Acad. Sci. USA 96:4482-4487
. 2000 Determinants of substitution rates in mammalian genes: expression pattern affects selection intensity but not mutation rate Mol. Bio. Evol 17:68-85
Eyre-Walker A., M. Bulmer, 1995 Synonymous substitution rates in enterobacteria Genetics 140:1407-1412[Abstract]
Eyre-Walker A., B. S. Gaut, 1997 Correlated rates of synonymous site evolution among plant genomes Mol. Biol. Evol 14:455-460[Abstract]
Fennoy S. L., J. Bailey-Serres, 1993 Synonymous codon usage in Zea mays L. nuclear genes is varied by levels of C and G-ending codons Nucleic Acids Res 21:5294-5300
Friedman R., A. L. Hughes, 2001 Gene duplication and the structure of eukaryotic genomes Genome Res 11:373-381
Gaut B. S., 1998 Molecular clocks and nucleotide substitution rates in higher plants Evol. Biol 30:93-120
Gaut B. S., J. F. Doebley, 1997 DNA sequence evidence for the segmental allotetraploid origin of maize Proc. Natl. Acad. USA 94:6809-6814
Gaut B. S., S. V. Muse, W. D. Clark, M. T. Clegg, 1992 Relative rates of nucleotide substitution at the rbcL locus of monocotyledonous plants J. Mol. Evol 35:292-303[Web of Science][Medline]
Gaut B. S., A. S. Peek, B. R. Morton, M. T. Clegg, 1999 Patterns of genetic diversification within the Adh gene family in the grasses (Poaceae) Mol. Biol. Evol 16:1086-1097[Abstract]
Goldman N., Z. H. Yang, 1994 Codon-based model of nucleotide substitution for protein coding DNA sequences Mol. Biol. Evol 11:725-736[Abstract]
Gonzalez D. S., I. K. Jordan, 2000 The alpha-mannosidases: phylogeny and adaptive diversification Mol. Biol. Evol 17:292-300
Goodman M., G. W. Moore, G. Matsuda, 1975 Darwinian evolution in the genealogy of hemoglobin Nature 253:603-608[Medline]
Gouy M., C. Gautier, 1982 Codon usage in bacteria: correlation with gene expressivity Nucleic Acids Res 10:7055-7074
Gusfield D., 1997 Algorithms on strings trees and sequences Cambridge University Press, Cambridge, Mass
Hughes A. L., 1999 Adaptive evolution of genes and genomes Oxford University Press, Oxford
Kondrashov F. A., I. B. Rogozin, Y. I. Wolf, E. V. Koonin, 2002 Selection in the evolution of gene duplications Genome Biol 3:1-9
Ku H.-M., T. Vision, J. Liu, S. D. Tanksley, 2000 Comparing sequenced segments of the tomato and Arabidopsis genomes: large-scale duplication followed by selective gene loss creates a network of synteny Proc. Natl. Acad. Sci. USA 97:9121-9126
Laroche J., P. Li, L. Maggia, J. Bousquet, 1997 Molecular evolution of angiosperm mitochondrial introns and exons Proc. Natl. Acad. Sci. USA 94:5722-5727
Lercher M. J., E. J. B. Williams, L. D. Hurst, 2001 Local similarity in evolutionary rates extends over whole chromosomes in human-rodent and mouse-rat comparisons: implications for understanding the mechanistic basis of the male mutation bias Mol. Bio. Evol 18:2032-2039
Li W. H., T. Gojobori, 1983 Rapid evolution of goat and sheep globin genes following gene duplication Mol. Biol. Evol 1:94-108[Abstract]
Lin X. Y., S. S. Kaul, S. Rounsley, et al. (37 co-authors) 1999 Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana Nature 402:761-768[Medline]
Long M. Y., K. Thornton, 2001 Gene duplication and evolution Science 293:U1.
Lynch M., J. S. Conery, 2000 The evolutionary fate and consequences of duplicate genes Science 290:1151-1155
Matassi G., P. M. Sharp, C. Gautier, 1999 Chromosomal location effects on gene sequence evolution in mammals Curr. Biol 9:786-791[Web of Science][Medline]
Mayer K., C. Schuller, R. Wambutt, et al. (240 co-authors) 1999 Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana Nature 402:769-777.[Medline]
Moriyama E. N., T. Gojobori, 1992 Rates of synonymous substitution and base composition of nuclear genes in Drosophila Genetics 130:855-964[Abstract]
Moriyama E. N., D. N. Hartl, 1993 Codon usage bias and base composition of nuclear genes in Drosophila Genetics 134.
Muse S. V., B. S. Gaut, 1994 A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome Mol. Biol. Evol 11:715-724[Abstract]
Muse S. V., B. S. Weir, 1992 Testing for equality of evolutionary rates Genetics 132:269-276[Abstract]
Nei M., 1987 Molecular evolutionary genetics Columbia University Press, New York
Ohno S., 1970 Evolution by gene duplication Springer-Verlag, Heidelberg
Ohta T., M. Kimura, 1973 Model of mutation appropriate to estimate the number of electrophoretically detectable alleles in a finite population Genet. Res 22:201-204.[Web of Science][Medline]
Pál C., B. Papp, L. D. Hurst, 2001 Highly expressed genes in yeast evolve slowly Genetics 158:927-931
Sawyer S., 1989 Statistical tests for detecting gene conversion Mol. Biol. Evol 6:526-538[Abstract]
Sharp P. M., 1991 Determinants of DNA sequence divergence between Escherichia coli and Salmonella typhimurium codon usage, map position, and concerted evolution J. Mol. Evol 33:23-33[Web of Science][Medline]
Sharp P. M., W.-H. Li, 1987 The rate of synonymous substitution in enterobacterial genes is inversely related to codon usage bias Mol. Biol. Evol 4:222-230[Abstract]
Terryn N., L. Heijnen, A. De Keyser, et al. (21 co-authors) 1999 Evidence for an ancient chromosomal duplication in Arabidopsis thaliana by sequencing and analyzing a 400-kb contig at the APETALA2 locus on chromosome 4 FEBS Lett 445:237-245.[Web of Science][Medline]
Thompson J. D., D. G. Higgins, T. J. Gibson, 1994 ClustalWimproving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice Nucleic Acids Res 22:4673-4680
Ticher A., D. Graur, 1989 Nucleic acid composition, codon usage, and the rate of synonymous substitution in protein-coding genes J. Mol. Evol 28:286-298[Web of Science][Medline]
Tiffin P., W. M. Hahn, 2002 Coding sequence divergence between two closely related plant species: Arabidopsis thaliana and Brassica rapa ssp. pekinensis J. Mol. Evol 54::746-753.[Web of Science][Medline]
Tsunoyama K., M. I. Bellgard, T. Gojobori, 2001 Intragenic variation of synonymous substitution rates is caused by nonrandom muations at methylated CpG J. Mol. Evol 53:456-464[Web of Science][Medline]
Vision T. J., D. G. Brown, S. D. Tanksley, 2000 The origins of genomic duplication in the Arabidopsis genome Science 290:2114-2117
Williams E. J., L. D. Hurst, 2000 The proteins of linked genes evolve at similar rates Nature 407:900-903[Medline]
Wolfe K. H., P. M. Sharp, 1993 Mammalian gene evolution: nucleotide sequence divergence between mouse and rat J. Mol. Evol 37:441-456[Web of Science][Medline]
Wolfe K. H., P. M. Sharp, W.-H. Li, 1989a. Mutation rates differ among regions of the mammalian genome Nature 337:283-285[Medline]
. 1989b. Rates of synonymous substitution in plant nuclear genes J. Mol. Evol 29:208-211
Wright F., 1990 The effective number of codons' used in a gene Gene 87:23-29[Web of Science][Medline]
Yang Z., 1997 PAML: a program package for phylogenetic analysis by maximum likelihood CABIOS 13:555-556
. 1998 Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution Mol. Bio. Evol 15:568-573[Abstract]
Zeng L.-W., J. M. Comeron, B. Chen, M. Kreitman, 1998 The molecular clock revisited: the rate of synonymous versus replacement change in Drosophila Genetica 102/103:369-382
Zhang L. Q., B. S. Gaut, T. J. Vision, 2001a. Gene duplication and evolution Science 293:U1-U2
Zhang L. Q., S. Kosakovsky-Pond, B. S. Gaut, 2001b. A survey of the molecular evolutionary dynamics of twenty-five multigene families from four grass taxa J. Mol. Evol 52:144-156[Web of Science][Medline]
![]()
CiteULike
Connotea
Del.icio.us What's this?
This article has been cited by other articles:
![]() |
X. Wang, H. Tang, J. E. Bowers, and A. H. Paterson Comparative inference of illegitimate recombination between rice and sorghum duplicated genes produced by polyploidization Genome Res., June 1, 2009; 19(6): 1026 - 1032. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. V. Han, J. P. Demuth, C. L. McGrath, C. Casola, and M. W. Hahn Adaptive evolution of young gene duplicates in mammals Genome Res., May 1, 2009; 19(5): 859 - 867. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Throude, S. Bolot, M. Bosio, C. Pont, X. Sarda, U. M. Quraishi, F. Bourgis, P. Lessard, P. Rogowsky, A. Ghesquiere, et al. Structure and expression analysis of rice paleo duplications Nucleic Acids Res., March 1, 2009; 37(4): 1248 - 1259. [Abstract] [Full Text] [PDF] |
||||
![]() |
R. W. Innes, C. Ameline-Torregrosa, T. Ashfield, E. Cannon, S. B. Cannon, B. Chacko, N. W.G. Chen, A. Couloux, A. Dalwani, R. Denny, et al. Differential Accumulation of Retroelements and Diversification of NB-LRR Disease Resistance Genes in Duplicated Regions following Polyploidy in the Ancestor of Soybean Plant Physiology, December 1, 2008; 148(4): 1740 - 1759. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. M. Tobias, G. Sarath, P. Twigg, E. Lindquist, J. Pangilinan, B. W. Penning, K. Barry, M. C. McCann, N. C. Carpita, and G. R. Lazo Comparative Genomics in Switchgrass Using 61,585 High-Quality Expressed Sequence Tags The Plant Genome, November 1, 2008; 1(2): 111 - 124. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y. Wang, A. Diehl, F. Wu, J. Vrebalov, J. Giovannoni, A. Siepel, and S. D. Tanksley Sequencing and Comparative Analysis of a Conserved Syntenic Segment in the Solanaceae Genetics, September 1, 2008; 180(1): 391 - 408. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. Yi, J. Wen, A. Golan-Goldhirsh, and D. E. Parfitt Phylogenetics and reticulate evolution in Pistacia (Anacardiaceae) Am. J. Botany, February 1, 2008; 95(2): 241 - 251. [Abstract] [Full Text] [PDF] |
||||
![]() |
X. Wang, H. Tang, J. E. Bowers, F. A. Feltus, and A. H. Paterson Extensive Concerted Evolution of Rice Paralogs and the Road to Regaining Independence Genetics, November 1, 2007; 177(3): 1753 - 1763. [Abstract] [Full Text] [PDF] |
||||
![]() |
E. W. Ganko, B. C. Meyers, and T. J. Vision Divergence in Expression between Duplicated Genes in Arabidopsis Mol. Biol. Evol., October 1, 2007; 24(10): 2298 - 2309. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. P. Byrne and K. H. Wolfe Consistent Patterns of Rate Asymmetry and Gene Loss Indicate Widespread Neofunctionalization of Yeast Genes After Whole-Genome Duplication Genetics, March 1, 2007; 175(3): 1341 - 1350. [Abstract] [Full Text] [PDF] |
||||
![]() |
R. D. Morin, E. Chang, A. Petrescu, N. Liao, M. Griffith, R. Kirkpatrick, Y. S. Butterfield, A. C. Young, J. Stott, S. Barber, et al. Sequencing and analysis of 10,967 full-length cDNA clones from Xenopus laevis and Xenopus tropicalis reveals post-tetraploidization transcriptome remodeling Genome Res., June 1, 2006; 16(6): 796 - 803. [Abstract] [Full Text] [PDF] |
||||
![]() |
D. R. Hoen, K. C. Park, N. Elrouby, Z. Yu, N. Mohabir, R. K. Cowan, and T. E. Bureau Transposon-Mediated Expansion and Diversification of a Family of ULP-like Genes Mol. Biol. Evol., June 1, 2006; 23(6): 1254 - 1268. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. A. Chapman, J. E. Bowers, F. A. Feltus, and A. H. Paterson Buffering of crucial functions by paleologous duplicated genes may contribute cyclicality to angiosperm genome duplication PNAS, February 21, 2006; 103(8): 2730 - 2735. [Abstract] [Full Text] [PDF] |
||||
![]() |
G. M. Wilson, S. Flibotte, P. I. Missirlis, M. A. Marra, S. Jones, K. Thornton, A. G. Clark, and R. A. Holt Identification by full-coverage array CGH of human DNA copy number increases relative to chimpanzee and gorilla Genome Res., February 1, 2006; 16(2): 173 - 181. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. Y. Huang, N. Grunheit, N. Ahmadinejad, J. N. Timmis, and W. Martin Mutational Decay and Age of Chloroplast and Mitochondrial Genomes Transferred Recently to Angiosperm Nuclear Chromosomes Plant Physiology, July 1, 2005; 138(3): 1723 - 1733. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. E. Pfeil, J. A. Schlueter, R. C. Shoemaker, and J. J. Doyle Placing Paleopolyploidy in Relation to Taxon Divergence: A Phylogenetic Analysis in Legumes Using 39 Gene Families Syst Biol, June 1, 2005; 54(3): 441 - 454. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. C. Spady, O. Seehausen, E. R. Loew, R. C. Jordan, T. D. Kocher, and K. L. Carleton Adaptive Molecular Evolution in the Opsin Genes of Rapidly Speciating Cichlid Species Mol. Biol. Evol., June 1, 2005; 22(6): 1412 - 1422. [Abstract] [Full Text] [PDF] |
||||
![]() |
R. C. Moore, S. R. Grant, and M. D. Purugganan Molecular Population Genetics of Redundant Floral-Regulatory Genes in Arabidopsis thaliana Mol. Biol. Evol., January 1, 2005; 22(1): 91 - 103. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. Guillet-Claude, N. Isabel, B. Pelgas, and J. Bousquet The Evolutionary Implications of knox-I Gene Duplications in Conifers: Correlated Evidence from Phylogeny, Gene Mapping, and Analysis of Functional Divergence Mol. Biol. Evol., December 1, 2004; 21(12): 2232 - 2245. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. I. Wright, C. B. K. Yau, M. Looseley, and B. C. Meyers Effects of Gene Expression on Molecular Evolution in Arabidopsis thaliana and Arabidopsis lyrata Mol. Biol. Evol., September 1, 2004; 21(9): 1719 - 1726. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Ma and J. L. Bennetzen Inaugural Article: Rapid recent growth and divergence of rice nuclear genomes PNAS, August 24, 2004; 101(34): 12404 - 12410. [Abstract] [Full Text] [PDF] |
||||
![]() |
N. Takebayashi, E. Newbigin, and M. K. Uyenoyama Maximum-Likelihood Estimation of Rates of Recombination Within Mating-Type Regions Genetics, August 1, 2004; 167(4): 2097 - 2109. [Abstract] [Full Text] [PDF] |
||||
![]() |
G. Blanc and K. H. Wolfe Functional Divergence of Duplicated Genes Formed by Polyploidy during Arabidopsis Evolution PLANT CELL, July 1, 2004; 16(7): 1679 - 1691. [Abstract] [Full Text] [PDF] |
||||
![]() |
G. Blanc and K. H. Wolfe Widespread Paleopolyploidy in Model Plant Species Inferred from Age Distributions of Duplicate Genes PLANT CELL, July 1, 2004; 16(7): 1667 - 1678. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. I. Tenaillon, J. U'Ren, O. Tenaillon, and B. S. Gaut Selection Versus Demography: A Multilocus Investigation of the Domestication Process in Maize Mol. Biol. Evol., July 1, 2004; 21(7): 1214 - 1225. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y. Lu and M. D. Rausher Evolutionary Rate Variation in Anthocyanin Pathway Genes Mol. Biol. Evol., November 1, 2003; 20(11): 1844 - 1853. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. Vandepoele, C. Simillion, and Y. Van de Peer Evidence That Rice and Other Cereals Are Ancient Aneuploids PLANT CELL, September 1, 2003; 15(9): 2192 - 2202. [Abstract] [Full Text] [PDF] |
||||
![]() |
D. S. Senchina, I. Alvarez, R. C. Cronn, B. Liu, J. Rong, R. D. Noyes, A. H. Paterson, R. A. Wing, T. A. Wilkins, and J. F. Wendel Rate Variation Among Nuclear Genes and the Age of Polyploidy in Gossypium Mol. Biol. Evol., April 1, 2003; 20(4): 633 - 643. [Abstract] [Full Text] [PDF] |
||||
![]() |
G. Blanc, K. Hokamp, and K. H. Wolfe A Recent Polyploidy Superimposed on Older Large-Scale Duplications in the Arabidopsis Genome Genome Res., February 1, 2003; 13(2): 137 - 144. [Abstract] [Full Text] [PDF] |
||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||














