MBE Advance Access originally published online on March 1, 2006
Molecular Biology and Evolution 2006 23(5):1068-1075; doi:10.1093/molbev/msj115
Research Article
Correlated Asymmetry of Sequence and Functional Divergence Between Duplicate Proteins of Saccharomyces cerevisiae
School of Biology, Georgia Institute of Technology
E-mail: soojinyi{at}gatech.edu.
| Abstract |
|---|
The role of sequence divergence in functional divergence of duplicate genes is a topic of great interest. In this study, we compare the numbers of amino acid substitutions in each sequence since two yeast duplicates diverged, using a preduplication ancestral outgroup. Using this strategy, we explored the relationship between sequence divergence and functional divergence between duplicate partners. We show that the degree of relative functional asymmetry between duplicate proteins is proportional to the relative sequence divergence between them. Furthermore, of the two duplicates, the copy closer to their ancestral sequence (the one with fewer amino acid substitutions) interacts with more proteins and affects fitness more severely when deleted. Therefore, asymmetric sequence divergence between duplicates is correlated with asymmetric functional divergence and may underlie the duplicate's role in genetic robustness against mutations. Among the functional traits considered, protein abundance appears to have the strongest correlation with the nonsynonymous divergence between duplicates. Taken together with the results from whole-genome analyses, our results indicate that within-species duplicates are subject to the same evolutionary force that acts on interspecific sequence and functional divergence. In particular, we detect signs of purifying selection on the more slowly evolving duplicate.
Key Words: Saccharomyces cerevisiae; whole-genome duplication; functional divergence; sequence divergence; gene duplication
| Introduction |
|---|
Gene duplication is essential in shaping genomic architectures in diverse taxa (Ohno 1970).
Duplicate genes also tend to diverge asymmetrically in their functions (Wagner 2002). What drives such asymmetric functional evolution of duplicates? One obvious candidate is natural selection, as evidenced by its effect on sequence evolution. If so, asymmetric divergence of sequence and function of duplicates should be correlated. In addition, the more slowly evolving duplicate may show signs of purifying selection, while the faster evolving copy may show signs of either relaxed selective constraint or positive selection, in both sequence and functional evolution.
Detecting effects of natural selection on functional evolution is a topic of great interest. Although there remain many issues to be resolved, several interesting findings have emerged from large-scale analyses. First, the number of interactions of a protein in a protein-protein interaction network (Fraser et al. 2002) as well as measures of centrality (Hahn and Kern 2005) affects its evolutionary rate negatively. This observation is taken as evidence of strong purifying selection on more interactive proteins. However, bias in some protein-protein interaction assays may inflate this correlation (Bloom and Adami 2003). Certainly, this relationship appears to be weaker than proposed earlier (Jordan, Wolf, and Koonin 2003; Hahn, Conant, and Wagner 2004; Lemos et al. 2005). Second, proteins that have less severe fitness effects when deleted (more dispensable) tend to evolve faster than proteins that are less dispensable (Yang, Gu, and Li 2003; Zhang and He 2005). This is again taken as evidence of purifying selection because less dispensable (and hence more important) proteins are under stronger selective constraint. However, this effect appears to act only on short evolutionary timescales and may disappear on longer ones (Zhang and He 2005). Third, among the factors known to affect the evolutionary rate of a protein, the expression level appears to be a major determinant in diverse taxa, which is also considered to indicate the role of natural selection (Duret and Mouchiroud 2000; Pál, Papp, and Hurst 2001; Nuzhdin et al. 2004; Subramanian and Kumar 2004; Drummond et al. 2005; Lemos et al. 2005; Drummond, Raval, and Wilke 2006).
Therefore, if asymmetric sequence divergence is driven by natural selection, the more slowly evolving duplicate should be more central in a protein-protein interaction network, less dispensable, and also more highly expressed. To test these predictions, one needs to distinguish slowly evolving duplicates from fast evolving ones and relate their sequence evolution to functional divergence. Conant and Wagner (2003) analyzed the sequence and functional data available at that time but did not detect a significant correlation between asymmetric sequence divergence and functional divergence. The limiting factor in their study was the requirement of a preduplication outgroup for each pair of duplicates and their functional characteristics (Conant and Wagner 2003).
Much more genomic and functional data have become available now. In particular, it has been well established that Saccharomyces cerevisiae has experienced an ancient genome duplication, which generated many protein duplicates of the same age (Wolfe and Shields 1997; Kellis, Birren, and Lander 2004). Here, we took advantage of the fact that for a large number of duplicate protein pairs in S. cerevisiae, the corresponding preduplication single-copy gene is available in Kluyveromyces waltii (Kellis, Birren, and Lander 2004; fig. 1). Using the single-copy gene in K. waltii as the outgroup, we can quantify the relative asymmetry in sequence divergence between the two copies. Furthermore, we can infer which copy of the two duplicates has accumulated more amino acid substitutions (fig. 1). Using this information and the functional information from whole-genome scale experiments, we explored asymmetric sequence divergence between duplicate genes and the relationship between sequence divergence and functional divergence of duplicate proteins.
| Materials and Methods |
|---|
Yeast Gene Trios
Kellis, Birren, and Lander (2004)
Yeast Whole-Genome Comparison
Orthologs between K. waltii and S. cerevisiae were defined using the OrthoMCL algorithm (Li, Stoeckert, and Roos 2003) with a P value cutoff of $10^{-10}$ and other default parameters. Briefly, OrthoMCL is a genome-scale algorithm that clusters orthologous protein sequences by producing groups shared by two or more species (or genomes) and representing species-specific gene expansion families. It first looks for reciprocal best hits from all-to-all BlastP results within each genome and reciprocal better hits across any two genomes. Related proteins are then interlinked in a similarity graph, and the Markov clustering algorithm (Enright, Van Dongen, and Ouzounis 2002) is used to split mega clusters. We chose singletons only (identified as two taxa, two gene classes from the OrthoMCL) for our whole-genome comparison. As above, we removed ribosomal proteins from our analyses. A total of 1,571 orthologous gene pairs were analyzed. None of the WGD duplicates were included in this set.
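The reciprocal-best-hit idea mentioned above can be illustrated with a minimal sketch. This is not OrthoMCL itself: it omits the within-genome step, the "reciprocal better hit" rule, and the Markov clustering, and the function name, gene identifiers, and scores below are hypothetical.

```python
def reciprocal_best_hits(hits_a_to_b, hits_b_to_a):
    """Return sorted (a, b) pairs in which a and b are each other's best hit.

    Each argument maps a query protein to a dict {subject protein: score},
    where a higher score means a better hit (an assumption of this sketch).
    """
    best_ab = {a: max(s, key=s.get) for a, s in hits_a_to_b.items() if s}
    best_ba = {b: max(s, key=s.get) for b, s in hits_b_to_a.items() if s}
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical similarity scores between two genomes.
kw_to_sc = {
    "kw1": {"scA": 250.0, "scB": 90.0},
    "kw2": {"scB": 120.0, "scC": 110.0},
}
sc_to_kw = {
    "scA": {"kw1": 250.0},
    "scB": {"kw2": 120.0, "kw1": 90.0},
    "scC": {"kw2": 110.0},
}
pairs = reciprocal_best_hits(kw_to_sc, sc_to_kw)
```

Here `scC` is left unpaired because its best hit, `kw2`, prefers `scB`.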
Functional Data
Protein-protein interaction networks for S. cerevisiae were downloaded from the General Repository for Interaction Datasets, which archives and displays physical, genetic, and functional interactions (Breitkreutz, Stark, and Tyers 2003; Hahn and Kern 2005). Its protein-protein interaction data were obtained from various functional genomic methods including but not limited to affinity precipitation, affinity chromatography, yeast two-hybrid analysis, biochemical assay, and synthetic lethality. We converted these data to undirected networks and removed all 300 self-interactions. All network statistics were calculated using the Pajek software package (Batagelj and Mrvar 2003). We obtained the absolute abundance data from an S. cerevisiae fusion library, where the absolute abundance of each open reading frame in its natural chromosomal location is visualized by immunodetection (Ghaemmaghami et al. 2003). This measure is an excellent indicator of expression level (Ghaemmaghami et al. 2003). For fitness data, we used the minimum fitness value as defined by Gu et al. (2003) from single-gene deletion experiments (Steinmetz et al. 2002). We arbitrarily chose the time course 2 data set. We had 144 and 202 duplicate proteins for analyses of the abundance and the fitness effect, respectively.
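As an illustration of the network preprocessing described above (undirected edges, self-interactions removed), the following sketch computes the connectivity of each protein. The analyses in this study used Pajek; the function and protein names here are hypothetical and serve only to show the idea.

```python
from collections import defaultdict


def connectivity(edges):
    """Degree of each node after dropping self-interactions and treating
    (a, b) and (b, a) as the same undirected edge."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:  # skip self-interactions
            continue
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(nbrs) for node, nbrs in neighbors.items()}


# Hypothetical interaction list with a duplicate edge and a self-interaction.
edges = [("P1", "P2"), ("P2", "P1"), ("P2", "P3"), ("P3", "P3")]
degree = connectivity(edges)
```

A protein whose only recorded interaction is with itself would not appear in the result at all, so callers may want to treat missing keys as connectivity 0.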
Relative Sequence Divergence
Using the pairwise nonsynonymous divergence (dN) among the yeast gene trios, we define the relative nonsynonymous divergence

$$\mathrm{RN} = \frac{|dN_{12} - dN_{13}|}{dN_{23}},$$

where the outgroup protein in K. waltii is assigned as sequence 1, and the two S. cerevisiae proteins in a duplicate pair are assigned arbitrarily as sequences 2 and 3 (fig. 1). RN quantifies the relative difference in the number of amino acid substitutions that occurred on each of the two branches within a duplicate pair. If we divide the number of nonsynonymous substitutions to include only those that occurred after the duplication for each lineage (referred to as dN2 and dN3 for each duplicate), then the RN statistic is exactly the same as

$$\frac{|dN_{2} - dN_{3}|}{dN_{2} + dN_{3}}$$

because

$$dN_{12} - dN_{13} = dN_{2} - dN_{3}$$

and

$$dN_{23} = dN_{2} + dN_{3}.$$

RN can be greater than 1 because dN23 is occasionally less than |dN12 − dN13|.
To reduce errors in estimation, we used gene trios with dN < 1.0 in any pairwise comparison.
To further analyze which duplicate gene has accumulated more nonsynonymous substitutions, we used the signed relative nonsynonymous divergence,

$$\mathrm{SRN} = \frac{dN_{12} - dN_{13}}{dN_{23}}.$$

SRN preserves the properties of the above measure RN and gives information on asymmetry between a pair. For the same reason as discussed above, SRN can be less than −1 or more than 1. Note that if both the numerator and the denominator of SRN are zero, SRN needs to be defined as 0/0 = 0. SRN is approximately normally distributed in our data (Fig. S1.A, Supplementary Material online).
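The two divergence measures can be sketched in a few lines. This sketch assumes that RN is the absolute difference between dN12 and dN13 divided by dN23, and that SRN is the same ratio without the absolute value, which matches the description in this section; the function names and the input values are hypothetical.

```python
def signed_relative_divergence(dn12, dn13, dn23):
    """SRN = (dN12 - dN13) / dN23, with 0/0 defined as 0.

    Sequence 1 is the outgroup; sequences 2 and 3 are the duplicates.
    A positive value means copy 2 has accumulated more substitutions.
    """
    numerator = dn12 - dn13
    if numerator == 0 and dn23 == 0:
        return 0.0
    return numerator / dn23


def relative_divergence(dn12, dn13, dn23):
    """RN, the unsigned counterpart of SRN."""
    return abs(signed_relative_divergence(dn12, dn13, dn23))


# Hypothetical pairwise dN estimates for one gene trio.
srn = signed_relative_divergence(0.30, 0.20, 0.25)
rn = relative_divergence(0.20, 0.30, 0.25)
```

Because the three pairwise distances are estimated separately, the ratio is not bounded by 1: for example, dN12 = 0.5, dN13 = 0.1, and dN23 = 0.3 give a value above 1.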
Relative Functional Divergence
Functional data are currently unavailable for K. waltii (the outgroup). Therefore, we used a modified Canberra distance (Johnson and Wichern 2002, p. 671) to quantify relative functional divergence. Specifically, for any functional measure X, we used

$$\frac{|X_{2} - X_{3}|}{X_{2} + X_{3}}$$

to quantify the relative functional divergence between the two copies (sequences 2 and 3 as defined above) of a duplicate gene pair. These are explained in detail for each functional asymmetry analysis (see Results). Similarly, one can use dN12 + dN13 as the denominator for the RN measure. However, the above RN measure is superior because we can consider dN23 as (dN12 + dN13 − r) (r > 0, a random variable). Nevertheless, our results did not change if dN12 + dN13 is used instead of dN23 (see Supplementary Material online). As expected, the RN measures calculated using dN23 or dN12 + dN13 are significantly correlated with each other (Kendall's correlation coefficient [τ] = 0.8393, P ≈ 0).
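A Canberra-style relative difference for a single functional measure can be sketched as follows. The sketch assumes the form |X2 − X3| / (X2 + X3) for nonnegative measurements, with a signed variant that keeps the direction of the difference; treating a pair with X2 = X3 = 0 as having zero asymmetry is an assumption of this sketch, and the names and values are hypothetical.

```python
def signed_relative_asymmetry(x2, x3):
    """(X2 - X3) / (X2 + X3) for nonnegative measurements X2 and X3.

    Returns 0.0 when both values are zero (an assumption of this sketch).
    A positive value means copy 2 has the larger measurement.
    """
    total = x2 + x3
    if total == 0:
        return 0.0
    return (x2 - x3) / total


def relative_asymmetry(x2, x3):
    """Unsigned version: |X2 - X3| / (X2 + X3)."""
    return abs(signed_relative_asymmetry(x2, x3))


# Hypothetical connectivities of the two copies in a duplicate pair.
srk = signed_relative_asymmetry(30, 10)
rk = relative_asymmetry(10, 30)
```

For nonnegative inputs the signed value always lies between −1 and 1, unlike the sequence-based measure, whose denominator is estimated separately from its numerator.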
| Results |
|---|
Functional and Sequence Evolution in the Whole-Genome Comparison
We first examined the relationship between sequence evolution and functional evolution between orthologous gene pairs of S. cerevisiae and K. waltii. To measure the independent effect of each variable after the effects of other variables are removed (Whittaker 1996
p) (Gibbons and Chakraborti 2003
p = 0), using a normal approximation (Sokal and Rohlf 1995). These analyses revealed several interesting findings (table 1). First, centrality in the protein-protein interaction network is weakly correlated with sequence divergence in the whole-genome comparison. However, this relationship was not significant when the effects of other variables were taken into account. Second, fitness effect also had a significant effect on sequence evolution, independent of other factors considered. Third, the centrality and the fitness effect of a protein are significantly negatively correlated, independent of other effects. Fourth, protein abundance appears to be the strongest determinant of sequence divergence in the whole-genome comparison.
Drummond et al. (2006)
Many WGD Gene Pairs Evolve Asymmetrically at Sequence Level
We investigated the proportion of duplicate pairs generated by WGD that have evolved significantly asymmetrically at the sequence level, using two different approaches. First, we derived the variance of RN. For this, we used the first-order Taylor series approximation, which provides a natural method to approximate variance through polynomials (Casella and Berger 1990, p. 328). The derivation is shown in the Supplementary Material online. Using the standard error estimated from this method, we tested the null hypothesis (RN = 0) against the alternative hypothesis (RN > 0) for each gene pair. We found that 107 out of 413 duplicate pairs show asymmetric sequence divergence (RN > 0, P < 0.0001) after correcting for multiple comparisons, using the Bonferroni method (109/449 if we include ribosomal proteins).
Second, we used a likelihood-ratio test approach to quantify the proportion of duplicate pairs with asymmetric sequence divergence at the amino acid level. We compared two models of amino acid sequence evolution, similar to those in Zhang, Gu, and Li (2003) and Conant and Wagner (2003). In the first model, the two copies of a duplicate protein are assumed to evolve at the same rate. The second model allows the two branches to have different evolutionary rates. Two times the difference in log likelihoods of these two models is compared to the chi-square distribution with 1 df. If significant, the results suggest that the two branches have evolved at unequal rates. We find that 128 pairs among the 413 duplicate pairs (after the Bonferroni correction, see above) fit the asymmetric sequence evolution model better (129 out of 449, including ribosomal proteins). This estimate is well in accord with the result from a previous analysis (Kellis, Birren, and Lander 2004). Not surprisingly, the majority (92%) of the asymmetric pairs detected by the first method overlap with the asymmetric pairs detected by the second method.
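The likelihood-ratio comparison described above can be sketched with the standard library alone, using the identity that the upper tail of a chi-square distribution with 1 df at x equals erfc(sqrt(x / 2)). The log-likelihood values and the family-wise level of 0.05 below are hypothetical; the source states only that a Bonferroni correction was applied.

```python
import math


def lrt_p_value(lnl_equal_rates, lnl_unequal_rates):
    """Likelihood-ratio test of equal versus unequal rates (1 df).

    Returns (statistic, p_value), where the statistic is twice the
    difference in log likelihoods between the nested models.
    """
    stat = max(2.0 * (lnl_unequal_rates - lnl_equal_rates), 0.0)
    # Chi-square survival function with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Hypothetical log likelihoods for one duplicate pair.
stat, p = lrt_p_value(-1012.4, -1002.4)

# Bonferroni-adjusted threshold for 413 tests at an assumed level of 0.05.
alpha_adjusted = 0.05 / 413
is_asymmetric = p < alpha_adjusted
```

With these inputs the statistic is 20, which falls well below the adjusted threshold of about 0.00012.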
Therefore, both the empirical and the likelihood-based approaches yielded a similar proportion of proteins that evolve at different rates. Namely, at least a quarter of the duplicate proteins generated by the ancient WGD in yeast show evidence of asymmetric sequence evolution. If we do not use the conservative Bonferroni correction, the proportion of asymmetric pairs increases to almost half of the duplicate pairs.
We then explored the relationship between the relative asymmetry in nonsynonymous divergence (RN) and the pairwise nonsynonymous divergence (dN between duplicates in a pair). We found that these two measures are significantly positively correlated. In other words, as the pairwise dN between the two duplicates increases, RN also increases (τ = 0.14, P < 0.01, fig. 2A). When we used dN12 + dN13 as the denominator, we observed the same pattern (τ = 0.25, P < 0.0001, Fig. S2.A, Supplementary Material online).
These observations indicate that nonsynonymous divergence between two protein duplicates was often driven by one partner accumulating disproportionately more amino acid substitutions. The log likelihood of the above comparison (in the previous section) is also significantly positively correlated with RN (τ = 0.59, P < 0.0001, fig. 2B).
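For readers unfamiliar with Kendall's rank correlation, the coefficient can be computed directly from concordant and discordant pairs. The sketch below implements the tie-corrected tau-b variant with a simple quadratic-time loop; which variant was used in this study is not stated, so that choice is an assumption, as are the example values.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length sequences (O(n^2) version)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = only_x_tied = only_y_tied = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: excluded from both totals
            if dx == 0:
                only_x_tied += 1
            elif dy == 0:
                only_y_tied += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied_x = concordant + discordant + only_y_tied  # pairs not tied in x
    untied_y = concordant + discordant + only_x_tied  # pairs not tied in y
    if untied_x == 0 or untied_y == 0:
        raise ValueError("tau-b is undefined when a variable is constant")
    return (concordant - discordant) / math.sqrt(untied_x * untied_y)


# Hypothetical paired observations (for example, dN and RN per duplicate pair).
tau = kendall_tau_b([1, 2, 3, 4], [1, 3, 2, 4])
```

In the example, five of the six pairs are concordant and one is discordant, so the coefficient is 4/6.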
Asymmetric Sequence Divergence Is Correlated with Asymmetric Functional Divergence
We analyzed three different sets of functional genomics data to determine whether sequence divergence and functional divergence between duplicate partners are correlated. For all these analyses, we not only quantified the amount of relative asymmetry in functional divergence but also determined how each copy of the duplicate pair affected the asymmetry. From here on, we will use a signed measure of RN,

$$\mathrm{SRN} = \frac{dN_{12} - dN_{13}}{dN_{23}}.$$

This measure includes information about which copy of the duplicate pair is more derived; a positive value of SRN means copy 2 has accumulated more amino acid substitutions.
We first analyzed the numbers of functional interactions of duplicate proteins, taking the yeast protein interaction network as a reasonable statistical estimate of the real functional network (Wagner 2000, 2001). In the yeast protein interaction network, each gene can be represented as a node, and the number of interactions of a particular protein can be expressed as the number of edges (connectivity). We calculated the signed relative asymmetry of connectivity within a duplicate pair, as

$$\mathrm{SRK} = \frac{K_{2} - K_{3}}{K_{2} + K_{3}}$$

for the sequences 2 and 3 as designated earlier, where Ki is the connectivity of sequence i. If the relative sequence asymmetry between duplicate proteins is correlated with asymmetry in the number of functional interactions they have, then we expect SRN and SRK to be significantly correlated. The sign of the correlation coefficient will then inform us of the nature of the correlation.
We find that SRN and SRK are significantly negatively correlated (τ = −0.17, P < 0.001; fig. 3A and table 2). This result shows that duplicate proteins with similar sequences tend to have similar numbers of interactions, and as two proteins diverge further, the difference in connectivities also increases. Furthermore, the negative correlation shows that it is the duplicate with fewer amino acid substitutions that tends to have more protein interactions. The more derived proteins tend to lose protein interactions.
It has been shown that some methods to measure protein-protein interactions are biased toward counting more interactions for highly expressed proteins. Because highly expressed proteins evolve more slowly, such bias will generate spurious negative correlation between the number of interactions and evolutionary rates (Bloom and Adami 2003
= 0.13, P < 0.05).
Two other measures of centrality of a protein in a network, namely closeness and betweenness, may be biologically more meaningful (Hahn and Kern 2005). Briefly, betweenness and closeness are two-dimensional measures of a protein's centrality in a network, as opposed to the connectivity, which is a one-dimensional metric of importance in the network (Wasserman and Faust 1994, p. 178; Hahn and Kern 2005). We found that both the signed relative closeness (SRC) and the signed relative betweenness (SRB) are significantly negatively correlated with the SRN (τ = −0.10, −0.13, P = 0.02, P = 0.005, respectively). The negative coefficients indicate that the copy closer to the ancestral sequence tends to have a more central role in the protein interaction network, in accordance with the above results. Interestingly, the correlation is stronger when the raw number of interactions is used (SRK) than when the betweenness and closeness (SRB or SRC) are used. However, none of these relationships were significant when the effects of other functional measures were considered.
Next, we analyzed the levels of protein abundance in yeast (Ghaemmaghami et al. 2003). All yeast open reading frames have been individually tagged, and the absolute abundance levels have been measured by immunodetection methods, providing a glimpse of the posttranscriptional yeast proteome (Ghaemmaghami et al. 2003). We calculated the signed relative asymmetry in protein abundance,

$$\mathrm{SRA} = \frac{A_{2} - A_{3}}{A_{2} + A_{3}},$$

following the same sequence designation as above, where Ai is the abundance level of sequence i. Positive SRA means copy 2 is more abundant, and negative SRA means copy 3 is more abundant.
We found that protein duplicate pairs with greater relative asymmetry in sequence divergence also tend to have greater asymmetry in the abundance levels, and the sign of the correlation is negative (τ = −0.22, P < 0.001, fig. 3B and table 2). Therefore, two partners in a duplicate pair tend to maintain similar levels of abundance if their sequences are similar. When their sequences diverge from each other, they also tend to have different abundance levels. The more derived duplicate is less abundant.
As for the effect of sequence divergence on each gene's contribution to the fitness of the whole organism, we analyzed fitness measures of strains with corresponding single-gene deletion ("fitness effects"; Steinmetz et al. 2002). If a gene shows severe reduction in fitness when removed, this value decreases. If two duplicates have maintained similar sequences, then removal of one duplicate of such a pair may not cause a severe fitness effect because the other copy can compensate for the loss. In accord with this prediction, a recent analysis of fitness effects of duplicates and singletons in the yeast genome has shown that there is a significantly higher probability of functional compensation for a duplicate gene than for a singleton, and the frequency of compensation decreases as the sequence divergence between duplicates increases (Gu et al. 2003).
Here, we quantified each gene's contribution to pairwise fitness differences between duplicates, relative to their sequence divergence. For this, we defined a measure of relative fitness asymmetry in each duplicate pair as

$$\mathrm{SRF} = \frac{F_{2} - F_{3}}{F_{2} + F_{3}},$$

using the minimum fitness as defined previously (Gu et al. 2003). If the deletion of copy 2 has a more severe effect on fitness than that of copy 3, SRF will be negative (because F2 is more reduced, and hence less than F3). When a pair of duplicates have similar sequences (more symmetric, hence small SRN), then removal of one duplicate can be compensated by the presence of the other copy, both F2 and F3 are close to 1, and SRF will be close to zero. On the other hand, in an asymmetric duplicate pair (SRN large), they can no longer compensate for each other's loss, and the fitness effect of a gene is determined solely by its importance in the organismal fitness. If the copy that has fewer amino acid substitutions affects fitness more severely when deleted, then SRF and SRN should be positively correlated.
We found that among the 217 duplicate gene pairs for which fitness data were available, SRF and SRN were significantly positively correlated (τ = 0.18, P < 0.001, fig. 3C and table 2). A gene that has accumulated more amino acid substitutions since the duplication tends to exhibit a less severe fitness effect when deleted. Thus, sequence divergence may also underlie the duplicate proteins' role in genetic robustness against deleterious mutations.
Fitness effect and pairwise sequence divergence (dN) are correlated (Gu et al. 2003; also, see table 2). Because our measure of asymmetry (RN) is also significantly correlated with dN, it is possible that the significant correlation between SRN and SRF is due to the correlation between dN and F. However, the latter correlation is much weaker than the former and not significant (τ = 0.09, see table 2). Therefore, the observed correlation is not an artifact caused by the significant correlation between dN and fitness effect.
Different measures of functional asymmetry between duplicate proteins are significantly correlated with each other (table 2). This indicates that these measures are affected by other common factors. Hence, we further examined independent relationships between SRN and functional asymmetry by means of partial correlation analyses (table 2). We found that when the effects of other functional measures are eliminated, SRN is only significantly correlated with the SRA. Therefore, asymmetric evolution of the levels of protein abundance has the strongest effect on asymmetric sequence evolution of duplicate proteins.
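The idea behind a partial rank correlation can be illustrated with the standard first-order formula, which removes the effect of one control variable from the correlation between two others. This is only a sketch: the analyses here control for several functional measures at once, the exact procedure is not reproduced, and the coefficient values below are hypothetical.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z.

    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))
    The same expression is used for partial Kendall coefficients.
    """
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("undefined when x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical pairwise coefficients among three asymmetry measures.
partial = partial_correlation(0.5, 0.5, 0.5)
```

In this example a pairwise coefficient of 0.5 shrinks to one third once the shared dependence on the third variable is removed, which is the kind of attenuation that can make a pairwise correlation lose significance.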
| Discussion |
|---|
In this study, we investigated whether asymmetric sequence divergence is correlated with asymmetric functional divergence. Wagner (2002)
Namely, within each duplicate pair, the more slowly evolving partner tends to have more interactions in the protein-protein interaction network, be more abundant in yeast cells, and have a greater fitness effect when deleted from the genome (less dispensable). This trend is in exact parallel to the results from recent whole-genome analyses of the relationship between sequence and functional divergence, as discussed in the Introduction (e.g., Duret and Mouchiroud 2000; Pál, Papp, and Hurst 2001; Jordan, Wolf, and Koonin 2003; Yang, Gu, and Li 2003; Nuzhdin et al. 2004; Subramanian and Kumar 2004; Drummond et al. 2005; Hahn and Kern 2005; Lemos et al. 2005; Zhang and He 2005; Drummond, Raval, and Wilke 2006). In fact, we ascertained the general relationship between sequence and functional evolution in our analysis of the whole-genome orthologs between S. cerevisiae and K. waltii (table 1); we observe weak relationships in the expected direction between pairwise sequence divergence and the number of interactions in a protein-protein interaction network and protein dispensability. Protein abundance (an indicator of expression level; Ghaemmaghami et al. 2003) appears to be the strongest determinant of pairwise sequence divergence between genes in the K. waltii genome and in the S. cerevisiae genome.
While pairwise correlation analyses of duplicates are generally consistent with the results from whole-genome analyses, most of them were weak and not significant (table 2). In contrast, correlations between asymmetry measures (SR measures) were generally stronger and significant (table 2). This implies that similar evolutionary forces that determine interspecific sequence and functional evolution underlie the observed asymmetric evolution of duplicate proteins. Also, the "relative" divergence of two duplicates is in general a better indicator of functional divergence than pairwise divergence. Therefore, in addition to a pairwise correlation analysis, it will be useful to consider relative asymmetry of sequence divergence when inferring functional divergence of duplicate proteins.
What drives the (correlated) asymmetric sequence and functional evolution? Asymmetric sequence divergence may be driven by purifying selection on the more slowly evolving duplicate, in conjunction with relaxed functional constraint and/or positive selection on the faster evolving duplicate (Conant and Wagner 2003; Zhang, Gu, and Li 2003). The slow evolution of more connected, more important proteins is taken as evidence of strong purifying selection (Fraser et al. 2002; Krylov et al. 2003; Yang, Gu, and Li 2003; Hahn and Kern 2005; Zhang and He 2005). Several recent studies have shown that duplicate genes in general tend to evolve more slowly (Jordan, Wolf, and Koonin 2004; Davis and Petrov 2004), indicating strong purifying selection on duplicate proteins, presumably to conserve their "ancestral" sequence and function. In our data, the correlation between the relative asymmetry in sequence divergence (SRN) and the relative asymmetry in selective constraint (measured as dN/dS) is significant and positive (τ = 0.5721, P ≈ 0). These observations support the conclusion that the more slowly evolving duplicates are under stronger purifying selection, while the faster evolving duplicates may experience relaxed selective constraint.
We measured functional divergence as a one-dimensional variable, in terms of the number of interactions, dispensability measures, and abundance measures. In reality, functional differentiations of duplicate genes are most likely to occur in temporal and spatial variation (e.g., Makova and Li 2003). Therefore, a more detailed analysis utilizing spatial and temporal differentiation of functions of duplicates will shed further light on the mechanism of duplicate gene evolution. In addition, the correlation between asymmetric sequence divergence and asymmetric functional divergence typically explains only about 20% of total variation between the two variables (table 2). Therefore, other factors, such as divergence of regulatory sequences, may be more important in determining duplicate gene evolution (Conant and Wagner 2003). In this respect, it is notable that protein abundance (an indicator of protein expression; Ghaemmaghami et al. 2003) repeatedly appears as the strongest determinant of protein sequence evolution in whole-genome comparisons in yeast (this study, and also see Drummond, Raval, and Wilke 2006), in Drosophila (Lemos et al. 2005), and in yeast WGD duplicates (this study). Interestingly, the correlation between SRN and SRA was weaker than the normal correlation between dN and A (table 2). This was the only case where the correlation between asymmetry measures was weaker than the pairwise correlation.
Expression patterns diverge rapidly soon after duplication (Gu et al. 2002; Makova and Li 2003; Gu, Zhang, and Huang 2005; Li, Yang, and Gu 2005) and after genome duplications (Adams and Wendel 2005). Such rapid expression divergence after duplication may not be triggered by protein sequence divergence. Rather, some other factors, such as divergence of cis- and trans-regulatory sequences, may alter expression patterns of duplicate genes (e.g., Li, Yang, and Gu 2005). In turn, different expression patterns impose differential natural selection on protein-coding sequences. Future analyses of the determinants of expression divergence in young duplicates may shed light on this issue. Such analyses will also address several important open questions, including the degree to which sequence divergence (coding and regulatory) governs the functional divergence between duplicate proteins and the relationship between different aspects of functional evolution for singletons and duplicates.
| Supplementary Material |
|---|
Supplementary Fig. S1.A and S2.A and other supplementary materials are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
| Acknowledgements |
|---|
Comments on the manuscript by Wen-Hsiung Li, Esther Bétran, Navin Elango, Daniel Promislow, Nathan Bowen, and two anonymous reviewers are greatly appreciated. We also thank various comparative and functional genomics groups of S. cerevisiae who provided indispensable scientific resources. This study is supported by the funds from the Georgia Institute of Technology to S.V.Y.
| Footnotes |
|---|
Kenneth Wolfe, Associate Editor
| References |
|---|
Adams, K. L., and J. F. Wendel. 2005. Novel patterns of gene expression in polyploid plants. Trends Genet. 10:539543.
Batagelj, V., and A. Mrvar. 2003. Pajekanalysis and visualization of large networks. Pp. 77103 in M. Junger and P. Mutzel, eds. Graph Drawing Software, Springer, Berlin.
Bloom, J. D., and C. Adami. 2003. Apparent dependence of protein evolutionary rate on number of interactions is linked to biases in protein-protein interaction data sets. BMC Evol. Biol. 3:21.[CrossRef][Medline]
Breitkreutz, B. J., C. Stark, and M. Tyers. 2003. The GRID: the general repository for interaction datasets. Genome Biol. 4:R23.[CrossRef][Medline]
Casella, G., and R. L. Berger. 1990. Statistical inference. Duxbury Press, Belmont, Calif.
Conant, G. C., and A. Wagner. 2003. Asymmetric sequence divergence of duplicate genes. Genome Res. 13:20522058.
Davis, J. C., and D. A. Petrov. 2004. Preferential duplication of conserved proteins in eukaryotic genomes. PLoS Biol. 2:e55.
Drummond, D. A., J. D. Bloom, C. Adami, C. O. Wilke, and F. H. Arnod. 2005. Why highly expressed proteins evolve slowly. Proc. Natl. Acad. Sci. USA 102:1433814343.
Drummond, D. A., A. Raval, and C. O. Wilke. 2006. A single determinant dominates the rate of yeast protein evolution. Mol. Biol. Evol. 23:327337.
Duret, L., and D. Mouchiroud. 2000. Determinants of substitution rates in mammalian genes: expression pattern affects selection intensity but not mutation rate. Mol. Biol. Evol. 17:6870.
Enright, A. J., S. Van Dongen, and C. A. Ouzounis. 2002. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 30:15751584.
Fraser, H. B., A. E. Hirsh, L. M. Steinmetz, C. Scharfe, and M. W. Feldman. 2002. Evolutionary rate in the protein interaction network. Science 296:750752.
Ghaemmaghami, S., W.-K. Huh, K. Bower, R. W. Howson, A. Belle, N. Dephoure, E. K. O'Shea, and J. S. Weissman. 2003. Global analysis of protein expression in yeast. Nature 425:737741.[CrossRef][Medline]
Gibbons, J. D., and S. Chakraborti. 2003. Nonparametric statistical inference. 4th edition. Marcel Dekker, New York.
Gu, X., Z. Zhang, and W. Huang. 2005. Rapid evolution of expression and regulatory divergences after yeast gene duplication. Proc. Natl. Acad. Sci. USA 102:707–712.
Gu, Z., D. Nicolae, H. H.-S. Lu, and W.-H. Li. 2002. Rapid divergence in expression between duplicate genes inferred from microarray data. Trends Genet. 18:609–613.
Gu, Z., L. M. Steinmetz, X. Gu, C. Scharfe, R. W. Davis, and W.-H. Li. 2003. Role of duplicate genes in genetic robustness against null mutations. Nature 421:63–66.
Hahn, M. W., G. C. Conant, and A. Wagner. 2004. Molecular evolution in large genetic networks: does connectivity equal constraint? J. Mol. Evol. 58:203–211.
Hahn, M. W., and A. D. Kern. 2005. Comparative genomics of centrality and essentiality in three eukaryotic protein-interaction networks. Mol. Biol. Evol. 22:803–806.
Ito, T., T. Chiba, T. Ozawa, M. Yoshida, M. Hattori, and Y. Sakaki. 2001. A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc. Natl. Acad. Sci. USA 98:4569–4574.
Johnson, R. A., and D. W. Wichern. 2002. Applied multivariate statistical analysis. 5th edition. Prentice Hall, Upper Saddle River, N.J.
Jordan, I. K., Y. I. Wolf, and E. V. Koonin. 2003. No simple dependence between protein evolution rate and the number of protein-protein interactions: only the most prolific interactors tend to evolve slowly. BMC Evol. Biol. 3:1.
Jordan, I. K., Y. I. Wolf, and E. V. Koonin. 2004. Duplicated genes evolve slower than singletons despite the initial rate increase. BMC Evol. Biol. 4:22.
Kellis, M., B. W. Birren, and E. S. Lander. 2004. Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae. Nature 428:617–624.
Krylov, D. M., Y. I. Wolf, I. B. Rogozin, and E. V. Koonin. 2003. Gene loss, protein sequence divergence, gene dispensability, expression level, and interactivity are correlated in eukaryotic evolution. Genome Res. 13:2229–2235.
Lemos, B., B. R. Bettencourt, C. D. Meiklejohn, and D. L. Hartl. 2005. Evolution of proteins and gene expression levels are coupled in Drosophila and are independently associated with mRNA abundance, protein length, and number of protein-protein interactions. Mol. Biol. Evol. 22:1345–1354.
Li, L., C. J. Stoeckert Jr, and D. S. Roos. 2003. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 13:2178–2189.
Li, W.-H., J. Yang, and X. Gu. 2005. Expression divergence between duplicate genes. Trends Genet. 21:602–607.
Lynch, M., and J. S. Conery. 2000. The evolutionary fate and consequences of duplicate genes. Science 290:1151–1155.
Makova, K. D., and W.-H. Li. 2003. Divergence in the spatial pattern of gene expression between human duplicate genes. Genome Res. 13:1638–1645.
Nuzhdin, S. V., M. L. Wayne, K. L. Harmon, and L. M. McIntyre. 2004. Common pattern of evolution of gene expression level and protein sequence in Drosophila. Mol. Biol. Evol. 21:1308–1317.
Ohno, S. 1970. Evolution by gene duplication. Springer-Verlag, Berlin.
Pál, C., B. Papp, and L. D. Hurst. 2001. Highly expressed genes in yeast evolve slowly. Genetics 158:927–931.
Sokal, R. R., and F. J. Rohlf. 1995. Biometry: the principles and practice of statistics in biological research. 3rd edition. W. H. Freeman and Company, New York.
Steinmetz, L. M., C. Scharfe, A. M. Deutschbauer et al. (11 co-authors). 2002. Systematic screen for human disease genes in yeast. Nat. Genet. 31:400–404.
Subramanian, S., and S. Kumar. 2004. Gene expression intensity shapes evolutionary rates of the proteins encoded by the vertebrate genome. Genetics 168:373–381.
Uetz, P., L. Giot, G. Cagney et al. (20 co-authors). 2000. A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 403:623–627.
Wagner, A. 2000. Robustness against mutations in genetic networks of yeast. Nat. Genet. 24:355–361.
Wagner, A. 2001. The yeast protein interaction network evolves rapidly and contains few redundant duplicate genes. Mol. Biol. Evol. 18:1283–1292.
Wagner, A. 2002. Asymmetric functional divergence of duplicate genes in yeast. Mol. Biol. Evol. 19:1760–1768.
Wasserman, S., and K. Faust. 1994. Social network analysis. Cambridge University Press, Cambridge.
Whittaker, J. 1996. Graphical models in applied multivariate statistics. John Wiley and Sons, New York.
Wolfe, K. H., and D. C. Shields. 1997. Molecular evidence for an ancient duplication of the entire yeast genome. Nature 387:708–713.
Yang, J., Z. Gu, and W.-H. Li. 2003. Rate of protein evolution versus fitness effect of gene deletion. Mol. Biol. Evol. 20:772–774.
Yang, Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13:555–556.
Yang, Z., and R. Nielsen. 2000. Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Mol. Biol. Evol. 17:32–43.
Zhang, J. G., and X. He. 2005. Significant impact of protein dispensability on the instantaneous rate of protein evolution. Mol. Biol. Evol. 22:1147–1155.
Zhang, P., Z. Gu, and W.-H. Li. 2003. Different evolutionary patterns between young duplicate genes in the human genome. Genome Biol. 4:R56.