MBE Advance Access originally published online on January 29, 2007
Molecular Biology and Evolution 2007 24(4):1005-1011; doi:10.1093/molbev/msm019
Research Articles
Proportion of Solvent-Exposed Amino Acids in a Protein and Rate of Protein Evolution




* Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan
Department of Ecology and Evolution, University of Chicago
Institute of Bioinformatics, National Chiao Tung University, Hsinchu, Taiwan
E-mail: whli{at}uchicago.edu.
| Abstract |
|---|
Translational selection, including gene expression, protein abundance, and codon usage bias, has been suggested as the single dominant determinant of protein evolutionary rate in yeast. Here, we show that protein structure is also an important determinant. Buried residues, which are responsible for maintaining protein structure or are located on a stable interaction surface between 2 subunits, are usually under stronger evolutionary constraints than solvent-exposed residues. Our partial correlation analysis shows that, when whole proteins are included, the variance of evolutionary rate explained by the proportion of solvent-exposed residues (Pexposed) can reach two-thirds of that explained by translational selection, indicating that Pexposed is the most important determinant of protein evolutionary rate next only to translational selection. Our result suggests that proteins with many residues under selective constraint (e.g., maintaining structure or intermolecular interaction) tend to evolve slowly, supporting the "fitness (functional) density" hypothesis.
Key Words: evolutionary rate; protein structure; fitness density; functional density; solvent accessibility; disordered
| Introduction |
|---|
The issue of what factors determine the rate of protein evolution has drawn much attention in recent years (for review, see McInerney 2006
Here, instead of considering the protein as a whole as in the studies reviewed above, we look into differences in evolutionary constraints among residues to examine the fitness (functional) density hypothesis. Dickerson (1971) found that surface residues that interact with other proteins tend to be highly conserved. Later, Kimura and Ohta (1973) found that the rate of amino acid substitution at surface residues of the α and β globins is 10 times higher than that at residues in the heme pocket. Similarly, it has been found that residues in the interfaces of obligate protein complexes are more conserved than residues involved in transient interactions (Mintseris and Weng 2005) and that the solvent-inaccessible core of a protein is better conserved than its solvent-accessible residues (Overington et al. 1992; Goldman et al. 1998; Bustamante et al. 2000). Moreover, residues in the buried core and residues on the solvent-exposed surfaces were shown to have different substitution patterns due to different selection pressures (Tseng and Liang 2006). From these findings, it is reasonable to speculate that a protein with a small proportion of solvent-exposed residues (Pexposed) should evolve slowly. However, a contradictory result was found recently (Bloom, Drummond, et al. 2006). It is therefore interesting to investigate whether the structure of a protein, especially the solvent accessibility of the residues, is an important determinant of protein evolutionary rate.
| Materials and Methods |
|---|
Genomic Data
We studied genes in the Saccharomyces cerevisiae genome and obtained nonsynonymous rates (KA) from Wall et al. (2005)
Solvent Accessibility Prediction Using the Homology Model
Although the three-dimensional (3D) structures of yeast proteins are the best basis for estimating the proportion of exposed amino acids in a protein (Pexposed), completely determined 3D structures are available for only about 100 yeast proteins. For yeast proteins without a 3D structure, we have therefore used the PDB (Protein Data Bank) homologues for yeast ORFs in the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/). The PDB homologues are protein structures from various species (including S. cerevisiae) homologous to yeast ORFs, and we used them to estimate Pexposed, assuming that the 3D structures of homologues are identical.
For each yeast ORF, the PDB homologue with the lowest divergence to the yeast ORF sequence was chosen. The solvent accessible surface areas (ACCs) for each residue of the PDB homologue were obtained from DSSP (database of secondary structure assignments for all protein entries in the PDB; http://swift.cmbi.ru.nl/gv/dssp/) (Kabsch and Sander 1983). Both the core residues of a protein, which are important in maintaining protein structure, and the residues on the stable interaction surface between 2 subunits can be regarded as buried because in the native 3D structure of a protein complex they are indeed not solvent exposed. Therefore, for protein structures including more than 1 chain, the interchain contacts were included to calculate the ACC values for these residues. Relative solvent accessibility (RelACC) was the ACC value for each residue divided by the maximum value of ACC for the amino acid (represented in percentage), which is estimated from a Gly-X-Gly extended tripeptide conformation. We define residues with RelACC higher than 25% as exposed residues and the others as buried. The exposed and buried residues defined using this threshold have approximately equal numbers. The Pexposed value for each PDB homologue was thus calculated. We have also tried 2 other RelACC values (0% and 50%) as the threshold, but the conclusions were essentially the same.
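The RelACC and Pexposed definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the maximum-ACC table are hypothetical, and the numeric values in that table are placeholders rather than the reference values used in this study.

```python
from typing import Dict, Sequence

# Placeholder maximum ACC values for residue X in an extended Gly-X-Gly
# tripeptide. These numbers are illustrative only, not the study's table.
ILLUSTRATIVE_MAX_ACC: Dict[str, float] = {
    "A": 110.0, "G": 85.0, "L": 180.0, "K": 205.0,
}


def rel_acc(residue: str, acc: float, max_acc: Dict[str, float]) -> float:
    """Relative solvent accessibility in percent: 100 * ACC / max ACC."""
    return 100.0 * acc / max_acc[residue]


def p_exposed(residues: Sequence[str], acc_values: Sequence[float],
              max_acc: Dict[str, float], threshold: float = 25.0) -> float:
    """Fraction of residues whose RelACC is strictly above the threshold."""
    if len(residues) != len(acc_values):
        raise ValueError("residues and acc_values must have the same length")
    if not residues:
        raise ValueError("at least one residue is required")
    exposed = sum(
        1 for res, acc in zip(residues, acc_values)
        if rel_acc(res, acc, max_acc) > threshold
    )
    return exposed / len(residues)


# Toy chain: RelACC values are 5%, 50%, 25% and about 73%.
# The residue at exactly 25% counts as buried ("higher than 25%" is exposed).
example = p_exposed("AGLK", [5.5, 42.5, 45.0, 150.0], ILLUSTRATIVE_MAX_ACC)
print(example)
```

Passing `threshold=0.0` or `threshold=50.0` would correspond to the two alternative cutoffs mentioned in the text.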
Note that there are a number of problems with the PDB data. First, even for a yeast protein with the 3D structure completely determined, Pexposed is not known but must be estimated. Second, for a yeast protein without the 3D structure, we have to use the structure of a protein homologous to the yeast protein. Moreover, PDB homologues are not available for many yeast ORFs. Third, the structural similarity between the yeast protein and its distant PDB homologue may hold only for the well-folded, conserved domains, but not for the other regions. Fourth, many PDB structures are only partially determined, and most of them are restricted to the well-folded regions in the proteins. Only the structures of well-folded proteins may be completely determined. The alignment length between the PDB homologue and the yeast ORF sequence can approximately represent the proportion of the structure determined. The unaligned regions are either not structurally determined or too diverged between the yeast protein and its PDB homologue. They are often disordered regions in 3D structures. Fifth, some PDB structures only include one or a few subunits, not the entire protein complex. In this case, residues on the stable interaction surface, which should be buried in vivo, are mistakenly treated as exposed residues in the PDB structure data. Therefore, Pexposed estimated from PDB homologues is applicable to only a limited number of proteins. To overcome these problems, we also used a support vector machine (SVM) to predict Pexposed for each ORF directly from the amino acid sequence.
Solvent Accessibility Prediction Using SVM
We used the same training data set as Kim and Park (2004), which includes 480 proteins all with known 3D structures and with less than 25% sequence similarity between sequences. The ACC values for each residue of these 480 proteins were obtained from DSSP. Residues on the stable interaction surface between 2 subunits were also regarded as buried residues. We used position-specific scoring matrices (PSSM), secondary structure profiles, and hydropathy indexes (Kyte and Doolittle 1982) as feature factors. A 15-amino acid sliding window was used to represent the local environment of the protein sequences. We used 5 iterations of PSI-Blast (Altschul et al. 1997) against the nonredundant protein sequence database to produce PSSM. The secondary structure profiles describing the occurring probabilities for helix, sheet, and coil were generated using the PSIPRED secondary structure prediction method (Jones 1999). SVM prediction was performed using a library for SVM version 2.6 (Chang and Lin 2001). A 7-fold cross validation test yields 78% accuracy. Note that Pexposed (PDB) is only calculated for the determined, well-folded regions, whereas Pexposed (SVM) can be estimated for the whole protein.
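The 15-residue sliding window can be illustrated with a short Python sketch. It shows one plausible way to turn per-residue feature vectors into one input vector per residue. The function name, the zero padding at the chain termini, and the flat concatenation order are assumptions made for this example; the text does not specify how the termini were handled.

```python
from typing import List, Sequence


def window_features(per_residue: Sequence[Sequence[float]],
                    window: int = 15) -> List[List[float]]:
    """Concatenate the feature vectors of each residue and its neighbours.

    Positions that fall outside the chain are zero-padded (an assumption).
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so that it has a central residue")
    if not per_residue:
        return []
    half = window // 2
    dim = len(per_residue[0])
    pad = [0.0] * dim
    n = len(per_residue)
    rows: List[List[float]] = []
    for i in range(n):
        row: List[float] = []
        for j in range(i - half, i + half + 1):
            row.extend(per_residue[j] if 0 <= j < n else pad)
        rows.append(row)
    return rows


# Toy input: 20 residues, each with 2 made-up feature values.
toy = [[1.0, 2.0] for _ in range(20)]
features = window_features(toy)
print(len(features), len(features[0]))
```

If each residue carried 20 PSSM scores, 3 secondary structure probabilities, and 1 hydropathy index, each window would yield 15 × 24 = 360 values. That count is an inference from the listed feature types, not a figure given in the text.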
| Results and Discussion |
|---|
Proportion of Exposed Residues in a Protein
To investigate how well Pexposed (SVM) represents the proportion of exposed residues, we compared it with Pexposed (PDB) for ORFs whose PDB homologues have alignment length >98% and sequence identity >98% (i.e., completely determined PDB structures from S. cerevisiae, data set I in table 1). Because the buried/exposed states predicted by SVM have a 22% error rate, proteins with high (low) Pexposed would tend to have their Pexposed (SVM) underpredicted (overpredicted); in other words, the predicted values would have a smaller variance. We indeed found that 9/9 ORFs with Pexposed (PDB) >0.6 have their Pexposed (SVM) slightly underpredicted, whereas 30/31 ORFs with Pexposed (PDB) <0.4 have their Pexposed (SVM) slightly overpredicted. Compared with Pexposed (PDB), Pexposed (SVM) has, as expected, a slightly higher mean (closer to 0.5) and a smaller variance (table 1). The correlation coefficient between Pexposed (PDB) and Pexposed (SVM) is high but not perfect (R = 0.72, n = 96, P = 1.3 × 10⁻¹⁶). The reason might be that most ORFs (84/96) have their Pexposed (PDB) between 0.3 and 0.6 rather than evenly distributed from 0 to 1; the noise introduced by SVM may therefore weaken the correlation. Nevertheless, the correlation between Pexposed (SVM) and KA is very similar to the correlation between Pexposed (PDB) and KA (-0.450 vs. -0.465, first row in table 1). This result suggests that Pexposed (SVM) can be used to estimate the correlation between Pexposed and KA.
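The shrinkage of predicted values toward 0.5 follows from simple arithmetic, sketched below in Python. The sketch assumes, purely for illustration, that the 22% error rate applies equally to buried and exposed residues and independently across residues; the text does not state this, and the function name is hypothetical.

```python
def expected_predicted_fraction(true_fraction: float,
                                error_rate: float = 0.22) -> float:
    """Expected predicted exposed fraction under a symmetric error rate.

    Correctly called exposed residues plus buried residues miscalled
    as exposed.
    """
    return true_fraction * (1.0 - error_rate) + (1.0 - true_fraction) * error_rate


for p in (0.3, 0.5, 0.7):
    print(p, expected_predicted_fraction(p))
```

Under this assumption the expected prediction is 0.22 + 0.56 × Pexposed, so values above 0.5 are pulled down, values below 0.5 are pulled up, and the spread of predicted values is narrower than that of the true values.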
All proteins in data set I have their 3D structures well determined and this fact implies that they are all well-folded proteins. Structurally less well-determined proteins usually contain disordered regions, which contain mainly exposed residues and have been found to evolve rapidly (Brown et al. 1992
Pexposed and Rate of Protein Evolution
England and Shakhnovich (2003) suggested that proteins with a higher contact density (fewer exposed residues) are more designable (a protein structure encoded by many sequences). Bloom, Drummond, et al. (2006) proposed that proteins with more designable structures evolve more rapidly. They stated, "although buried residues are generally more conserved than exposed ones, increasing the fraction of buried residues leads to an overall increase in the evolutionary rate of all residues in the protein, primarily via a dramatic increase in KA for the exposed residues." Therefore, the reduction in Pexposed "is more than compensated for by the increased variability of exposed residues in proteins with high contact density." Interestingly, we found that when the threshold of the alignment length is high, that is, for proteins with fewer disordered residues, Pexposed is negatively correlated with KA (estimated either by PDB homologues or SVM prediction; table 1), which is consistent with the observation of Bloom, Drummond, et al. (2006). They also used a stringent criterion to restrict their data set, that is, the number of identities in the total length of the alignment is >80%. However, we found that Pexposed (SVM) is positively correlated with KA when the alignment threshold is decreased, especially when proteins with many disordered regions are included (table 1).
We next noted that the negative correlation between Pexposed and KA found in this study (e.g., in data set I) is not as strong as that in Bloom, Drummond, et al. (2006). The major reason is that they did not consider interchain (intersubunit) contacts, whereas we did. Bloom, Drummond, et al. (2006) found that proteins with a smaller contact density evolve much more slowly at their exposed sites. We analyzed 100 proteins in their data set, for which the complex annotations are available (Lin et al. 2007). We found that all 11 proteins with a contact density <6 are heterocomplexes and 9 of these 11 proteins have at least 7 subunits. In contrast, only 26 of the 89 proteins with a contact density >6 were annotated to have 7 or more subunits. We also noted that the number of complex subunits (k) is negatively correlated with KA in the data set of Bloom, Drummond, et al. (2006) (R = -0.32, n = 62, P = 1.2 × 10⁻²), which is consistent with Teichmann's (2002) finding that stable complex proteins evolve more slowly. Therefore, we suggest that for these well-folded proteins, selection pressure on residues at the interchain interaction sites is as important as designability (inferred from contact density) for determining the evolutionary rate.
Note that the correlation between Pexposed and evolutionary rate may reflect 2 contradictory effects, that is, fitness (functional) density and designability. The switch from negative to positive correlations between Pexposed (SVM) and KA as Pexposed (SVM) increases indicates that the effect of designability (inferred from contact density) on evolutionary rate (Bloom, Drummond, et al. 2006) might be restricted to the well-folded proteins (or cores). This also explains the slightly negative correlation between Pexposed (PDB) and KA when the alignment length is not long (table 1), that is, only the designability of the well-folded cores is inferred by Pexposed (PDB). The significant positive correlation between Pexposed (SVM) and KA for proteins containing large disordered regions (table 1) suggests that for these proteins, fitness (functional) density can explain the variance of evolutionary rate much better than designability.
When all proteins are included, partial correlation analysis shows that Pexposed (SVM) still correlates significantly and positively with evolutionary rate even when the translational selection predictors (mRNA expression, protein abundance, and codon usage bias measured by CAI) are controlled (table 2). The variance of evolutionary rates explained by Pexposed (SVM) is more than half of that explained by mRNA expression or protein abundance, and even slightly more than that explained by CAI. This result suggests that, in general, fitness (functional) density has a much greater impact than designability on protein evolutionary rate. It is likely that for well-folded proteins the variances of Pexposed and evolutionary rate are small, so that the differences in selection pressure between exposed and buried residues are almost compensated by the effect of designability in these proteins, but this is not true for other proteins (fig. 1).
Principal Component Regression Analysis
Note that although partial correlation analysis may be unreliable when data are noisy and the correlation is weak (Drummond et al. 2006
5% of the variance. This inference is misleading because the 3 translational selection-related predictors are not mutually independent, and this decides the order of the components. To demonstrate this problem, let us consider only 1 translational selection-related predictor, say CAI, and the 2 Pexposed values obtained from PDB homologues and SVM prediction. Now the 2 Pexposed variables contribute ~90% of CP1, whereas CAI contributes only 11% (table 4). CP1 explains almost 20% of the variance of KA. We cannot conclude that all of this 20% variance is contributed by Pexposed because CAI is not really controlled. This example shows that PCR analysis tends to overestimate the contribution of correlated predictors to the variation of a response variable but underestimate the contributions of other predictors. In the presence of nonindependent factors, if the purpose is to obtain 1 representative component and to see the correlation between this component and the response variable, PCR analysis is very useful. However, if the purpose is to see the correlation between one factor and the response variable while controlling for other factors, PCR analysis can be misleading.
Contribution of Pexposed to Variation in Rate of Protein Evolution
To conduct the analysis more appropriately, we used suitable controls. We defined CP1 in the principal component analysis (PCA) for the 3 predictors, mRNA expression, protein abundance, and CAI, as translational selection, and we controlled it to calculate the partial correlation between Pexposed (SVM) and KA. (Because mRNA expression, protein abundance, and CAI together represent translational selection + noise and because CP1 is the best variable to represent these 3 predictors at the same time, controlling translational selection would be better than controlling the 3 predictors individually.) As seen near the bottom of table 2, the contribution of Pexposed (SVM) to the variance of KA is 11.3% when translational selection is controlled, which is about two-thirds of the contribution by translational selection (18.3%) to the variance of KA when Pexposed (SVM) is controlled. This analysis suggests that Pexposed contributes ~10% to variation in KA and is the most important known determinant next to translational selection. Note that this might be an underestimate because of the considerable uncertainty involved in the estimation of Pexposed.
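The mechanics of controlling a single variable can be sketched with the standard first-order partial correlation formula. The code below is a minimal illustration, not the analysis pipeline used here: the function names and the toy data are invented for the example, and the control variable `z` simply stands in for a summary variable such as CP1.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x: Sequence[float], y: Sequence[float],
                 z: Sequence[float]) -> float:
    """Correlation between x and y after controlling for one variable z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1.0 - rxz ** 2) * (1.0 - ryz ** 2))


# Toy data: x and y are each z plus noise; the two noise terms are
# uncorrelated with z and with each other.
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
e1 = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0]
e2 = [1.0, -1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0]
x = [a + b for a, b in zip(z, e1)]
y = [a + b for a, b in zip(z, e2)]

raw = pearson(x, y)
controlled = partial_corr(x, y, z)
print(raw, controlled)
```

In this toy case the raw correlation is high only because both variables track `z`, and it vanishes once `z` is controlled. A predictor that keeps a sizable partial correlation after such control, as reported above for Pexposed (SVM), carries information beyond the controlled variable.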
Fraser (2005) showed that party hubs (proteins interacting with most of their partners simultaneously; Han et al. 2004) evolve more slowly than date hubs (proteins interacting with different partners at different times). Because most party hubs are protein complexes, whereas date hubs are not (Han et al. 2004), we compared Pexposed (SVM) values between them. Not surprisingly, party hubs, on average, have a smaller Pexposed (49.7%) compared with date hubs (56.9%, t-test P = 5.0 × 10⁻⁵). It is therefore reasonable to speculate that Pexposed should also explain part of the difference in evolutionary rate between party and date hubs. Similarly, subunits of a large heterocomplex should have more protein interactions and should be less dispensable (Lin et al. 2007). Pexposed may therefore underlie the correlations between these 2 factors and evolutionary rate (Hirsh and Fraser 2001; Fraser et al. 2002; Yang et al. 2003; Wall et al. 2005; Zhang and He 2005).
It is worth noting that proteins with a high Pexposed may evolve slowly or fast, whereas proteins with a low Pexposed almost always have a low evolutionary rate (fig. 1). This result suggests that protein 3D structure provides only a general index, that is, buried residues cannot evolve freely. Some exposed residues (e.g., residues at active sites or ligand-binding sites) may be functionally important and are thus conserved. Protein mutagenesis experiments have shown that increasing a protein's thermodynamic stability dramatically increases its tolerance to mutations, which suggests that deleterious mutations usually act by hindering the formation of a properly folded protein rather than by altering a protein's function (Bloom et al. 2005; Bloom, Labthavikul, et al. 2006). The evolutionary constraint on highly expressed proteins was suggested to reduce the burden of protein misfolding (Drummond et al. 2005). Similarly, it is likely that buried residues are conserved because they are important for proteins to fold correctly or to interact correctly among subunits or proteins. Although translational selection largely governs the rate of evolution for the whole protein (Drummond et al. 2006), our study shows that fitness (functional) density negatively correlates with protein evolutionary rate, that is, a protein with more residues under selective constraint tends to evolve more slowly. We expect that an even better correlation will be found when the fitness (functional) density can be appropriately defined rather than estimated as buried residues.
Box 1
PCA transforms n factors (which may not be mutually independent) to n independent components by rotating the axes such that the first component has the largest variance by any projection of the data, and the second component has the second largest variance, and so on. Given a data set {x1, x2, x3}, we can obtain the first component, CP1 = a11x1 + a12x2 + a13x3, where a11, a12, and a13 indicate the contributions of x1, x2, and x3 to CP1 and they are summed to 1. We can then use CP1 to correlate with a response, y, and calculate R², the variance of y explained by CP1.
However, although a11 indicates the contribution of x1 to CP1, using a11 × R² to represent the variance of y explained by x1 is invalid. This problem can be demonstrated by a simple example with a data set {x1, x2}, where x1 and x2 have the same variance and are correlated. We can then obtain CP1 = a11x1 + a12x2 and CP2 = a21x1 + a22x2, where a11 = a12 and a21 = a22, so that x1 and x2 contribute equally to both CP1 (a11 = a12) and CP2 (a21 = a22). We can thus correlate CP1 and CP2 with y and calculate the variance of y explained by CP1 and CP2, respectively. When y = x1, it is obvious that x1 can completely explain the variance of y, whereas only a proportion of the variance of y can be explained by x2 (x1 and x2 are correlated). However, the fact that the variances of y explained by x1 and x2 are different cannot be revealed by PCR analysis (x1 and x2 contribute equally to both CP1 and CP2, i.e., a11 = a12 and a21 = a22).
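The two-variable example can be checked numerically. The Python sketch below uses made-up data in which x1 and x2 have equal variance and are correlated, and sets y = x1. The helper names, the toy values, and the reading of "contribution" as a squared loading (so that contributions sum to 1) are assumptions of this illustration.

```python
import math
from typing import Sequence, Tuple


def cov(a: Sequence[float], b: Sequence[float]) -> float:
    """Sample covariance of two equal-length samples."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)


def r_squared(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Pearson correlation: share of variance of a explained by b."""
    return cov(a, b) ** 2 / (cov(a, a) * cov(b, b))


def pca_2d(u: Sequence[float], v: Sequence[float]
           ) -> Tuple[Tuple[float, float], Tuple[float, float]]:
    """Unit loading vectors of the two principal axes for two variables."""
    a, b, c = cov(u, u), cov(u, v), cov(v, v)
    theta = 0.5 * math.atan2(2.0 * b, a - c)  # angle of the major axis
    first = (math.cos(theta), math.sin(theta))
    second = (-math.sin(theta), math.cos(theta))
    return first, second


# Same values in a different order: equal variance, positive correlation.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = list(x1)  # the response is x1 itself

v1, v2 = pca_2d(x1, x2)
cp1 = [v1[0] * p + v1[1] * q for p, q in zip(x1, x2)]
cp2 = [v2[0] * p + v2[1] * q for p, q in zip(x1, x2)]

print("squared loadings on CP1:", v1[0] ** 2, v1[1] ** 2)
print("squared loadings on CP2:", v2[0] ** 2, v2[1] ** 2)
print("variance of y explained by x1, x2:", r_squared(y, x1), r_squared(y, x2))
print("variance of y explained by CP1, CP2:", r_squared(y, cp1), r_squared(y, cp2))
```

Both variables load equally on each component, yet x1 explains all of the variance of y while x2 explains only part of it, which is the asymmetry that component loadings alone cannot reveal.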
The second problem is that, when the inputs include many nonindependent factors, the first component can be highly correlated with these factors, so that it can include as much information as possible. After the first component is decided, the second component is determined by including as much of the remaining information as possible. Given a data set {x1, x2, x3, x4} where the variables in the subset {x1, x2, x3} are highly correlated with each other, CP1 will be mainly composed of x1, x2, and x3. If the subset {x1, x2, x3} has covariance with x4, this covariance will therefore be mainly included in CP1 but not in other components. In this case, x4 contributes mainly to CP2 but also weakly to CP1. For CP1, CP2 is actually controlled because CP1 and CP2 are independent of each other. However, we cannot say that for CP1, factor x4 is controlled because CP1 includes the covariance shared between the subset {x1, x2, x3} and x4.
| Acknowledgements |
|---|
We thank Eduardo P.C. Rocha, Martin Lercher, Kevin Bullaughey, and Jeffrey Tseng for comments and the Structural Bioinformatics Core at the National Chiao Tung University for hardware and software support. The work was supported by grants from National Science Council (NSC 094-2917-I-009-015 to Y.-S.L. and NSC 093-3112-B-009-001 to J.-K.H.) and National Institutes of Health grants to W.-H.L.
| Footnotes |
|---|
William Martin, Associate Editor
| References |
|---|
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402.
Bloom JD and Adami C. (2003) Apparent dependence of protein evolutionary rate on number of interactions is linked to biases in protein-protein interactions data sets. BMC Evol Biol 3:21.
Bloom JD, Drummond DA, Arnold FH, Wilke CO. (2006) Structural determinants of the rate of protein evolution in yeast. Mol Biol Evol 23:1751–1761.
Bloom JD, Labthavikul ST, Otey CR, Arnold FH. (2006) Protein stability promotes evolvability. Proc Natl Acad Sci USA 103:5869–5874.
Bloom JD, Silberg JJ, Wilke CO, Drummond DA, Adami C, Arnold FH. (2005) Thermodynamic prediction of protein neutrality. Proc Natl Acad Sci USA 102:606–611.
Brown CJ, Takayama S, Campen AM, Vise P, Marshall TW, Oldfield CJ, Williams CJ, Dunker AK. (1992) Evolutionary rate heterogeneity in proteins with long disordered regions. J Mol Evol 55:104–110.
Bustamante CD, Townsend JP, Hartl DL. (2000) Solvent accessibility and purifying selection within proteins of Escherichia coli and Salmonella enterica. Mol Biol Evol 17:301–308.
Chang C-C and Lin C-J. (2001) Training ν-support vector classifiers: theory and algorithms. Neural Comput 13:2119–2147.
Dickerson RE. (1971) The structures of cytochrome c and the rates of molecular evolution. J Mol Evol 1:26–45.
Drummond DA, Bloom JD, Adami C, Wilke CO, Arnold FH. (2005) Why highly expressed proteins evolve slowly. Proc Natl Acad Sci USA 102:14338–14343.
Drummond DA, Raval A, Wilke CO. (2006) A single determinant dominates the rate of yeast protein evolution. Mol Biol Evol 23:327–337.
England JL and Shakhnovich EI. (2003) Structural determinant of protein designability. Phys Rev Lett 90:218101.
Fraser HB. (2005) Modularity and evolutionary constraint on proteins. Nat Genet 37:351–352.
Fraser HB, Hirsh AE, Steinmetz LM, Scharfe C, Feldman MW. (2002) Evolutionary rate in the protein interaction network. Science 296:750–752.
Ghaemmaghami S, Huh W-K, Bower K, Howson RW, Belle A, Dephoure N, O'Shea EK, Weissman JS. (2003) Global analysis of protein expression in yeast. Nature 425:737–741.
Goldman N, Thorne JL, Jones DT. (1998) Assessing the impact of secondary structure and solvent accessibility on protein evolution. Genetics 149:445–458.
Han J-DJ, Bertin N, Hao T, et al. (2004) Evidence for dynamically organized modularity in the yeast protein-protein interaction network. Nature 430:88–93 (11 co-authors).
Hirsh AE and Fraser HB. (2001) Protein dispensability and rate of evolution. Nature 411:1046–1049.
Holstege FCP, Jennings EG, Wyrick JJ, Lee TI, Hengartner CJ, Green MR, Golub TR, Lander ES, Young RA. (1998) Dissecting the regulatory circuitry of a eukaryotic genome. Cell 95:717–728.
Ihaka R and Gentleman R. (1996) R: a language for data analysis and graphics. J Comput Graph Stat 5:299–314.
Jones DT. (1999) Protein secondary structure prediction based on position-specific scoring matrices. J Mol Biol 292:195–202.
Kabsch W and Sander C. (1983) Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22:2577–2637.
Kim H and Park H. (2004) Prediction of protein relative solvent accessibility with support vector machines and long-range interaction 3D local descriptor. Proteins 54:557–562.
Kimura M and Ohta T. (1973) Mutation and evolution at the molecular level. Genetics 73(Suppl.):19–35.
Kimura M and Ohta T. (1974) On some principles governing molecular evolution. Proc Natl Acad Sci USA 71:2848–2852.
Kyte J and Doolittle RF. (1982) A simple method for displaying the hydropathic character of a protein. J Mol Biol 157:105–132.
Lin Y-S, Hwang J-K, Li W-H. (2007) Protein complexity, gene duplicability and gene dispensability in the yeast genome. Gene 387:109–117.
McInerney JO. (2006) The causes of protein evolutionary rate variation. Trends Ecol Evol 21:230–232.
Mintseris J and Weng Z. (2005) Structure, function, and evolution of transient and obligate protein–protein interactions. Proc Natl Acad Sci USA 102:10930–10935.
Ohta T. (1973) Slightly deleterious mutant substitutions in evolution. Nature 246:96–98.
Overington J, Donnelly D, Johnson MS, Sali A, Blundell TL. (1992) Environment-specific amino acid substitution tables: tertiary templates and prediction of protein folds. Protein Sci 1:216–226.
Pal C, Papp B, Hurst LD. (2001) Highly expressed genes in yeast evolve slowly. Genetics 158:927–931.
Pal C, Papp B, Hurst LD. (2003) Rate of evolution and gene dispensability. Nature 421:496–497.
Pal C, Papp B, Lercher MJ. (2006) An integrated view of protein evolution. Nat Rev Genet 7:337–348.
Rocha EPC. (2006) The quest for the universals of protein evolution. Trends Genet 22:412–416.
Rocha EPC and Danchin A. (2004) An analysis of determinants of amino acids substitution rates in bacterial proteins. Mol Biol Evol 21:108–116.
Sharp PM and Li W-H. (1987) The codon adaptation index – a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res 15:1281–1295.
Teichmann SA. (2002) The constraints protein–protein interactions place on sequence divergence. J Mol Biol 324:399–407.
Tseng YY and Liang J. (2006) Regions and application in protein function inference: a Bayesian Monte Carlo approach. Mol Biol Evol 23:421–436.
Wall DP, Hirsh AE, Fraser HB, Kumm J, Giaever G, Eisen MB, Feldman MW. (2005) Functional genomic analysis of the rates of protein evolution. Proc Natl Acad Sci USA 102:5483–5488.
Wilson AC, Carlson SS, White TJ. (1977) Biochemical evolution. Annu Rev Biochem 46:573–639.
Yang J, Gu Z, Li W-H. (2003) Rate of protein evolution versus fitness effect of gene deletion. Mol Biol Evol 20:772–774.
Zhang J and He X. (2005) Significant impact of protein dispensability on the instantaneous rate of protein evolution. Mol Biol Evol 22:1147–1155.
Zuckerkandl E. (1976) Evolutionary processes and evolutionary noise at the molecular level. I. Functional density in proteins. J Mol Evol 7:167–183.