Molecular Biology and Evolution 19:1022-1025 (2002)
© 2002 Society for Molecular Biology and Evolution
Ratios of Radical to Conservative Amino Acid Replacement are Affected by Mutational and Compositional Factors and May Not Be Indicative of Positive Darwinian Selection
Department of Zoology, George S. Wise Faculty of Life Sciences, Tel Aviv University
| Abstract |
|---|
The ratio of radical to conservative amino acid replacements is frequently used to infer positive Darwinian selection. This method is based on the assumption that radical replacements are more likely than conservative replacements to improve the function of a protein. Therefore, if positive selection plays a major role in the evolution of a protein, one would expect the radical-conservative ratio to exceed the expectation under neutrality. Here, we investigate the possibility that factors unrelated to selection, i.e., transition-transversion ratio, codon usage, genetic code, and amino acid composition, influence the radical-conservative replacement ratio. All factors that have been studied were found to affect the radical-conservative replacement ratio. In particular, amino acid composition and transition-transversion ratio are shown to have the most profound effects. Because none of the studied factors had anything to do with selection (positive or otherwise) and also because all of them (singly or in combination) affected a measure that was supposed to be indicative of positive selection, we conclude that selectional inferences based on radical-conservative replacement ratios should be treated with suspicion.
| Introduction |
|---|
Nonsynonymous substitutions are far more likely than synonymous substitutions to improve the function of a protein. Because advantageous mutations undergo fixation much more rapidly than neutral mutations and also because the rate of synonymous mutation per synonymous site is the same as the rate of nonsynonymous mutation per nonsynonymous site, the rate of nonsynonymous substitution is expected to exceed that of synonymous substitution, if positive Darwinian selection plays a major role in the evolution of a protein. Nei and Gojobori (1986)
0.5% of the cases. One problem with the nonsynonymous-synonymous ratio is that synonymous substitutions tend to become saturated; therefore, they are underestimated more quickly than nonsynonymous substitutions. In such cases, the nonsynonymous-synonymous ratio may artifactually exceed 1, and positive selection may be inferred where none exists.
Hughes, Ota, and Nei (1990) proposed to circumvent the saturation problem by using the ratio of radical to conservative amino acid replacements. The rationale of this method is very similar to that used in the nonsynonymous-synonymous ratio case. That is, radical replacements are assumed to be more likely than conservative replacements to improve the function of a protein. Therefore, if positive selection plays a major role in the evolution of a protein, we should expect the radical-conservative ratio to exceed the expectation under no selection. There are several methods to estimate the radicalism or conservatism of a particular amino acid replacement. One, for example, may decide that the property of interest is electric charge, and therefore, all replacements that result in charge changes are radical, whereas all replacements that do not affect charge are conservative. Alternatively, several properties may be considered simultaneously through the use of a physico-chemical measure, such as Grantham's (1974) distance. The radical-conservative replacement ratio has also been used extensively to infer positive selection (e.g., Hughes, Ota, and Nei 1990; Hughes 1992; Rand, Weinreich, and Cezairliyan 2000; Hughes 2000, 2002).
In this study, we investigate the possibility that factors unrelated to selection influence the radical-conservative replacement ratio values. For example, it is known that transversions result in more dramatic changes than do transitions. That is, transversions are more likely than transitions to be nonsynonymous in protein-coding regions, and nonsynonymous transversions are more likely to result in radical replacement than nonsynonymous transitions (Zhang 2000). It is, therefore, possible that differences in radical-conservative replacement ratios may be caused by mutational factors, such as the transition-transversion ratio, rather than by selectional forces. In this study, we simulated DNA-sequence evolution and the resulting radical-conservative replacement ratios by varying transition-transversion ratios, codon usage, genetic code, and amino acid composition. In the simulation we introduced no hint of positive selection.
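The first part of the mutational argument above (transversions are more often nonsynonymous than transitions) can be checked by direct enumeration. The following is a minimal sketch and not part of the original simulations: it assumes the standard genetic code, weights every single-nucleotide change from a sense codon equally, and counts a change to a stop codon as nonsynonymous. The names `STANDARD_CODE`, `is_transition`, and `count_changes` are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

PURINES = {"A", "G"}


def is_transition(x, y):
    """A transition stays within a base class (purine<->purine, pyrimidine<->pyrimidine)."""
    return (x in PURINES) == (y in PURINES)


def count_changes(code):
    """Count all single-nucleotide changes from sense codons, split by mutation type."""
    counts = {"ts": {"syn": 0, "nonsyn": 0}, "tv": {"syn": 0, "nonsyn": 0}}
    for codon, aa in code.items():
        if aa == "*":
            continue  # only sense codons are used as starting points
        for pos, new_base in product(range(3), BASES):
            if new_base == codon[pos]:
                continue
            mutant = codon[:pos] + new_base + codon[pos + 1:]
            kind = "ts" if is_transition(codon[pos], new_base) else "tv"
            effect = "syn" if code[mutant] == aa else "nonsyn"
            counts[kind][effect] += 1
    return counts


counts = count_changes(STANDARD_CODE)
for kind, c in counts.items():
    total = c["syn"] + c["nonsyn"]
    print(kind, c, f"fraction nonsynonymous = {c['nonsyn'] / total:.2f}")
```

Under these assumptions, 121 of 183 possible transitions (about 66%) and 294 of 366 possible transversions (about 80%) are nonsynonymous, so raising the proportion of transversions raises the share of amino acid-changing mutations even in the complete absence of selection.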
| Methods |
|---|
Simulated Protein Evolution
Each virtual protein-coding gene was 300 nucleotides long, resulting in a protein 100 amino acids in length. Genetic code, codon usage, and amino acid composition were fixed at the beginning of each simulation. Each virtual gene was used as the ancestor sequence in the simulated-evolution program of ROSE software (Stoye, Evers, and Meyer 1998
Radical-Conservative Ratios
All the 190 possible amino acid replacements were classified using three independent criteria: (1) charge, (2) volume and polarity, and (3) Grantham's (1974) physico-chemical distance.
Classification by charge was made by dividing the amino acids into three categories: positive (R, H, K), negative (D, E), and uncharged (A, N, C, Q, G, I, L, M, F, P, S, T, W, Y, V).
Classification by volume and polarity was made by dividing the amino acids into six categories: special (C), neutral and small (A, G, P, S, T), polar and relatively small (N, D, Q, E), polar and relatively large (R, H, K), nonpolar and relatively small (I, L, M, V), and nonpolar and relatively large (F, W, Y).
The two classifications above were taken from Zhang (2000). We did not use an additional classification in Zhang (2000), i.e., polarity, in order to keep the divisions independent of one another. Within each of the two classifications above, amino acid replacements were deemed conservative if they involved exchanges within a category and radical if the exchanges occurred among categories.
As far as Grantham's (1974) distances are concerned, an amino acid replacement was deemed conservative if the distance value was smaller than 100 and radical otherwise.
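The two category-based classifications can be written down compactly. The sketch below is illustrative rather than the code used in the study; the category labels and the function names `category_of`, `is_radical`, and `radical_conservative_counts` are assumptions. Grantham's criterion is omitted because it requires the full distance matrix, which is not reproduced here.

```python
from itertools import combinations

# Categories as listed in the text (after Zhang 2000), one-letter amino acid codes.
CHARGE = {"positive": "RHK", "negative": "DE", "uncharged": "ANCQGILMFPSTWYV"}
VOLUME_POLARITY = {
    "special": "C",
    "neutral_small": "AGPST",
    "polar_small": "NDQE",
    "polar_large": "RHK",
    "nonpolar_small": "ILMV",
    "nonpolar_large": "FWY",
}


def category_of(scheme):
    """Invert a {category: residues} scheme into a {residue: category} lookup."""
    return {aa: name for name, residues in scheme.items() for aa in residues}


def is_radical(a, b, scheme):
    """A replacement is radical if the two residues fall in different categories."""
    lookup = category_of(scheme)
    return lookup[a] != lookup[b]


def radical_conservative_counts(scheme):
    """Return (radical, conservative) counts over all 190 unordered residue pairs."""
    residues = sorted(category_of(scheme))
    pairs = list(combinations(residues, 2))
    radical = sum(is_radical(a, b, scheme) for a, b in pairs)
    return radical, len(pairs) - radical


print(radical_conservative_counts(CHARGE))
print(radical_conservative_counts(VOLUME_POLARITY))
```

Of the 190 possible replacements, 81 are radical by charge and 162 are radical by volume and polarity, so the two yardsticks start from very different baseline proportions of radical changes.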
Codon Usage
Three patterns of codon usage were used: random, GC biased, and AT biased. In the random pattern, each codon frequency was calculated as the frequency of the amino acid specified by the codon divided by the number of possible codons for the amino acid. In the GC- and AT-biased patterns of codon usage, each codon frequency was calculated as the frequency of the amino acid specified by the codon divided by the number of possible codons for that amino acid ending in G or C, or in A or T, respectively.
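The three codon-usage patterns can be sketched as follows. This is an interpretation of the description above, not the original code. Two points are assumptions: codons with a nonpreferred third base receive frequency zero in the biased patterns, and an amino acid with no codon of the preferred kind (methionine and tryptophan under AT bias in the standard code) falls back to all of its codons. The function name `codon_frequencies` and the uniform test composition are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_frequencies(aa_freq, code, preferred_endings=None):
    """Spread each amino acid frequency evenly over its (preferred) codons."""
    codons_by_aa = {}
    for codon, aa in code.items():
        if aa != "*":
            codons_by_aa.setdefault(aa, []).append(codon)

    freqs = {codon: 0.0 for codon, aa in code.items() if aa != "*"}
    for aa, codons in codons_by_aa.items():
        allowed = codons
        if preferred_endings:
            biased = [c for c in codons if c[2] in preferred_endings]
            allowed = biased or codons  # assumed fallback when no codon qualifies
        for codon in allowed:
            freqs[codon] = aa_freq[aa] / len(allowed)
    return freqs


# Hypothetical uniform composition, used only to exercise the function.
uniform = {aa: 0.05 for aa in set(AMINO) - {"*"}}
random_usage = codon_frequencies(uniform, CODE)
gc_usage = codon_frequencies(uniform, CODE, preferred_endings="GC")
at_usage = codon_frequencies(uniform, CODE, preferred_endings="AT")
```

For leucine, for example, the random pattern divides the amino acid frequency among six codons, whereas the GC-biased pattern divides it among the three codons ending in G or C (CTC, CTG, and TTG).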
Amino Acid Composition
Eight amino acid compositions were used. Two compositions were the theoretical equilibrium expectations of two replacement matrices, i.e., Dayhoff's (1978, p. 345) and JTT (Jones, Taylor, and Thornton 1992). Five compositions were derived from mean amino acid frequencies in different protein classes: (1) extracellular proteins, (2) anchored proteins, (3) membranal proteins, (4) intracellular proteins, and (5) nuclear proteins. The values were taken from Cedano et al. (1997). The eighth composition was that of a proline-rich protein, as an example of extreme amino acid bias. In this case, the frequency of each of the other 19 amino acids was set at 0.045, whereas the frequency of proline was 0.136. All amino acid frequencies are shown in table 1.
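As an illustration of how a fixed composition translates into a virtual protein, the sketch below draws a 100-residue sequence from the proline-rich composition. How the ancestral sequences were actually generated is not described in this section, so independent sampling of each site is an assumption, as are the function name `sample_protein` and the seed. The rounded frequencies quoted in the text sum to 0.991, so the sketch renormalizes them.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Proline-rich composition as quoted in the text: 0.136 for P, 0.045 for the rest.
raw = {aa: 0.045 for aa in AMINO_ACIDS}
raw["P"] = 0.136

total = sum(raw.values())  # 0.991 with the rounded values
composition = {aa: f / total for aa, f in raw.items()}


def sample_protein(composition, length=100, seed=None):
    """Draw each residue independently according to the given composition."""
    rng = random.Random(seed)
    residues = list(composition)
    weights = [composition[aa] for aa in residues]
    return "".join(rng.choices(residues, weights=weights, k=length))


protein = sample_protein(composition, length=100, seed=1)
print(protein)
print("proline count:", protein.count("P"))
```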
Transition-Transversion Ratios
Transition-transversion ratios inferred from real data range widely, depending, among other things, on divergence time, lineage, and DNA origin (e.g., Lanave et al. 1986
Insertion and deletion frequencies were set to zero in order to keep the length of the sequences constant and prevent gaps in the alignment.
Genetic Code
Two genetic codes were used: the standard (so-called universal) code and the vertebrate mitochondrial code.
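The two codes differ in only four codon assignments. As a minimal sketch (the variable names, the helper `translate`, and the example sequence are illustrative), the vertebrate mitochondrial code can be derived from the standard code as follows:

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Vertebrate mitochondrial code: AGA and AGG are stops, ATA codes for Met,
# and TGA codes for Trp; all other assignments match the standard code.
VERT_MITO = dict(STANDARD, AGA="*", AGG="*", ATA="M", TGA="W")


def translate(dna, code):
    """Translate complete codons in frame 1; trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(code[dna[i:i + 3]] for i in range(0, usable, 3))


def sense_codons(code):
    return sorted(codon for codon, aa in code.items() if aa != "*")


gene = "ATGATATGAAGA"  # hypothetical sequence containing three reassigned codons
print(translate(gene, STANDARD))   # MI*R
print(translate(gene, VERT_MITO))  # MMW*
print(len(sense_codons(STANDARD)), len(sense_codons(VERT_MITO)))  # 61 60
```

Because the reassignments change which neighboring codons are synonymous (for example, ATA and ATG both encode methionine in the mitochondrial code), the same mutation can be silent under one code and amino acid-changing under the other.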
Statistical Analyses
The effects of various variables and the interactions among them on the three radical-conservative replacement ratios were tested by a multiway analysis of variance (ANOVA). All the effects were considered as fixed.
Reality Check
In order to establish that compositional and mutational factors may indeed produce false positive inferences of Darwinian selection, we simulated the evolution of several human protein-coding genes in which positive selection has never been reported, e.g., β hemoglobin, interleukin 2, and ribosomal protein S21 (accession numbers NM_000518.3, NM_000586.1, and NM_001024.2, respectively), under the substitution matrix of pseudogenes (presumably a completely neutral matrix of substitution reflecting the pattern of mutation without selection). The neutral substitution matrix was taken from Graur and Li (1999, p. 126).
| Results and Discussion |
|---|
The results of the multiway ANOVA are shown in table 2. Regardless of the measure used to estimate the radical-conservative replacement ratio, all four factors that have been studied were found to affect the ratio. The transition-transversion ratio and the amino acid composition, as well as the interaction between these two factors, were found to have the most pronounced effect on the radical-conservative ratio. When the amino acid replacements are classified by charge, most of the variation in the radical-conservative ratio is explained by amino acid composition. When the amino acid replacements are classified by either volume and polarity or by Grantham's distance, most of the variation in the radical-conservative ratio is explained by the transition-transversion ratio. These results were unaffected by either length of protein or divergence time between the proteins.
We tested the frequency of false positive inferences of Darwinian selection by simulating neutral evolution in β hemoglobin, interleukin 2, and ribosomal protein S21. When the radical-conservative ratio was calculated on the basis of volume and polarity, 100% of estimates were false positives. When the radical-conservative ratio was calculated on the basis of Grantham's distances for β hemoglobin, interleukin 2, and ribosomal protein S21, 17%, 21%, and 13% of the estimates, respectively, were false positives. With these three proteins, we obtained no false positives when the radical-conservative ratio was calculated on the basis of electric charge. We note, however, that false positive inferences of Darwinian selection with electric charge as the yardstick for computing the radical-conservative ratio were especially abundant in our simulations when the amino acid composition was that of proteins located in the nucleus. None of the three proteins used in the reality check is nuclear.
We conclude that many factors that have nothing to do with selection (positive or otherwise) either singly or in combination affect measures that were supposed to be indicative of positive selection. Therefore, selectional inferences based on radical-conservative replacement ratios should be treated with utmost caution. In fact, we recommend that these measures not be used at all.
| Footnotes |
|---|
Pekka Pamilo, Reviewing Editor
Address for correspondence and reprints: Tal Dagan, Department of Zoology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv 69978, Israel. tali{at}kimura.tau.ac.il.
Keywords: positive Darwinian selection, conservative replacement, radical replacement, transition bias, codon usage, genetic codes, amino acid composition
| References |
|---|
Bielawski J. P., Z. Yang. 2001. Positive and negative selection in the DAZ gene family. Mol. Biol. Evol. 18:523-529.
Cedano J., P. Aloy, J. A. Perez-Pons, E. Querol. 1997. Relation between amino acid composition and cellular location of proteins. J. Mol. Biol. 266:594-600.
Dayhoff M. O. 1978. Atlas of protein sequence and structure, Vol. 5, Suppl. 3. National Biomedical Research Foundation, Silver Spring, Md.
Endo T., K. Ikeo, T. Gojobori. 1996. Large-scale search for genes on which positive selection may operate. Mol. Biol. Evol. 13:685-690.
Ford M. J. 2001. Molecular evolution of transferrin: evidence for positive selection in salmonids. Mol. Biol. Evol. 18:639-647.
Grantham R. 1974. Amino acid difference formula to help explain protein evolution. Science 185:862-864.
Graur D., W.-H. Li. 1999. Fundamentals of molecular evolution. Sinauer Associates, Inc., Sunderland, Mass.
Hughes A. L. 1992. Coevolution of the vertebrate integrin α- and β-chain genes. Mol. Biol. Evol. 9:216-234.
Hughes A. L. 2002. Origin and evolution of viral interleukin-10 and other DNA virus genes with vertebrate homologues. J. Mol. Evol. 54:90-101.
Hughes A. L., J. A. Green, J. M. Garbayo, R. M. Roberts. 2000. Adaptive diversifications within a large family of recently duplicated, placentally expressed genes. Proc. Natl. Acad. Sci. USA 97:3319-3323.
Hughes A. L., T. Ota, M. Nei. 1990. Positive Darwinian selection promotes charge profile diversity in the antigen binding cleft of class I major-histocompatibility-complex molecules. Mol. Biol. Evol. 7:515-524.
Johnson K. P., J. Seger. 2001. Elevated rates of nonsynonymous substitution in island birds. Mol. Biol. Evol. 18:874-881.
Jones D. T., W. R. Taylor, J. M. Thornton. 1992. The rapid generation of mutation data matrices from protein sequences. Comput. Appl. Biosci. 8:275-282.
Lanave C., S. Tommasi, G. Preparata, C. Saccone. 1986. Transition and transversion rate in the evolution of animal mitochondrial DNA. Biosystems 19:273-283.
Lukens L., J. Doebley. 2001. Molecular evolution of the teosinte branched gene among maize and related grasses. Mol. Biol. Evol. 18:627-638.
Nei M., T. Gojobori. 1986. Simple method for estimating the number of synonymous and non-synonymous nucleotide substitutions. Mol. Biol. Evol. 3:418-426.
Purvis A., L. Bromham. 1997. Estimating the transition/transversion ratio from independent pairwise comparisons with an assumed phylogeny. J. Mol. Evol. 44:112-119.
Rand D. M., D. M. Weinreich, B. O. Cezairliyan. 2000. Neutrality tests of conservative-radical amino acid changes in nuclear and mitochondrially encoded proteins. Gene 261:115-125.
Stoye J., D. Evers, F. Meyer. 1998. ROSE: generating sequence families. Bioinformatics 14:157-163.
Swanson W. J., A. G. Clark, H. M. Waldrip-Dail, M. F. Wolfner, C. F. Aquadro. 2001. Evolutionary EST analysis identifies rapidly evolving male reproductive proteins in Drosophila. Proc. Natl. Acad. Sci. USA 98:7375-7379.
Welch D. B., M. S. Meselson. 2001. Rates of nucleotide substitution in sexual and anciently asexual rotifers. Proc. Natl. Acad. Sci. USA 98:6720-6724.
Yang Z., A. Yoder. 1999. Estimation of the transition/transversion rate bias and species sampling. J. Mol. Evol. 48:274-283.
Zhang J. 2000. Rates of conservative and radical nucleotide substitutions in mammalian genes. J. Mol. Evol. 50:56-68.