MBE Advance Access published online on October 22, 2008
Molecular Biology and Evolution, doi:10.1093/molbev/msn240
Research Article
An Analysis of Structural Influences on Selection in RNA Genes


* Department of Statistics, University of Oxford
Faculty of Life Sciences, University of Manchester
To whom correspondence should be addressed: email: lyngsoe{at}stats.ox.ac.uk
Received for publication May 14, 2008. Revision received October 10, 2008. Accepted for publication October 14, 2008.
Non-coding RNAs (ncRNAs) are transcripts that do not code for protein but instead function as RNA in catalytic, regulatory, or structural roles in the cell. NcRNAs are involved in universally conserved biological processes, including protein synthesis and gene regulation, and also have more specific roles, such as X-chromosome inactivation in eutherian mammals. In this paper we propose and investigate a hypothesis for patterns of sequence selection in structurally conserved ncRNAs. Previous attempts to characterize selection in RNA compared rates of evolution between paired and unpaired bases, with largely inconclusive results. Our approach focuses only on paired bases in ncRNAs with conserved structure. By analogy to the different properties of codon positions under the genetic code, we use a well-developed energy model for RNA structure to classify stem positions into structural classes, and argue that these classes are under different selective constraints. We validate the hypothesis on several RNA families, and use simulated data to verify the evolutionary origin of the signals. Our class labelling is shown to be a better model of ncRNA evolution than the traditional practice of treating all stem positions equally. As well as providing a better understanding of RNA evolution, the evolutionary footprint we identify can readily be incorporated into gene finders to improve their specificity.
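As a purely illustrative aside, restricting an analysis to paired bases requires first identifying which alignment columns form stems. The sketch below is not the authors' method: it assumes the conserved structure is available in dot-bracket notation (an assumption, since the abstract does not name a format), and the function names `paired_positions` and `stem_columns` are hypothetical. It does not attempt the energy-based classification into structural classes, which the abstract does not specify.

```python
def paired_positions(structure: str) -> list[tuple[int, int]]:
    """Return (i, j) index pairs for bases paired in a dot-bracket string.

    Assumes a pseudoknot-free structure written with '(' , ')' and '.'.
    """
    stack: list[int] = []          # indices of unmatched opening brackets
    pairs: list[tuple[int, int]] = []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {j}")
            pairs.append((stack.pop(), j))
        # any other symbol (e.g. '.') is treated as unpaired
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def stem_columns(structure: str) -> list[int]:
    """Return the sorted indices of all positions that take part in a pair."""
    return sorted(i for pair in paired_positions(structure) for i in pair)
```

For example, `stem_columns("((..))")` selects positions 0, 1, 4 and 5, leaving the two loop positions out of any downstream comparison.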