MBE Advance Access originally published online on July 17, 2007
Molecular Biology and Evolution 2007 24(9):2119-2131; doi:10.1093/molbev/msm142
Research Articles
Detecting the Coevolution of Biosequences—An Example of RNA Interaction Prediction



Simons Center for Systems Biology, Institute for Advanced Study, Princeton, New Jersey
EMBL, European Bioinformatics Institute, Cambridge, England, United Kingdom
Center for Molecular Biology of RNA and Department of Molecular, Cell, and Developmental Biology, University of California at Santa Cruz
Center for Biomolecular Science and Engineering, University of California at Santa Cruz
E-mail: chyeang{at}soe.ucsc.edu.
Accepted for publication July 12, 2007.
A probabilistic graphical model is proposed to detect coevolution between different sites in biological sequences. The model extends the continuous-time Markov process of sequence substitution for single nucleotides or amino acids and imposes general constraints on simultaneous changes in the substitution rate matrix. Given a multiple sequence alignment for each molecule of interest and a phylogenetic tree, the model can predict potential interactions within or between nucleic acids and proteins. Initial validation of the model is carried out using tRNA and 16S rRNA sequence data. The model accurately identifies the secondary interactions of tRNA as well as several known tertiary interactions. In addition, results on 16S rRNA data indicate that this general and simple coevolutionary model outperforms several other parametric and nonparametric methods in predicting secondary interactions. Furthermore, the majority of the putative predictions exhibit either direct contact or proximity of the nucleotide pairs in the three-dimensional structure of the Thermus thermophilus ribosomal small subunit. The results on RNA data suggest that a general model of coevolution might be applied to other types of interactions between protein, DNA, and RNA molecules.
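The idea of extending a single-site substitution process to a pair of sites can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the parameterization used in the article, which the abstract does not specify. The assumptions are: a Jukes-Cantor-style single-site process, a set of "compatible" pair states (Watson-Crick plus G-U wobble), a hypothetical `penalty` factor applied to single changes that break a compatible pair, and a hypothetical `double_rate` for simultaneous changes between compatible pairs. With `penalty=1.0` and `double_rate=0.0` the sketch reduces to two sites evolving independently.

```python
# Joint states for a pair of RNA sites: 16 dinucleotide states.
NUC = "ACGU"
STATES = [a + b for a in NUC for b in NUC]

# Assumption: these pair states are treated as "compatible" (interacting).
COMPATIBLE = {"AU", "UA", "GC", "CG", "GU", "UG"}


def single_site_rate(x, y, rate=1.0):
    """Jukes-Cantor-style rate between two nucleotides (assumed for simplicity)."""
    return rate / 3.0 if x != y else 0.0


def pair_rate(s, t, penalty=1.0, double_rate=0.0):
    """Off-diagonal rate from joint state s to joint state t.

    penalty and double_rate are hypothetical parameters for this sketch.
    """
    changes = (s[0] != t[0]) + (s[1] != t[1])
    if changes == 0:
        return 0.0
    if changes == 1:
        # Only one site changes: start from the independent single-site rate.
        if s[0] != t[0]:
            r = single_site_rate(s[0], t[0])
        else:
            r = single_site_rate(s[1], t[1])
        # Down-weight single changes that break a compatible pair.
        if s in COMPATIBLE and t not in COMPATIBLE:
            r *= penalty
        return r
    # Both sites change at once: allowed only between compatible pairs.
    if s in COMPATIBLE and t in COMPATIBLE:
        return double_rate
    return 0.0


def rate_matrix(penalty=1.0, double_rate=0.0):
    """16x16 rate matrix as nested lists; each row sums to zero."""
    q = [[pair_rate(s, t, penalty, double_rate) for t in STATES] for s in STATES]
    for i, row in enumerate(q):
        row[i] = -sum(row)  # diagonal entry balances the off-diagonal rates
    return q


independent = rate_matrix()                            # no coupling between sites
coevolving = rate_matrix(penalty=0.1, double_rate=0.5)  # coupled sites
```

In a full analysis, each candidate pair of alignment columns would be scored by comparing the likelihood of the data on the phylogenetic tree under the coupled and independent matrices. That step is omitted here.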
Key Words: coevolution; continuous-time Markov models; RNA tertiary interactions; RNA secondary interactions
Peter Lockhart, Associate Editor