MBE Advance Access originally published online on March 6, 2006
Molecular Biology and Evolution 2006 23(5):1085-1094; doi:10.1093/molbev/msj118
Research Article
Survey of Conserved Alternative Splicing Events of mRNAs Encoding SR Proteins in Land Plants

* Faculty of Bio-Science, Nagahama Institute of Bio-Science and Technology, Siga, Japan; and
Ochanomizu University, Tokyo, Japan
E-mail: k_iida{at}nagahama-i-bio.ac.jp.
Abstract
The serine/arginine-rich (SR) protein family plays an important role in constitutive and alternative splicing (AS). These proteins regulate AS in a tissue-specific and stress-responsive manner. Pre-mRNAs encoding SR proteins are often alternatively spliced, and these AS events may be important for the regulation of AS events of other pre-mRNAs. In this study, we analyzed AS events of SR proteins in Arabidopsis thaliana and Oryza sativa (rice). We found three sets of AS events conserved between Arabidopsis and rice. These conserved AS events were found in the plant-novel-SR protein, SC35-like (SCL), and two-Zn-knuckles-type 9G8 subfamilies. Each member of these subfamilies has at least one RNA recognition motif (RRM) and at least one intron in the RRM-encoding region. We found that the conserved AS events occurred in these introns and that, in each case, they resulted in mature mRNAs encoding proteins with incomplete RRMs. To search for the evolutionary origin of these AS events, we analyzed SR proteins in Physcomitrella patens (moss) in addition to those in Arabidopsis and rice. We found moss homologues of the plant-novel-SR protein, SCL, and two-Zn-knuckles-type 9G8 subfamilies in silico, and these homologues have long introns at the same locations as the conserved AS sites in Arabidopsis and rice. Among the Arabidopsis SR protein genes, such long introns are quite specific to alternatively spliced introns. The long introns found in the moss SR protein genes strongly suggest that AS events similar to those in Arabidopsis and rice might occur in moss SR protein genes. We traced the evolutionary origin of the conserved AS events to 400 MYA, when plants first invaded land. These events are likely important in the regulation of whole AS events and likely contribute to the complicated transcriptome described by AS. The complicated transcriptome created by regulated AS events might have provided plants tolerance against droughts or temperature shifts and given them the ability to live on land.
Key Words: Arabidopsis thaliana; Oryza sativa; Physcomitrella patens; transcriptome; stress response; land plant evolution
Introduction
Alternative splicing (AS) is a mechanism by which multiple forms of mature mRNAs are made from a single precursor mRNA (pre-mRNA). In Arabidopsis and rice (Oryza sativa), 10%–20% of all pre-mRNAs undergo AS (Kikuchi et al. 2003
The SR proteins of Arabidopsis are classified into seven subfamilies: SF2/ASF, SC35, one-Zn-knuckle-type 9G8, two-Zn-knuckles-type 9G8, SCL, plant-novel-SR protein, and SR45 (Kalyna and Barta 2004). For Arabidopsis, each subfamily contains more than two members, except for the SC35 and SR45 subfamilies. Although one might expect gene duplication events in this family, no reports to date have examined whether SR protein AS events have a common evolutionary origin. On the other hand, several groups have reported AS events of pre-mRNAs encoding SR proteins in other species (Gao, Gordon-Kamm, and Lyznik 2004; Gupta et al. 2005) but did not determine whether the AS events in those species were conserved. In this study, we compared AS events of mRNAs encoding SR proteins in Arabidopsis and rice and attempted to determine the evolutionary origin of these AS events.
In tracing the origin of the AS events, moss (Physcomitrella patens) is an important target. Mosses and flowering plants are both land plants but are thought to have diverged about 400 MYA (Nishiyama et al. 2003), about twice as long ago as the divergence of Arabidopsis and rice (145–206 MYA; Yu et al. 2002). By analyzing moss, we expected to obtain evolutionarily primitive information about AS events of SR proteins. Although the likelihood of finding conserved AS events in moss was low (because so little moss transcript data were known), we expected to find AS candidates from the moss genomic sequence. In Arabidopsis, the alternatively spliced introns of the genes of SR proteins are often remarkably long (>400 bp) when compared to other introns in Arabidopsis (Kalyna and Barta 2004). Most such introns in the Arabidopsis SR protein genes are alternatively spliced. Based on this remarkable property, we studied the possibility of AS events in moss genes encoding SR proteins. We compared probable AS events to those found in Arabidopsis and rice and traced the evolutionary origin of these conserved AS events.
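The long-intron criterion described above can be illustrated with a minimal sketch. This is not the authors' code: the 400 bp threshold comes from the text, but the interval representation, the function names, and the example coordinates are assumptions made purely for illustration.

```python
from typing import List, Tuple

# 1-based inclusive (start, end) genomic coordinates; a hypothetical representation.
Interval = Tuple[int, int]

LONG_INTRON_BP = 400  # threshold quoted in the text (>400 bp)


def introns_from_exons(exons: List[Interval]) -> List[Interval]:
    """Return the introns lying between consecutive exons of one gene model."""
    ordered = sorted(exons)
    return [
        (left_end + 1, right_start - 1)
        for (_, left_end), (right_start, _) in zip(ordered, ordered[1:])
        if right_start - left_end > 1
    ]


def long_introns(exons: List[Interval], min_len: int = LONG_INTRON_BP) -> List[Interval]:
    """Flag introns strictly longer than min_len as candidate AS sites."""
    return [(s, e) for s, e in introns_from_exons(exons) if e - s + 1 > min_len]


# Invented gene model, not real moss coordinates.
example_exons = [(1, 120), (201, 350), (1001, 1150)]
print(long_introns(example_exons))  # [(351, 1000)]
```

A long intron flagged this way is only a candidate; as the text notes, transcript evidence is still needed to confirm that the intron is alternatively spliced.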
Materials and Methods
Data Set
For Arabidopsis thaliana, we used the complete genome sequence released by The Institute for Genomic Research (TIGR) database (Haas et al. 2002
Identification of Exon-Intron Structures and AS Events
To detect AS events, we identified the exon-intron structures of SR protein genes. For Arabidopsis and rice, we identified exon-intron structures by mapping transcripts to the genomes. We mapped transcripts in two steps. First, we roughly mapped transcripts to the genomes using Blast and determined their loci. In the next step, we precisely aligned the transcripts to the loci sequences using GeneSeqer (Brendel, Xing, and Zhu 2004) and identified the exon-intron structures of SR protein genes. We identified AS events based on exon-intron structures (Okazaki et al. 2002; Iida et al. 2004). For each locus, we clustered transcripts from the locus and determined the genomic exon-intron structures. Nucleotides were treated as genomic exon nucleotides if they were found in an exon of any transcript. We compared the exon-intron structures of each transcript and genome and identified AS events.
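The comparison described in this paragraph can be sketched as follows. This is a simplified illustration under stated assumptions, not the pipeline actually used: exons are assumed to be already mapped to genomic coordinates (the Blast and GeneSeqer steps are not reproduced), and the function names and example transcripts are hypothetical. A segment is reported when a transcript splices it out although it is a genomic exon nucleotide, that is, exonic in at least one transcript of the locus.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # 1-based inclusive genomic coordinates (assumed)


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Union of intervals, merging overlapping and directly adjacent ones."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def introns_of(exons: List[Interval]) -> List[Interval]:
    """Introns between consecutive exons of a single transcript."""
    ordered = sorted(exons)
    return [
        (a_end + 1, b_start - 1)
        for (_, a_end), (b_start, _) in zip(ordered, ordered[1:])
        if b_start - a_end > 1
    ]


def find_as_regions(transcripts: Dict[str, List[Interval]]) -> Dict[str, List[Interval]]:
    """For each transcript, list segments it removes that are exonic elsewhere."""
    # Genomic exons: nucleotides found in an exon of any transcript of the locus.
    genomic_exons = merge_intervals(
        [exon for exons in transcripts.values() for exon in exons]
    )
    events: Dict[str, List[Interval]] = {}
    for name, exons in transcripts.items():
        hits: List[Interval] = []
        for i_start, i_end in introns_of(exons):
            for g_start, g_end in genomic_exons:
                lo, hi = max(i_start, g_start), min(i_end, g_end)
                if lo <= hi:  # intron overlaps a genomic exon
                    hits.append((lo, hi))
        events[name] = hits
    return events


# Invented locus with a cassette exon (151-200) and an alternative donor (101-120).
locus = {
    "t1": [(1, 100), (301, 400)],
    "t2": [(1, 100), (151, 200), (301, 400)],
    "t3": [(1, 120), (301, 400)],
}
for name, regions in find_as_regions(locus).items():
    print(name, regions)
```

The sketch reports only where structures disagree; classifying each disagreement as a cassette exon, alternative donor/acceptor, or retained intron would need an additional step that is not shown here.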
Identification of Conserved AS Events
We surveyed for conserved AS events based on the positions of alternatively spliced introns on multiple alignments of amino acid sequences. We used reference sequences for the multiple alignments. For Arabidopsis and rice genes, we used the sequences annotated by TIGR as the reference sequences. In loci with multiple spliced forms, some sequences have incomplete domain organization. We chose sequences with the characteristic domain organization of each subfamily. For each gene, we compared the reference sequence and transcripts from the locus and mapped the AS events to the intron positions of the reference sequence. We compared the positions of alternatively spliced introns across the multiple alignments and defined AS events at the same position on a multiple alignment as "conserved AS events." We made multiple alignments of reference amino acid sequences of each subfamily using ClustalW (Thompson, Higgins, and Gibson 1994) and determined the conserved AS events from the alignments. We created phylogenetic trees to study the gene duplication events in the evolutionary pathways leading to moss, Arabidopsis, and rice. We made phylogenetic trees using the maximum likelihood method of the PHYLIP software package (http://evolution.genetics.washington.edu/phylip.html). The phylogenetic trees are displayed using TreeView (http://taxonomy.zoology.gla.ac.uk/rod/treeview.html).
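The positional definition of "conserved AS events" can be illustrated with a small sketch. It assumes that an alignment is already available as gapped strings (for example, exported from ClustalW) and that each alternatively spliced intron is recorded as the 0-based index of the residue it follows in the ungapped reference sequence. The requirement that a shared column involve at least two species, the function names, and the toy sequences are all assumptions for illustration, not the authors' implementation.

```python
from collections import defaultdict
from typing import Dict, List, Set


def residue_to_column(aligned: str) -> List[int]:
    """Map each ungapped residue index to its column in the alignment."""
    return [col for col, ch in enumerate(aligned) if ch != "-"]


def conserved_as_columns(
    alignment: Dict[str, str],
    as_introns: Dict[str, List[int]],
    species: Dict[str, str],
) -> Dict[int, Set[str]]:
    """Return alignment columns carrying AS introns from at least two species."""
    by_column: Dict[int, Set[str]] = defaultdict(set)
    for gene, aligned in alignment.items():
        columns = residue_to_column(aligned)
        for residue_index in as_introns.get(gene, []):
            by_column[columns[residue_index]].add(gene)
    return {
        col: genes
        for col, genes in by_column.items()
        if len({species[g] for g in genes}) >= 2
    }


# Toy alignment with invented gene names and sequences.
alignment = {
    "geneA": "MK-TLVRG",
    "geneB": "MKSTL-RG",
    "geneC": "MK-TLVRG",
}
as_introns = {"geneA": [3], "geneB": [4], "geneC": [5]}
species = {"geneA": "Arabidopsis", "geneB": "rice", "geneC": "Arabidopsis"}
print(conserved_as_columns(alignment, as_introns, species))
```

In the toy data, the introns of geneA and geneB fall at different residue indices but land in the same alignment column, which is why positions are compared on the alignment rather than on the raw sequences.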
Assembling the Moss Genomic Sequence Fragments
The genome sequence of moss (P. patens) was not assembled as of September 2005, so we could use only fragment sequences. We needed genomic sequences to identify the exon-intron structures of SR protein genes and to search for AS events in moss. We assembled several parts of the moss genome. First, we searched for SR protein homologues from the moss transcript set using Blast. Next, we searched for fragment sequences of loci encoding moss SR protein genes, using Blast with the identified transcript sequences as queries. Finally, we assembled the sequences using the TIGR assembler (Sutton et al. 1995).
Results
Conserved AS Events in the SR Protein Family
For our analyses of SR proteins in three species, we first examined Arabidopsis genes encoding SR proteins. SR protein studies in Arabidopsis are quite advanced (Kalyna and Barta 2004
The Plant-Novel-SR Protein Subfamily
We found two members of the plant-novel-SR protein subfamily in the rice genome (table 1). We aligned the amino acid sequences of four members of Arabidopsis and two members of rice and compared the exon-intron structures of these genes. All had the same exon-intron structures for RRM1 and RRM2 (fig. S1, Supplementary Material online). We could not obtain exact multiple alignments for the C-terminal arginine-serine-rich (RS) domain because of its low-complexity amino acid sequences. We found that introns of the RRM1-encoding regions of each member were alternatively spliced and that they were conserved AS events (figs. 1 and 2 and table 2). In each gene, AS events generated stop codons in RRM1-encoding regions (fig. S1, Supplementary Material online). Alternatively spliced forms of the mRNA encoded proteins with truncated RRM1. The AS events in atRSp31-At3g61860, atRSp32-At2g46610, and atRSp40-At4g25500 were previously reported by Kalyna and Barta (2004)
SCL Subfamily
In the SCL subfamily, we found six homologues in the rice genome (table 1). Similar to our findings for the plant-novel-SR protein subfamily, the exon-intron structures of all members were the same, excluding the RS domains (fig. S1B, Supplementary Material online). AS events of atSCL30a-At3g13570 and atSCL33-At1g55310 have been reported previously (Kalyna and Barta 2004
Two-Zn-Knuckles-Type 9G8 Subfamily
We found four members of the two-Zn-knuckles-type 9G8 subfamily in the rice genome (table 1). Several AS events have previously been reported for this subfamily (Kalyna, Lopato, and Barta 2003; Kalyna and Barta 2004; Isshiki, Tsumoto, and Shimamoto 2006). We reanalyzed AS events in this subfamily to find conserved AS events. The exon-intron structures of all Arabidopsis and rice homologues were the same, excluding the RS domains (fig. S1C, Supplementary Material online). We found conserved AS events in the introns of RRM-encoding regions of two Arabidopsis members (atRSZ33-At2g37340 and atRSZ34-At3g53500) and three rice members (osRSZ37b-Os03g17710, osRSZ36-Os05g02880, and osRSZ37a-Os01g06290) (table 2). The alternatively spliced introns were located 9 aa (26 bp) upstream relative to the conserved AS events of the plant-novel-SR protein subfamily genes (fig. 2). For the two Arabidopsis members and the rice members osRSZ36-Os05g02880 and osRSZ37b-Os03g17710, conserved AS events created stop codons in these transcripts (fig. S4, Supplementary Material online). In osRSZ37a-Os01g06290, 43 bp of exon3a was selected in a mutually exclusive manner with 40 bp of exon3b (fig. 4A). Even if exon3b was used, there were no stop codons or frameshifts. We checked the RRM consensus sequences of each AS isoform using a Pfam search (Bateman et al. 2004). Mature mRNA of exon3b (reference sequence) encoded amino acid sequences similar to the RRM consensus sequence, with an E value of 2.6 x 10^-6. On the other hand, the AS isoform of exon3a encoded amino acid sequences similar to RRM, with an E value of 1.0 x 10^-4. This AS isoform thus encoded an amino acid sequence of the RRM region with a weakened RRM consensus. For the Os05g02880 member of rice, we found an alternative initiation event in addition to the AS event. This alternative initiation event generated a transcript starting within the intron regions in which the conserved AS events of this subfamily had been found (fig. 4B). Similar to the conserved AS events, this alternative initiation event generated an mRNA encoding a protein with a truncated RRM. Phylogenetic tree analysis indicated that the conserved AS events had an evolutionary root prior to the branching of Arabidopsis and rice (fig. 3C).
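The statement that swapping the 43 bp and 40 bp segments causes no frameshift follows from simple arithmetic: the lengths differ by 3 bp, a multiple of the codon length. The sketch below checks this and also shows how an in-frame stop codon could be looked for. The function names and the short test sequences are invented for illustration and are not taken from the rice transcripts.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def preserves_frame(len_a: int, len_b: int) -> bool:
    """True if exchanging one exon for the other keeps the reading frame."""
    return (len_a - len_b) % 3 == 0


def first_in_frame_stop(cds: str, frame: int = 0) -> Optional[int]:
    """Return the 0-based index of the first in-frame stop codon, or None."""
    cds = cds.upper()
    for i in range(frame, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i
    return None


print(preserves_frame(43, 40))              # True: 43 - 40 = 3
print(first_in_frame_stop("ATGAAATAGCCC"))  # 6
print(first_in_frame_stop("ATGATGAAA"))     # None: TGA occurs only out of frame
```

A frame-preserving swap can still change the encoded residues, which is consistent with the weaker Pfam match reported for the exon3a isoform.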
Long Introns in Moss SR Protein Homologues
To trace the evolutionary origin of the conserved AS events, we analyzed SR proteins in moss (P. patens). Moss and flowering plants diverged about 400 MYA (Nishiyama et al. 2003
Discussion
Conserved AS Events in Land Plants
We found three sets of AS events in SR protein families that were evolutionarily conserved between monocots and dicots. Each event was an intrasubfamily event and was found in the following subfamilies: plant-novel-SR protein, SCL, and two-Zn-knuckles-type 9G8 (figs. 1 and 2 and table 2). Each of the conserved AS events included several types of AS events, including cassette exon type, alternative donor/acceptor type, and retained intron type. Because the alternatively spliced introns were at the same positions across multiple alignments, the evolutionary conservation of these events was clear. The type of AS event seemed to change along with evolutionary divergence. Although the types of AS events varied, conserved AS events had similar properties in that they generated mRNA encoding proteins with incomplete (truncated or weakened) RRMs. We hypothesized that in regard to AS events of SR proteins, the selection pressure was toward encoding amino acid sequences for conserved processes. In other words, all AS event types that generate mRNAs encoding proteins with incomplete RRMs might actually be caused by the same selection pressure. Given this point of view, the origin of the alternative initiation event found in the osRSZ36-Os05g02880 member of the two-Zn-knuckles-type 9G8 subfamily was the same as that of the conserved AS events across the subfamily.
Although all three conserved AS events were found near the centers of RRM-coding regions, the positions of the three events were not identical (fig. 2). We regarded each of the three AS events as conserved only in each subfamily but not across subfamilies. We emphasize that despite the fact that they seemed to be of differing origins, all three conserved AS events generated stop codons in RRM-encoding regions. This result suggested the importance of generating mRNAs encoding proteins with incomplete RRMs by AS.
Functions of Conserved AS Events
Each conserved AS event was found in an RRM-coding region. We expected that these AS events would greatly influence SR protein function because the RRMs are essential for SR protein function (Chandler et al. 1997). The proteins with truncated RRMs that were generated by the AS events should thus be nonfunctional. Another possibility is that mRNAs with abnormal stop codons might be targeted by nonsense-mediated mRNA decay (Lewis, Green, and Brenner 2003). In either case, the function of the SR proteins might be decreased. For mouse SRp20 and human SC35, "autoregulating" AS mechanisms have been reported (Jumaa and Nielsen 1997; Sureau et al. 2001). In Arabidopsis, autoregulation for atSRp30 and atRSZ33 was also reported (Lopato et al. 1999; Kalyna, Lopato, and Barta 2003). In these cases, pre-mRNA splicing of an SR protein itself was influenced by the quantity of its own protein products. At the same time, pre-mRNA splicing controlled the quantity of its own protein products. We propose that a similar regulation mechanism could occur in the SR protein families of Arabidopsis and rice. These are not necessarily cases of self-regulation in Arabidopsis or rice, however, because many SR protein genes were the result of gene duplications. Duplicated SR proteins would create complicated regulation pathways. Regardless of self-regulation or nonself-regulation, we assume that the conserved AS events are important in controlling the amount of functional SR protein products.
In our previous study, we reported large-scale changes of AS profiles according to the expressing organs and environmental stresses (Iida et al. 2004). In that report, we observed induced expression and AS events of mRNA encoding SR proteins under similar conditions. Based on these results, we assumed that regulating the pre- and posttranscriptional levels of SR proteins was critical in controlling whole AS profiles. Our hypothesis of transcriptome regulation mediated by AS of SR protein mRNAs is shown in figure 6. The expression levels of SR protein products are regulated by both transcriptional and AS control. Induction of pre-mRNA expression for SR proteins, combined with AS events that create mature mRNAs encoding truncated proteins, induces oscillating expression of the SR protein product. Entire AS profiles are influenced by SR proteins, and transcriptomes are constructed to adapt to each condition. Other elements, such as protein degradation, phosphorylation events, and localizations, are certainly also important for the regulation of AS. However, transcript-level regulation mediated by conserved AS events must form a critical system of regulation as the events are highly conserved in evolution.
We note that all subfamilies containing the conserved AS events (plant-novel-SR, SCL, and two-Zn-knuckles-type 9G8) were plant specific. These subfamilies are important in regulating AS (Lopato et al. 1999
The Origin of Conserved AS Events and Land Plant Evolution
Each conserved AS event originated prior to the branching between Arabidopsis and rice, a fact supported by the results of our phylogenetic tree analysis (fig. 3) and the widespread AS events in each subfamily. Arabidopsis thaliana and O. sativa (a dicot and a monocot, respectively) diverged 145–206 MYA (Yu et al. 2002), and we traced the conserved AS events back to this era. We obtained results indicating the possibility of an even more ancient origin for these conserved AS events. Long introns found in moss SR protein homologues might be alternatively spliced, and the origin of the conserved AS events could be as ancient as 400 MYA, when moss and flowering plants diverged and ancestral plants first invaded land. At that time, the ancestors of land plants were exposed to drought conditions and drastic temperature changes. Land plants acquired life cycles with various developmental stages, as well as various tissues and organs. For complicated life cycles, developmental stages, tissues, and organs, a more complicated transcriptome was required. We speculate that the AS events found in the SR proteins greatly contributed to obtaining a transcriptome that was adapted for each new requirement. A more complicated transcriptome might well have allowed plants to live on land.
Supplementary Material
Supplementary figures S1–S5 and multiple alignments of transcript sequences for each locus are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgements
This work was supported by Grants-in-Aid for Scientific Research (C) and for Priority Area "Genome Information Science" to M.G. from the Ministry of Education, Culture, Sports, Science, and Technology of Japan.
Footnotes
Takashi Gojobori, Associate Editor
References
Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389–3402.
Bateman, A., L. Coin, R. Durbin et al. (13 co-authors). 2004. The Pfam protein families database. Nucleic Acids Res. 32:D138–D141.
Brendel, V., L. Xing, and W. Zhu. 2004. Gene structure prediction from consensus spliced alignment of multiple ESTs matching the same genomic locus. Bioinformatics 20:1157–1169.
Carninci, P., T. Kasukawa, S. Katayama et al. (194 co-authors). 2005. The transcriptional landscape of the mammalian genome. Science 309:1559–1563.
Chandler, S. D., A. Mayeda, J. M. Yeakley, A. R. Krainer, and X. D. Fu. 1997. RNA splicing specificity determined by the coordinated action of RNA recognition motifs in SR proteins. Proc. Natl. Acad. Sci. USA 94:3596–3601.
Gao, H., W. J. Gordon-Kamm, and L. A. Lyznik. 2004. ASF/SF2-like maize pre-mRNA splicing factors affect splice site utilization and their transcripts are alternatively spliced. Gene 339:25–37.
Graveley, B. R. 2000. Sorting out the complexity of SR protein functions. RNA 6:1197–1211.
Gupta, S., B. B. Wang, G. A. Stryker, M. E. Zanetti, and S. K. Lal. 2005. Two novel arginine/serine (SR) proteins in maize are differentially spliced and utilize non-canonical splice sites. Biochim. Biophys. Acta 1728:105–114.
Haas, B. J., N. Volfovsky, C. D. Town, M. Troukhan, N. Alexandrov, K. A. Feldmann, R. B. Flavell, O. White, and S. L. Salzberg. 2002. Full-length messenger RNA sequences greatly improve genome annotation. Genome Biol. 3:RESEARCH0029.
Iida, K., M. Seki, T. Sakurai, M. Satou, K. Akiyama, T. Toyoda, A. Konagaya, and K. Shinozaki. 2004. Genome-wide analysis of alternative pre-mRNA splicing in Arabidopsis thaliana based on full-length cDNA sequences. Nucleic Acids Res. 32:5096–5103.
Isshiki, M., A. Tsumoto, and K. Shimamoto. 2006. The serine/arginine-rich protein family in rice plays important roles in constitutive and alternative splicing of pre-mRNA. Plant Cell 18:146–158.
Jumaa, H., and P. J. Nielsen. 1997. The splicing factor SRp20 modifies splicing of its own mRNA and ASF/SF2 antagonizes this regulation. EMBO J. 16:5077–5085.
Kalyna, M., and A. Barta. 2004. A plethora of plant serine/arginine-rich proteins: redundancy or evolution of novel gene functions? Biochem. Soc. Trans. 32:561–564.
Kalyna, M., S. Lopato, and A. Barta. 2003. Ectopic expression of atRSZ33 reveals its function in splicing and causes pleiotropic changes in development. Mol. Biol. Cell 14:3565–3577.
Kan, Z., D. States, and W. Gish. 2002. Selecting for functional alternative splices in ESTs. Genome Res. 12:1837–1845.
Kikuchi, S., K. Satoh, T. Nagata et al. (74 co-authors). 2003. Collection, mapping, and annotation of over 28,000 cDNA clones from japonica rice. Science 303:376–379.
Lewis, B. P., B. E. Green, and S. E. Brenner. 2003. Evidence for the widespread coupling of alternative splicing and nonsense-mediated mRNA decay in humans. Proc. Natl. Acad. Sci. USA 100:189–192.
Lopato, S., M. Kalyna, S. Dorner, R. Kobayashi, A. R. Krainer, and A. Barta. 1999. atSRp30, one of two SF2/ASF-like proteins from Arabidopsis thaliana, regulates splicing of specific plant genes. Genes Dev. 13:987–1001.
Macknight, R., M. Duroux, R. Laurie, P. Dijkwel, G. Simpson, and C. Dean. 2002. Functional significance of the alternative transcript processing of the Arabidopsis floral promoter FCA. Plant Cell 14:877–888.
Nishiyama, T., T. Fujita, T. Shin-I et al. (12 co-authors). 2003. Comparative genomics of Physcomitrella patens gametophytic transcriptome and Arabidopsis thaliana: implication for land plant evolution. Proc. Natl. Acad. Sci. USA 100:8007–8012.
Okazaki, Y., M. Furuno, T. Kasukawa et al. (137 co-authors). 2002. Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature 420:563–573.
Seki, M., M. Narusaka, A. Kamiya et al. (20 co-authors). 2002. Functional annotation of a full-length Arabidopsis cDNA collection. Science 296:141–145.
Shi, H., L. Xiong, B. Stevenson, T. Lu, and J. K. Zhu. 2002. The Arabidopsis salt overly sensitive 4 mutants uncover a critical role for vitamin B6 in plant salt tolerance. Plant Cell 14:575–588.
Sureau, A., R. Gattoni, Y. Dooghe, J. Stevenin, and J. Soret. 2001. SC35 autoregulates its expression by promoting splicing events that destabilize its mRNAs. EMBO J. 20:1785–1796.
Sutton, G., O. White, D. Adams, and A. Kerlavage. 1995. TIGR assembler: a new tool for assembling large shotgun sequencing projects. Genome Sci. Technol. 1:9–18.
Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680.
Wang, B. B., and V. Brendel. 2004. The ASRG database: identification and survey of Arabidopsis thaliana genes involved in pre-mRNA splicing. Genome Biol. 5:R102.
Wheeler, D. L., D. M. Church, S. Federhen et al. (11 co-authors). 2003. Database resources of the National Center for Biotechnology. Nucleic Acids Res. 31:28–33.
Yoshimura, K., Y. Yabuta, T. Ishikawa, and S. Shigeoka. 2002. Identification of a cis element for tissue-specific alternative splicing of chloroplast ascorbate peroxidase pre-mRNA in higher plants. J. Biol. Chem. 277:40623–40632.
Yu, J., S. Hu, J. Wang et al. (100 co-authors). 2002. A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science 296:79–92.