MBE Advance Access originally published online on March 20, 2007
Molecular Biology and Evolution 2007 24(5):1140-1148; doi:10.1093/molbev/msm045
Research Articles
Evolutionary Conservation of UTR Intron Boundaries in Cryptococcus

* Allan Wilson Centre for Molecular Ecology and Evolution, Massey University, Palmerston North, New Zealand
Microbial Analysis Group, Broad Institute of MIT and Harvard University
E-mail: scottwroy{at}gmail.com.
| Abstract |
|---|
Despite significant progress, the general functional and evolutionary significance of the untranslated regions (UTRs) of eukaryotic transcripts remains mysterious. Particularly mysterious is the common occurrence of spliceosomal introns in transcript UTRs because UTR splicing is not necessary for restoration of transcript coding sequence. In general, it is not known to what extent such splicing performs an important function or merely represents spliceosomal "noise." We conducted the first analysis of evolutionary conservation of UTR splicing. Among 4 species from the Cryptococcus neoformans species complex, we find high levels of conservation of UTR intron boundary sequences, strongly suggesting that UTR intron splicing is conserved by purifying selection. We estimate that 50–90% of splice boundaries are maintained by selection. Donor site sequences are more highly conserved than acceptor sequences, and splicing boundaries are more conserved in 5' UTRs than in 3' UTRs. In addition, we report a variety of differences between patterns of UTR splicing in Cryptococcus and corresponding patterns in animals and plants. These results focus attention on the functional roles of eukaryotic UTRs and deepen the mystery of UTR intron splicing.
Key Words: untranslated regions; genome evolution; purifying selection
| Introduction |
|---|
Spliceosomal introns are sequences in eukaryotic genomes that are removed from RNA transcripts by the spliceosome prior to nuclear export and translation. The apparent absence of a general function for introns, as well as their peculiar phylogenetic distribution across eukaryotic lineages, makes them a central mystery of genome evolution (for recent reviews, see Rogozin et al. 2005
However, the vast majority of theoretical and empirical work on spliceosomal intron evolution has focused on introns that interrupt coding sequences, overlooking spliceosomal introns present in both 5' and 3' untranslated regions (UTRs) of protein-coding transcripts for many species (Pesole et al. 2001; Chung et al. 2006; Hong et al. 2006). Such UTR introns are found across a wide variety of eukaryotic lineages and reach large numbers in at least some plants and animals (Chung et al. 2006; Hong et al. 2006). The existence and broad phylogenetic distribution of UTR introns remain quite mysterious; whereas the removal of introns from coding regions is clearly necessary for accurate translation of full-length proteins, the necessity of splicing of noncoding regions is less obvious. In particular, the relatively high frequency of introns in 3' UTRs is surprising, as the presence of stop codons upstream of these intron boundaries might be expected to cause these transcripts to be targeted for degradation by the nonsense-mediated decay (NMD) pathway (Hentze and Kulozik 1999). One possibility is that UTR intron presence affects posttranscriptional expression level, as recently found for the EF1α-A3 gene of Arabidopsis (Chung et al. 2006).
There is an increasing appreciation of the posttranscriptional regulatory effects of UTRs. In the 5' UTR, short open reading frames (ORFs) upstream of the translation initiation site are known to regulate translation levels (Morris and Geballe 2000; Meijer and Thomas 2002; Vilela and McCarthy 2003). In general, ATG triplets lying upstream of the true translation initiation site have been shown to be preferentially conserved (Churbanov et al. 2005; Zhang and Dietrich 2005; Crowe et al. 2006), indicating a functional role for these sites. One of the few general proposed functions of 5' UTR splicing to date is minimization of UTR length in order to avoid mutation to ATG codons, which could cause repression of translation (Hong et al. 2006). The 3' UTR sometimes contains targets for miRNAs, allowing for specific posttranscriptional regulation (Lai 2002; Bartel 2004; Lall et al. 2006).
A central reason that the evolution of UTR introns has largely escaped consideration to date is the rapid sequence evolution of UTRs (Larizza et al. 2002; Shabalina et al. 2004). In general, UTR sequences evolve more rapidly than coding sequences, and UTR sequences lack a clear general organizing principle such as coding frame, making alignment of UTR sequences over even moderate evolutionary distances difficult. In addition, computational prediction of UTR introns is plagued by similar uncertainties, and the 3' bias of cDNA libraries obscures splicing pattern in 5' regions of transcripts. However, the increasing availability of closely related clusters of well-characterized full-genome sequences and large numbers of full-length cDNAs from intron-rich species finally allows for the study of UTR splicing.
We studied the evolution of UTR splicing in 4 species of the Cryptococcus neoformans species complex (Xu et al. 2000; Loftus et al. 2005). Pairwise synonymous divergence within the clade has been previously estimated as ranging from 11% to 37% (Stajich JE, Neafsey DE, unpublished observations), allowing for alignment of many noncoding regions. We studied all C. neoformans UTR introns for which 4-way orthologous sequences were known: 442 introns in 334 5' UTRs and 180 introns in 126 3' UTRs. We determined levels of conservation at intron splice sites across the clade.
Our major findings include the following: 1) UTR intron boundaries are generally highly conserved; 2) 5' UTR intron boundaries are more highly conserved than 3' UTR intron boundaries; 3) intron donor sites are more highly conserved than intron acceptor sites in both 5' and 3' UTRs; 4) we found no cases of (near) exact intron loss/gain; 5) there is a relatively high density of introns beginning with a GC dinucleotide, as opposed to the canonical GT, particularly in 3' UTRs; 6) we find significant interconversion of GC and GT donor sites; and 7) patterns of intron density and length in Cryptococcus vary significantly from previously reported patterns in plants and animals. In total, these results indicate purifying selection maintaining the majority of splice sites for UTR introns, particularly in 5' UTRs. One possible explanation for conservation of 5' UTR splicing could be the presence of alternative transcription or translation initiation sites. These results underscore the importance of UTRs in gene evolution.
| Methods |
|---|
Data Set
We compared orthologous UTRs between 4 members of the C. neoformans species complex. The genome assemblies of 4 strains of C. neoformans were obtained from the websites of the sequencing centers that produced them (strain JEC21: TIGR; strain WM276: Michael Smith Genome Center; and strains H99 and R265: Broad Institute). Whole-genome alignments were created using a multistep process using strain JEC21 as a reference. First, pairwise alignments between JEC21 and the other sequenced strains were created using PatternHunter (Ma et al. 2002
10) cases in which use of an alternative inframe upstream ATG extended the coding sequence relative to the TIGR annotation, we used this upstream start site. (Thus we conservatively defined 5' UTR introns. If in fact a downstream ATG is the true translation start site, this will lead to an actual UTR intron being identified as a coding sequence intron and excluded from the analysis.) We extracted the 334 5' UTR and 126 3' UTR 1:1 ortholog sets showing evidence of UTR splicing in C. neoformans. In total, there were 442 5' and 180 3' introns. UTR intron sequences were checked for canonical GT...AG or GC...AG boundaries. Visual inspection yielded no cases in which there was evidence for precise excision/insertion of an intron (i.e., with most/all of the intron removed with fewer than 10 bp on either side). Described analyses were performed by novel Perl programs.
We used BlastN to map all 23,000 available full-length JEC21 cDNAs against the genomic JEC21 sequence to determine variation in the position of the transcription start site.
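The boundary check mentioned in this section (canonical GT...AG or GC...AG introns) can be illustrated with a short sketch. This is not the authors' Perl code; the function name `classify_intron_boundaries` and the label strings are hypothetical, and the sketch assumes each intron is supplied as a plain sense-strand sequence.

```python
def classify_intron_boundaries(intron_seq: str) -> str:
    """Label an intron by its first and last two bases (hypothetical helper)."""
    seq = intron_seq.upper()
    if len(seq) < 4:
        # Too short to carry both a donor and an acceptor dinucleotide.
        return "too_short"
    donor, acceptor = seq[:2], seq[-2:]
    if acceptor != "AG":
        return "noncanonical"
    if donor == "GT":
        return "GT-AG"
    if donor == "GC":
        return "GC-AG"
    return "noncanonical"


# Toy sequences, not real Cryptococcus introns.
for toy in ("GTAAGTTTCTAACTTTAG", "gcaagtttctaactttag", "ATAAGTTTCTAACTTTAC"):
    print(toy, classify_intron_boundaries(toy))
```

Lower-case input is accepted because soft-masked genome sequence is common; whether the original pipeline handled masking this way is not stated in the text.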
Definition of Conserved Regions and Levels of Conservation
We utilized a simple definition of conserved regions as nucleotide positions without gaps or uncertain nucleotides (e.g., N). For each single nucleotide or dinucleotide in the alignment, we determined conservation across all 4 species. In addition, we tried a variety of other definitions of "conserved" nucleotide positions. We defined conserved positions as those in which flanking sequence (using total windows of 10, 14, or 20 bp, excluding the position in question) showed at least a threshold level of conservation (50% or 75%). Estimated levels of conservation were relatively constant for all definitions used (69–75%), as were estimated fractions of conserved intronic boundary sites. Therefore, we used the simple and straightforward criterion of all ungapped positions.
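As an illustration of the simple criterion described above, the following sketch counts ungapped alignment windows that are identical in all aligned sequences, for single nucleotides (k = 1) or dinucleotides (k = 2). It is a minimal reconstruction under stated assumptions, not the original program: the function name `fraction_conserved` is hypothetical, gaps are assumed to be written as `-`, and any character other than A, C, G, or T is treated as uncertain.

```python
from typing import List

VALID_BASES = set("ACGT")


def fraction_conserved(aligned: List[str], k: int = 1) -> float:
    """Fraction of ungapped, unambiguous k-mer windows identical in every sequence."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must all have the same length")
    seqs = [s.upper() for s in aligned]
    total = conserved = 0
    for i in range(length - k + 1):
        windows = [s[i:i + k] for s in seqs]
        # Skip windows that touch a gap or an uncertain base in any species.
        if not all(set(w) <= VALID_BASES for w in windows):
            continue
        total += 1
        if len(set(windows)) == 1:
            conserved += 1
    return conserved / total if total else float("nan")


# Toy 4-way alignment, not real data.
toy_alignment = ["ACGT-A", "ACGTTA", "ACCTNA", "ACGTTA"]
print(fraction_conserved(toy_alignment, k=1))  # single nucleotides
print(fraction_conserved(toy_alignment, k=2))  # dinucleotides
```

The window-based definitions (flanking conservation thresholds of 50% or 75%) are deliberately left out; they would add a filter on which positions enter `total`.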
| Results |
|---|
Patterns of UTR Splicing in Cryptococcus
The patterns of UTR splicing in C. neoformans are summarized in table 1. Mean and median intron lengths were 157.0 and 75 bp in 5' UTRs and 63.1 and 57 bp in 3' UTRs; thus, intron lengths in 5' but not 3' UTRs showed pronounced rightward skew. Length of 5' UTR introns was negatively correlated with exonic distance from the translation initiation site (r = −0.108, P = 0.015). However, introns very near the translation start site (within 12 bp) showed similar mean (176.1 bp) and median (70 bp) lengths to all 5' UTR introns.
Mean and median exon lengths were 132.8 and 91 bp in 5' UTRs and 164.5 and 113 bp in 3' UTRs. Terminal 5' UTR exons (the sequence between the 3' most intron and the translation start site) tended to be shorter, with a mean length of 108.3 bp and a median length of 62 bp, and there was a much higher frequency of short exons (31.3% of terminal exons had lengths less than 30 bp vs. 15.7% for nonterminal exons, P = 1.2 × 10⁻⁸ by a Fisher Exact test).
Introns constituted a much higher fraction of total UTR length in 5' UTRs (38.4%) than in 3' UTRs (17.2%). The vast majority of intron-containing 5' UTRs (87.6%) and 3' UTRs (80.6%) had a single intron; only 2.3% of 5' and 5.6% of 3' UTRs had more than 2. In total, there were 3.38 introns per kilobase of exonic sequence on average in intron-containing 5' UTRs and 2.56 per kilobase in 3' UTRs. Including both intron-containing and intronless UTRs, there was 1.01 intron per exonic kilobase in 5' UTRs and 0.29 in 3' UTRs. By comparison, there are 3.31 introns per exonic kilobase in coding sequences of C. neoformans.
Intronic and Exonic UTR Sequence Conservation
Levels of sequence conservation at ungapped positions were similar across different classes of sites. A total of 70.9% of all 5' UTR sites and 72.3% of all 3' UTR sites were conserved across all 4 species. In the 5' UTR, 71.6% of exonic sites and 69.7% of intronic sites were conserved across species; in the 3' UTR, 72.9% of exonic sites and 69.4% of intronic sites were conserved. For purposes of direct comparison with levels of conservation of intron boundary dinucleotides, we directly calculated levels of conservation of dinucleotides. In the 5' UTR, 55.3% of exonic dinucleotides and 51.0% of intronic dinucleotides were conserved across species (fig. 1a), slightly higher than expected from levels of single nucleotide conservation (71.6 x 71.6% = 51.3% and 69.7 x 69.7% = 48.5%, respectively). In the 3' UTR, 56.0% of exonic dinucleotides and 51.1% of intronic dinucleotides were conserved (vs. expected 53.1% and 48.2%, respectively).
Conservation of Intron Boundaries
In general, intron donor site dinucleotides were more highly conserved than acceptor sites (fig. 1a). In all, 88.1% (281/319) of donor dinucleotides in 5' UTRs and 83.6% (97/116) in 3' UTRs were conserved across all 4 species. Excluding 18 sites that varied between GC and GT in the 4 species, 91.5% (281/307) of 5' donors and 88.2% (97/110) of 3' donors were conserved. Acceptor dinucleotides showed 81.7% (268/328) conservation in 5' UTRs and 71.4% (85/119) conservation in 3' UTRs. The difference in conservation between all donor and all acceptor sites is significant (P = 0.0023, Fisher Exact test), as is the difference between all 5' and all 3' UTR intron boundary dinucleotides (P = 0.011).
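The Fisher Exact comparisons in this paragraph can be sketched with a small pure-Python implementation. The 2 × 2 tables below are an assumption about how the counts were pooled: conserved and nonconserved sites are summed over 5' and 3' UTRs for the donor versus acceptor test, and over donors and acceptors for the 5' versus 3' test. The function name `fisher_exact_two_sided` is hypothetical, and the two-sided P value uses the common rule of summing all tables no more probable than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def table_probability(x: int) -> float:
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = table_probability(a)
    p_value = sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)  # tolerance for floating-point ties
    )
    return min(1.0, p_value)


# Assumed pooling: (conserved, not conserved) for each class of site.
donor_vs_acceptor = fisher_exact_two_sided(281 + 97, 38 + 19, 268 + 85, 60 + 34)
five_vs_three_prime = fisher_exact_two_sided(281 + 268, 38 + 60, 97 + 85, 19 + 34)
print(f"donor vs. acceptor: P = {donor_vs_acceptor:.4f}")
print(f"5' vs. 3' boundaries: P = {five_vs_three_prime:.4f}")
```

If the pooling assumption matches what was actually tested, the printed values should fall close to the reported P = 0.0023 and P = 0.011; a different tabulation would give different values.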
Sequence conservation at the donor site extended beyond the GT/GC dinucleotide (fig. 1b). For 5' and 3' UTRs, 84.5% and 80.7% of nucleotides in positions 3–6 were conserved, respectively. No intronic conservation was found in the acceptor site beyond the AG dinucleotide: positions 3 to 6 showed slightly lower conservation than all intronic sites. Exonic sites flanking the donor site showed slightly elevated levels of conservation; exonic acceptor sites did not show elevated conservation (fig. 1b).
Estimated Fraction of Intron Boundary Conservation
The lower level of sequence divergence at intron boundaries strongly suggests selective conservation. In the simplest model, if a fraction f of intron boundaries were strictly conserved by selection while other boundaries were unconstrained by selection, we would expect that the level of intron boundary divergence would be (1 − f) times the neutral level of divergence. Using general intron dinucleotide conservation as the benchmark for the neutral divergence rate (51.0% conservation, i.e., 49.0% divergence, for 5' UTR intronic dinucleotides), we get (1 − f) × 49.0% = 11.9% for 5' UTR donor sites, yielding f = 75.7%. Excluding polymorphic GT/GC donor sites, we estimate f = 82.7%. These estimates for all classes of sites are given in table 2.
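The simple model above can be worked through numerically. The sketch below solves (1 − f) × neutral divergence = boundary divergence for f, taking the 51.0% conservation of 5' UTR intronic dinucleotides reported earlier as the neutral benchmark; the function name `fraction_under_selection` is hypothetical, and the counts are the 5' UTR counts given in the text.

```python
def fraction_under_selection(conserved: int, total: int,
                             neutral_conservation: float) -> float:
    """Solve (1 - f) * neutral_divergence = boundary_divergence for f."""
    boundary_divergence = 1.0 - conserved / total
    neutral_divergence = 1.0 - neutral_conservation
    return 1.0 - boundary_divergence / neutral_divergence


NEUTRAL_5P = 0.510  # conservation of 5' UTR intronic dinucleotides (from the text)

f_donor = fraction_under_selection(281, 319, NEUTRAL_5P)
f_donor_no_gc_gt = fraction_under_selection(281, 307, NEUTRAL_5P)
f_acceptor = fraction_under_selection(268, 328, NEUTRAL_5P)

print(f"5' donors: f = {f_donor:.1%}")
print(f"5' donors excluding GT/GC-variable sites: f = {f_donor_no_gc_gt:.1%}")
print(f"5' acceptors: f = {f_acceptor:.1%}")
```

With these inputs the estimates agree with the 75.7%, 82.7%, and 62.7% figures quoted in the text, which is consistent with a neutral divergence of 49.0% being the value used.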
Donor Boundary Sequences
2.5% (8/319) and 8.6% (10/116) of 5' and 3' introns, respectively, began with "GC" in C. neoformans (see example in fig. 2a). In total, there were 16 5' sites and 14 3' sites that were GC in at least one species and GC or GT in all 4 species. There were 11 cases of conversion in 5' UTRs, of which 7 appear to be GT → GC transitions and one appears to be a GC → GT conversion by parsimony (i.e., one dinucleotide was common to 3 species; see example in fig. 2b). One of the conversions is from an ancestral GT to a GC in species JEC21. In the 3' UTRs, there was one case of GC → GT transition and one additional case of GT → GC, again in species JEC21.
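The parsimony rule used here (a dinucleotide shared by 3 of the 4 strains is taken as ancestral) can be sketched as follows. The function name `infer_donor_conversion` and the returned label strings are hypothetical, and the example dinucleotides are illustrative rather than taken from a specific gene.

```python
from collections import Counter
from typing import Dict


def infer_donor_conversion(donors: Dict[str, str]) -> str:
    """Classify a 4-strain donor site using the 3-versus-1 majority rule."""
    counts = Counter(donors.values())
    if len(counts) == 1:
        return "conserved"
    if len(counts) == 2 and sorted(counts.values()) == [1, 3]:
        # The dinucleotide seen in 3 strains is treated as ancestral.
        ancestral = counts.most_common(1)[0][0]
        derived_strain = next(s for s, d in donors.items() if d != ancestral)
        return f"{ancestral}->{donors[derived_strain]} in {derived_strain}"
    # 2-versus-2 splits or three different states cannot be polarized this way.
    return "ambiguous"


example = {"JEC21": "GC", "WM276": "GT", "H99": "GT", "R265": "GT"}
print(infer_donor_conversion(example))
```

Because no outgroup is used, this rule polarizes a change only when the split is 3 versus 1; other patterns are reported as ambiguous rather than forced into a direction.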
Intron Conservation Near Coding Sequence Boundaries
The boundaries of 5' UTR introns near the start codon (within 12 bp) were particularly highly conserved, with 95% of donor sites (98% excluding 2 GT/GC sites) and 93.7% of acceptor sites conserved across species (P = 0.045 compared with all 5' UTR donors by a 1-tailed Fisher Exact test for donor sites; P = 0.011 for acceptor sites; figs. 2b and 3a). One case of a mutated acceptor site near the ATG start site is shown in figure 3b (in gene CN05760). Interestingly, a G indel directly before the ATG start codon provides a possible nearby alternative acceptor site. By contrast, boundaries of 3' UTR introns within 12 bp of the stop codon were nonsignificantly less conserved than other 3' UTR introns, with 7/11 donor and 6/11 acceptor sites conserved.
Large UTR Deletions Involving Introns
We found no evidence for (near) exact insertion/loss of the entire intron sequence. However, we did observe large genomic deletions in which most or all of an intron was removed, presumably altering splicing patterns (fig. 4). In the 5' UTR of gene CNA07270, a large genomic deletion has deleted the entirety of the intronic sequence as well as flanking exonic sequence (fig. 4a). In the 5' UTR of CNA01200, a large genomic deletion spans most of an intron as well as adjacent exonic sequence (fig. 4b). In the 5' UTR of CNA01680, a 19 bp deletion is associated with the creation of an intron acceptor site in strain JEC21 (the A...G spanning the deletion; fig. 4c).
Alternative Promoters and 5' UTR Splicing
Alternative splicing of 5' UTRs has been shown to be associated with alternative promoters in some cases. To determine whether transcription start sites were more variable in intron-containing 5' UTRs, for each gene, we calculated the variance of transcription start site (among available full-length cDNAs), normalized (i.e., divided) by total UTR length for each gene (Supplementary Materials online). Intron-containing UTRs had significantly higher normalized variances than UTRs without introns (P < 1 × 10⁻⁷, Mann-Whitney U test), consistent with 5' UTR splicing being associated with use of alternative promoters.
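The per-gene statistic described above can be sketched as below. Several details are assumptions because the text does not specify them: the population variance is used (rather than the sample variance), start sites are given as genomic coordinates of mapped cDNA 5' ends, and genes with fewer than two cDNAs are rejected. The function name `normalized_tss_variance` is hypothetical.

```python
from statistics import pvariance
from typing import Sequence


def normalized_tss_variance(tss_positions: Sequence[int], utr_length: int) -> float:
    """Variance of transcription start positions divided by total UTR length."""
    if utr_length <= 0:
        raise ValueError("utr_length must be positive")
    if len(tss_positions) < 2:
        raise ValueError("at least two cDNA start sites are needed")
    # Assumption: population variance; the original analysis may differ.
    return pvariance(tss_positions) / utr_length


# Toy coordinates, not real cDNA mappings.
print(normalized_tss_variance([1000, 1000, 1004, 1010], utr_length=250))
print(normalized_tss_variance([500, 500, 500], utr_length=120))
```

The two groups of per-gene values (intron-containing versus intronless 5' UTRs) would then be compared with a rank-based test such as Mann-Whitney U, which is not implemented in this sketch.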
| Discussion |
|---|
We report the first genome-wide study of the evolutionary conservation of intron splicing in UTRs. We find that boundary sequences of UTR introns are preferentially conserved, suggesting conservation of intron splicing by selection. Intron conservation is particularly pronounced in 5' UTRs near the translation initiation site.
UTR Evolution
The evolutionary and functional significance of UTRs of eukaryotic transcripts remains a mystery (Churbanov et al. 2005; Lynch et al. 2005; Hong et al. 2006). Most prokaryotes and some eukaryotes have extremely short or nonexistent UTRs; thus, UTRs are not essential for transcription or translation (discussed in Lynch 2006). One function and/or liability of UTRs concerns the presence of upstream ATG codons in 5' UTRs (premature start codons [PSCs]). In general, 5' UTRs are depleted for PSCs, suggesting that PSCs are often disfavored (Hahn et al. 2003). However, those PSCs that are present appear to be preferentially conserved, often paired with a nearby inframe stop codon to yield a short upstream ORF (or uORF; Churbanov et al. 2005; Iacono et al. 2005). Such uORFs have been shown to affect expression levels, and their preferential evolutionary conservation suggests that they have been utilized as functional posttranscriptional regulators of gene expression (Churbanov et al. 2005; Neafsey DE, Galagan JE, unpublished results). On the other hand, the potential for random mutation to deleterious PSCs makes 5' UTRs a liability, one that could potentially be acted upon by selection (Lynch et al. 2005).
Patterns of UTR Splicing across Eukaryotic Lineages
The pattern of UTR splicing in Cryptococcus varies considerably from previously reported patterns in other species (Hong et al. 2006). Intron density in 5' UTRs (1.0 per kilobase) was only 3.5 times higher than in 3' UTRs, less than in plants (10 times) and much less than in animals (30–300 times). The median intron size in 5' UTRs was only 31.6% larger than in 3' UTRs and 35.0% larger than in coding sequences, less than for previously studied animals and plants (103–289% greater than 3' UTRs, 98–713% higher than coding sequences; Hong et al. 2006). The 3' UTR intron length distribution showed very little rightward skew (mean/median = 1.1), contrary to previously studied animals and plants (ratio from 2.8 to 9.2).
Other patterns of Cryptococcus UTR splicing mirrored results for other species (Hong et al. 2006). Intron density in 5' UTRs was 3.5 times lower than in coding regions, similar to the previous maximum known value of 2.9 in Arabidopsis thaliana. Intron length in 5' UTRs was significantly negatively correlated with distance from the ATG start codon. As with previous species, intron densities in 5' UTRs were higher than in 3' UTRs and lower than in coding sequences, and intron lengths in 3' UTRs were lower than in 5' UTRs and similar to lengths in coding sequences. As in plants and animals, most UTRs have zero or one intron, and very few have more than two.
Evolutionary Conservation of UTR Splicing
We show here that boundaries of intron sequences are evolutionarily conserved in Cryptococcus. We estimate that 50–90% of boundaries are conserved. This strongly suggests not only that selection is retaining general UTR splicing but also that the exact boundaries of the spliced intronic sequence are retained (i.e., that intron boundaries are not rapidly shifting through evolution).
We find that 5' UTR intron boundaries are particularly highly conserved, with an estimated 82.7% and 62.7% of donor and acceptor boundaries having been maintained by selection. Evolutionary conservation is even more pronounced near the ATG translation initiation start site, where an estimated 96.6% of donor boundaries and an estimated 86.1% of acceptor boundaries have been preferentially conserved by selection. These very high levels of splice site preservation indicate an important role for 5' UTR splicing. Recently, a role in posttranscriptional regulation was shown for a 5' UTR intron in Arabidopsis (Chung et al. 2006), which provides one possible general role for 5' UTR splicing. The particularly high levels of 5' conservation near the translation start site could indicate an even greater influence of sequences and splice boundaries at the end of the UTR on posttranscriptional regulation.
The presence of 3' UTR introns is surprising, as such introns are expected to subject the transcript to degradation by the NMD pathway. However, we find that 3' UTR splice site boundaries are preferentially retained through selection, indicating a functional role for 3' UTR splicing. One possibility is that alternative splicing of 3' UTR introns could utilize NMD as a mechanism for posttranscriptional expression regulation. An even more interesting possibility is that sequences of 3' UTR introns could contain targets for miRNAs, in which case alternative splicing of these introns could utilize RNAi as a mechanism for posttranscriptional regulation (Stark et al. 2003).
Lack of Intron Loss and Gain in UTRs
We found no evidence for loss or gain of exact or nearly exact intron sequences in the studied species. For Cryptococcus, this result is not surprising as a previous study of intron loss/gain in Cryptococcus coding regions found almost no intron loss/gain (Stajich and Dietrich 2006). We have previously suggested that in such cases of stasis, a lack of spontaneous mutations leading to intron loss/gain alleles is likely to explain the dearth of change (Roy and Gilbert 2006; Roy and Hartl 2006; Roy and Penny 2006). In this case, the lack of intron loss/gain in UTRs is not surprising. It would be interesting to compare rates of intron loss/gain in UTRs and coding sequences in species with a much larger amount of intron loss/gain.
Longer Introns in 5' UTRs
As found previously for plants and animals (Hong et al. 2006), introns in C. neoformans 5' UTRs tend to be longer than in coding regions or 3' UTRs. Hong et al. (2006) previously suggested that this might reflect selection against exonic ATGs in 5' UTRs. According to their model, expansion of introns (i.e., movement of the splicing boundary into adjacent sequence) could be positively selected based on the incorporation of such ATGs into the intronic sequence.
The current results are not very supportive of this model. First, if selection for intron boundary movement were an important factor in the evolution of 5' UTR splicing, we might expect to see frequent movement of intron boundaries. Instead, we show here that the boundaries of UTR introns, and particularly of 5' UTR introns, tend to be conserved through evolution. Second, if in fact 5' introns were longer due to incorporation of previously exonic ATGs, ATGs should be more common near intron boundaries than in the middle of introns. However, no such trend is seen; in fact, there is a slight trend toward ATGs being less common at the boundaries of 5' UTR introns (fig. 5).
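One way to look for the predicted excess of ATGs near intron boundaries is to tabulate ATG frequency by relative position within introns. The sketch below is a hypothetical illustration and not a description of how figure 5 was produced: the function name `atg_density_by_bin`, the equal-width relative bins, and the default of 5 bins are all assumptions.

```python
from typing import List, Sequence


def atg_density_by_bin(introns: Sequence[str], n_bins: int = 5) -> List[float]:
    """ATG frequency per trinucleotide start position, by relative intron position."""
    atg_counts = [0] * n_bins
    positions = [0] * n_bins
    for intron in introns:
        seq = intron.upper()
        n_starts = len(seq) - 2  # number of trinucleotide start positions
        for i in range(n_starts):
            # Map start position i onto one of n_bins equal-width relative bins.
            bin_index = i * n_bins // n_starts
            positions[bin_index] += 1
            if seq[i:i + 3] == "ATG":
                atg_counts[bin_index] += 1
    return [c / p if p else 0.0 for c, p in zip(atg_counts, positions)]


# Toy introns, not real data: first and last bins correspond to the boundaries.
toy_introns = ["GTATGCCCCCCCCCCCCCAG", "GTCCCCCCCATGCCCCCCAG"]
print(atg_density_by_bin(toy_introns, n_bins=5))
```

Under the intron-expansion model, the first and last bins would be expected to show higher ATG frequencies than the central bins; the text reports the opposite slight trend.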
Moreover, although the overall deficit of ATGs in exonic regions of 5' UTRs presumably reflects selection against many or most new PSCs, previous results (Churbanov et al. 2005
An Alternative Explanation for Longer 5' UTR Introns
We propose instead that the greater intron length in 5' UTRs reflects the fundamentally different function of 5' UTR introns. The 5' UTRs are often alternatively spliced, with alternative forms sometimes being associated with alternative promoters or translation initiation sites (Garvin et al. 1988; Mironov et al. 1999; Loftus et al. 2005; Kimura et al. 2006). In such cases, we might expect some interference between the 2 alternative elements. In the case of alternative promoters, transcription enhancers/repressors acting at one promoter might interfere with transcription regulation at the alternative promoters. In the simplest case of alternative translation initiation sites, an upstream site would be utilized in spliced transcripts but a downstream site would be utilized in unspliced transcripts. In this case, in unspliced transcripts, the alternative (upstream) ATG would essentially be a PSC. In either case, the interference between sites might be decreased by increasing the length of genomic sequence between sites, in which case there would be direct selection for increased intron length. These forces might be stronger closer to the beginning of the coding sequence, explaining the inverse relationship between distance from the coding sequence and intron length (Hong et al. 2006).
Splicing Conservation in UTRs and Coding Regions
These results follow a wealth of recent results demonstrating conservation of introns in coding sequences for a variety of lineages (e.g., Roy and Hartl 2006; Roy and Penny 2006; Stajich and Dietrich 2006). However, "conservation" in the 2 cases denotes different things. The previous studies assessed intron loss/gain, that is, loss of the genomic sequence corresponding to the intron, processes that usually do not alter the sequence of the eventual transcript. Here, we show intron boundary conservation, indicating conservation of splicing of the sequence; in this case, lack of conservation would indicate lack of splicing or a difference in splice boundaries, leading to an alteration in the eventual transcript sequence. In particular, whereas the lack of intron loss/gain change in coding sequences could simply reflect a dearth of the necessary mutations (e.g., Roy and Hartl 2006), in this case, the conservation of intron splicing boundaries indicates purifying selection maintaining intron splicing. These differences notwithstanding, both sets of results attest to the generally slow rate of change of splicing patterns in eukaryotic transcripts.
| Concluding Remarks |
|---|
UTRs of transcripts have sometimes been treated as largely neutral features. However, the present results indicate general importance of UTR splicing, adding to the growing appreciation of the functional importance of UTRs. It seems most likely that UTR splicing is important for proper gene expression; however, the precise role of UTR splicing is not clear. Experimental studies should probe the importance of UTR splicing for regulation of transcription, nuclear export, and translation.
| Supplementary Material |
|---|
Supplementary materials are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
| Acknowledgements |
|---|
We thank Manuel Irimia for helpful comments on the manuscript. S.W.R. thanks Wen Wang and his laboratory for stimulating conversations and hospitality and for keeping him out from underneath the tires of Chinese buses. This work was supported in part by funds from the National Science Foundation and the National Institute of Allergy and Infectious Diseases (D.N.).
| Footnotes |
|---|
Aoife McLysaght, Associate Editor
| References |
|---|
Bartel DP. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell (2004) 116:281–297.
Brudno M, Do CB, Cooper GM, Kim MF, Davydov E, Green ED, Sidow A, Batzoglou S. LAGAN and Multi-LAGAN: efficient tools for large-scale multiple alignment of genomic DNA. Genome Res (2003) 13:721–731.
Chung BYW, Simons C, Firth AE, Brown CM, Hellens RP. Effects of 5' UTR introns on gene expression in Arabidopsis thaliana. BMC Genomics (2006) 7:120.
Churbanov A, Rogozin IB, Babenko VN, Ali H, Koonin EV. Evolutionary conservation suggests a regulatory function of AUG triplets in 5'-UTRs of eukaryotic genes. Nucleic Acids Res (2005) 33:5512–5520.
Collins L, Penny D. Complex spliceosomal organization ancestral to extant eukaryotes. Mol Biol Evol (2005) 22:1053–1066.
Crowe ML, Wang XQ, Rothnagel JA. Evidence for conservation and selection of upstream open reading frames suggests probable encoding of bioactive peptides. BMC Genomics (2006) 7:16.
Fedorov A, Merican AF, Gilbert W. Large-scale comparison of intron positions among animal, plant, and fungal genomes. Proc Natl Acad Sci USA (2002) 99:16128–16133.
Garvin AM, Pawar S, Marth JD, Perlmutter RM. Structure of the murine lck gene and its rearrangement in a murine lymphoma cell line. Mol Cell Biol (1988) 8:3058–3064.
Hahn MW, Stajich JE, Wray GA. The effects of selection against spurious transcription factor binding sites. Mol Biol Evol (2003) 20:901–906.
Hentze MW, Kulozik AE. A perfect message: RNA surveillance and nonsense-mediated decay. Cell (1999) 96:307–310.
Hong X, Scofield DG, Lynch M. Intron size, abundance, and distribution within untranslated regions of genes. Mol Biol Evol (2006) 23:2392–2404.
Iacono M, Mignone F, Pesole G. uAUG and uORFs in human and rodent 5' untranslated mRNAs. Gene (2005) 349:97–105.
Jeffares DC, Mourier T, Penny D. The biology of intron gain and loss. Trends Genet (2006) 22:16–22.
Kent WJ. BLAT: the BLAST-like alignment tool. Genome Res (2002) 12:656–664.
Kimura K, Wakamatsu A, Suzuki Y, Ota T, Nishikawa T, Yamashita R, Yamamoto J, Sekine M, Tsuritani K, Wakaguri H, et al. Diversification of transcriptional modulation: large-scale identification and characterization of putative alternative promoters of human genes. Genome Res (2006) 16:55–65. (32 co-authors).
Lai E. Micro RNAs are complementary to 3' UTR sequence motifs that mediate negative post-transcriptional regulation. Nat Genet (2002) 30:363–364.
Lall S, Grun D, Krek A, Chen K, Wang YL, Dewey CN, Sood P, Colombo T, Bray N, Macmenamin P, et al. A genome-wide map of conserved microRNA targets in C. elegans. Curr Biol (2006) 16:460–471. (15 co-authors).
Larizza A, Makalowski W, Pesole G, Saccone C. Evolutionary dynamics of mammalian mRNA untranslated regions by comparative analysis of orthologous human, artiodactyl and rodent gene pairs. Comput Chem (2002) 26:479–490.
Loftus BJ, Fung E, Roncaglia P, Rowley D, Amedeo P, Bruno D, Vamathevan J, Miranda M, Andreson IJ, Fraser JA, et al. The genome of the Basidiomycetous yeast and human pathogen Cryptococcus neoformans. Science (2005) 307:1321–1324. (54 co-authors).
Lynch M. The origins of eukaryotic genome structure. Mol Biol Evol (2006) 23:450–468.
Lynch M, Scofield DG, Hong X. The evolution of transcription-initiation sites. Mol Biol Evol (2005) 22:1137–1146.
Ma B, Tromp J, Li M. PatternHunter: faster and more sensitive homology search. Bioinformatics (2002) 18:440–445.
Meijer HA, Thomas AA. Control of eukaryotic protein synthesis by upstream open reading frames in the 5'-untranslated region of an mRNA. Biochem J (2002) 367:1–11.
Mironov AA, Fickett JW, Gelfand MS. Frequent alternative splicing of human genes. Genome Res (1999) 12:1288–1293.
Morris DR, Geballe AP. Upstream open reading frames as regulators of mRNA translation. Mol Cell Biol (2000) 20:8635–8642.
Pesole G, Mignone F, Gissi C, Grillo G, Licciulli F, Liuni S. Structural and functional features of eukaryotic mRNA untranslated regions. Gene (2001) 276:73–81.
Rodríguez-Trelles F, Tarrío R, Ayala FJ. The origins and evolution of spliceosomal introns. Annu Rev Genet (2006) 40:47–76.
Rogozin IB, Sverdlov AV, Babenko VN, Koonin EV. Analysis of evolution of exon-intron structure of eukaryotic genes. Brief Bioinform (2005) 6:118–134.
Rogozin IB, Wolf YI, Sorokin AV, Mirkin BG, Koonin EV. Remarkable interkingdom conservation of intron positions and massive, lineage-specific intron loss and gain in eukaryotic evolution. Curr Biol (2003) 13:1512–1517.
Roy SW, Gilbert W. Complex early genes. Proc Natl Acad Sci USA (2005) 102:1986–1991.
Roy SW, Gilbert W. The evolution of spliceosomal introns: patterns, puzzles, and progress. Nat Rev Genet (2006) 7:211–221.
Roy SW, Hartl DL. Very little intron loss/gain in Plasmodium: intron loss/gain mutation rates and intron number. Genome Res (2006) 16:750–756.
Roy SW, Penny D. Large-scale intron conservation and order-of-magnitude variation in intron loss/gain rates in apicomplexan evolution. Genome Res (2006) 16:1270–1275.
Shabalina SA, Ogurtsov AY, Rogozin IB, Koonin EV, Lipman DJ. Comparative analysis of orthologous eukaryotic mRNAs: potential hidden functional signals. Nucleic Acids Res (2004) 32:17821782.
Slamovits CH, Keeling PJ. A high density of ancient spliceosomal introns in oxymonad excavates. BMC Evol Biol (2006) 6:34.
Stajich JE, Dietrich FS. Evidence of mRNA-mediated intron loss in the human-pathogenic fungus Cryptococcus neoformans. Eukaryotic Cell (2006) 5:789–793.
Stark A, Brennecke J, Russell RB, Cohen SM. Identification of Drosophila microRNA targets. PLoS Biol (2003) 1:E60.
Vilela C, McCarthy JE. Regulation of fungal gene expression via short open reading frames in the mRNA 5' untranslated region. Mol Microbiol (2003) 49:859–867.
Xu J, Vilgalys R, Mitchell TG. Multiple gene genealogies reveal recent dispersion and hybridization in the human pathogenic fungus Cryptococcus neoformans. Mol Ecol (2000) 38:1214–1220.
Zhang Z, Dietrich FS. Identification and characterization of upstream open reading frames (uORF) in the 5' untranslated regions (UTR) of genes in Saccharomyces cerevisiae. Curr Genet (2005) 48:77–87.