MBE Advance Access originally published online on January 22, 2007
Molecular Biology and Evolution 2007 24(4):896-899; doi:10.1093/molbev/msm010

© The Author 2007. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org

Letters

Position of the Final Intron in Full-Length Transcripts: Determined by NMD?

Douglas G. Scofield, Xin Hong and Michael Lynch

Department of Biology, Indiana University

E-mail: dgscofie{at}indiana.edu.


    Abstract
 
Nonsense-mediated decay (NMD) pathways for detection and degradation of transcripts containing premature termination (stop) codons (PTCs) are ubiquitous among the eukaryotes. NMD uses the presence of a second signal downstream of a termination codon to distinguish a PTC from a true stop codon. In mammals and perhaps other eukaryotes, the second signal is a protein complex closely associated with exon–exon junctions formed after removal of spliceosomal introns. A valid transcript in such species must have its 3'-most intron positioned so as not to serve as a second signal relative to the true stop. This requirement has been termed the "55-bp rule", in reference to the position within the 3' untranslated region (3' UTR) of valid transcripts downstream of which introns should not be found. However, as more information has become available, it is apparent that the 55-bp rule still holds in species with NMD pathways that are not intron dependent. To clarify the applicability of the 55-bp rule, we constructed a large database of 3'-most intron positions within full-length transcripts from 4 eukaryotes, 2 of which (human and mouse) use intron positions for NMD, 1 of which (Drosophila melanogaster) does not, and 1 of which (Arabidopsis thaliana) may not use intron positions. Surprisingly, we found intron numbers to be sharply reduced within 3' UTRs in comparison to coding sequences, starting immediately downstream of the true stop rather than at 55 bp; this strong threshold existed for all 4 species. We suggest that a more general mechanism, higher rates of intron inclusion within 3' UTRs, is better able to explain this threshold. We propose that 3' UTRs are better able to tolerate loss of intron integrity than other gene regions, due to the generally greater length of conserved sequences important within 3' UTR exons. This mechanism may also help to explain the roughly 3 times greater length of 3' UTRs in comparison to 5' UTRs.

Key Words: intron • nonsense-mediated decay • 3' UTR • 55-bp rule • premature termination codon • intron inclusion

In mammals and perhaps many other eukaryotes, the nonsense-mediated decay (NMD) pathway uses introns, specifically exon junction protein complexes (EJCs) associated with intron positions, to assist in the recognition and selective degradation of invalid transcripts containing premature termination (stop) codons (PTCs) (Cheng et al. 1994; Kim et al. 2001). NMD pathways are ubiquitous within eukaryotes (Maquat 2004; Conti and Izaurralde 2005) and allow their hosts to avoid the deleterious effects of truncated proteins resulting from translation of these "nonsense" transcripts (Lewis et al. 2003).

One critical requirement for the operation of an NMD pathway is that valid transcripts with correct termination codons must not be degraded. This is accomplished through the use of a downstream second signal, the presence of which indicates that a termination codon is premature. In mammals, this signal is an EJC, which is directly attached to the transcript ~20 to 24 bp upstream of the exon–exon junction formed by the removal of a spliceosomal intron (for reviews see Maquat 2005). There is a functional gap of ~50 to 55 bp downstream of a stop codon; if an intron is located within this gap, its associated EJC cannot elicit NMD (Nagy and Maquat 1998). Therefore, the 3'-most intron within a valid transcript must be no more than 55 bp downstream of the true stop to avoid serving as a second signal. When this 55-bp rule was first proposed and tested, just 2 genes within a heterogeneous collection of 1,500 genes from many species were found to violate the rule (Nagy and Maquat 1998).
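The 55-bp rule can be written as a simple positional check. The following is a minimal sketch only: the function name and the signed-offset convention (negative upstream of stop, positive downstream) are assumptions made for illustration, and the cutoff uses the upper bound of the ~50 to 55 bp gap.

```python
# Illustrative sketch of the 55-bp rule; names and conventions are hypothetical.
GAP_BP = 55  # upper bound of the ~50 to 55 bp functional gap downstream of stop


def intron_can_serve_as_second_signal(offset_from_stop_bp: int,
                                      gap_bp: int = GAP_BP) -> bool:
    """Return True if an intron at this offset lies beyond the functional gap.

    offset_from_stop_bp is the intron position relative to the stop codon:
    negative when upstream, positive when downstream (assumed convention).
    """
    return offset_from_stop_bp > gap_bp


def obeys_55bp_rule(last_intron_offset_bp: int) -> bool:
    """A valid transcript's 3'-most intron must not lie beyond the gap."""
    return not intron_can_serve_as_second_signal(last_intron_offset_bp)
```

Under this sketch, a 3'-most intron at +80 bp would violate the rule, whereas one at +25 bp or anywhere upstream of stop would not.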

However, as evidence has accumulated concerning the form of the second signal in nonmammalian eukaryotes, this result has appeared increasingly anomalous given that many of the conforming genes belong to nonmammalian species that are now believed to have second signals that are not intron dependent (Maquat 2004; Conti and Izaurralde 2005). Here, we examine the ubiquity of this constraint among full-length transcripts of 4 eukaryotes. We asked whether there is a difference between the observed position of the 3'-most intron of a gene and its expected position based on a random distribution. Although human and mouse have intron-dependent NMD, Drosophila melanogaster likely does not and Arabidopsis thaliana may not. In D. melanogaster it has been definitively demonstrated via RNAi that sequence homologs of mammalian EJC proteins have no role in NMD (Gatfield et al. 2003), and it is likely that an abnormally long 3' UTR is the second signal in this species (Behm-Ansmant and Izaurralde 2005). In A. thaliana, both intron-containing and intron-free genes are subject to NMD (Arciga-Reyes et al. 2006), and insertion of 2 different introns at each of +25 bp and +80 bp downstream of stop within the 3' UTR of a reporter gene did not repress expression to any greater degree than a number of other coding sequence (CDS) insertion points (Rose 2004). In the plant genus Nicotiana, insertion of a different intron within the 3' UTR of a different reporter gene repressed expression at +99 bp with no effect at +28 bp, and it was further demonstrated that abnormally long 3' UTRs also elicited NMD (Kertesz et al. 2006), suggesting that intron-associated NMD in plants is intron-, gene-, and/or species-specific.

For all 4 species, we found a strikingly strong threshold at the termination codon downstream of which very few introns were found, in strong contrast to the random expectation (fig. 1). This threshold is very clearly at the true stop codon; just ~0.6% of transcripts in human, mouse, and D. melanogaster and 1.7% of transcripts in A. thaliana contained an intron downstream of this position.


FIG. 1.— Observed relative position of the 3'-most intron (Pobs) within full-length transcripts versus its expected position with random intron distribution (Pexp). Solid and dashed lines mark the position of the stop codon and 55 bp downstream, respectively. The diagonal marks equivalent Pobs and Pexp. Included are transcripts having Pobs within 1 kbp upstream and any distance downstream of stop. Values of Pobs or Pexp that exceeded the range of the graph were plotted on the edge.

 
Considering the apparent lack of intron-dependent NMD in A. thaliana and D. melanogaster, we propose that the lack of introns within 3' UTRs is due to higher rates of intron loss via intron inclusion within the 3' UTR than in other regions of a gene, with NMD-related selection having only a much reduced secondary role. Mutational events that disrupt conserved sites necessary for intron removal will affect introns throughout genes, but intron inclusion into 3' UTR exons appears less likely to result in nonfunctional alleles than inclusion into the 5' UTR or CDS. The 3' UTR hosts conserved features that are necessary for transcription termination as well as posttranscriptional stabilization and intracellular localization (Mignone et al. 2002). These and other features in 3' UTRs are typically on the order of 20+ bp in length and often develop functionally important secondary structures (Chartrand et al. 2001; Mignone et al. 2002). Their relatively high sequence length and complexity make it unlikely that they would be found within neutrally mutating introns, in contrast to, for example, 3-bp start codons, which are disruptive within the 5' UTR (Lynch et al. 2005). A 3-bp start codon has a 1/64 chance of appearing at a position within a neutrally evolving sequence, whereas this same chance for a specific 6-bp sequence such as a 3' UTR-specific polyadenylation signal is 1/4096. For a 20-bp sequence this chance is approximately 10^-12, and relaxing the sequence requirements by allowing all positions to be 1 of 2 bases (e.g., purine or pyrimidine) only increases this chance to approximately 10^-6. The tolerance of 3' UTRs for novel sequence is also supported by the relative lability of stop codon positions in Saccharomyces (Giacomelli et al. 2007).
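The per-position chances quoted above follow from assuming that the 4 bases are equally frequent and independent at each site. The sketch below reproduces that arithmetic; the function name and the `bases_allowed_per_site` parameter are illustrative assumptions, not part of the original analysis.

```python
# Illustrative arithmetic only: assumes equal, independent base frequencies.
def chance_of_motif(length_bp: int, bases_allowed_per_site: int = 1) -> float:
    """Chance that a motif of the given length appears at one fixed position.

    bases_allowed_per_site = 1 models a fully specified sequence;
    bases_allowed_per_site = 2 models a degenerate site (e.g., any purine).
    """
    return (bases_allowed_per_site / 4) ** length_bp


start_codon = chance_of_motif(3)        # 1/64
polya_signal = chance_of_motif(6)       # 1/4096
element_20bp = chance_of_motif(20)      # on the order of 10^-12
relaxed_20bp = chance_of_motif(20, 2)   # on the order of 10^-6

print(start_codon, polya_signal, element_20bp, relaxed_20bp)
```

Even with every site relaxed to 1 of 2 bases, a 20-bp element remains several orders of magnitude less likely to arise at a given position than a 3-bp start codon.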

As a specific test of this idea, we scanned introns from these same 4 species for the presence of sequences potentially disruptive within 5' UTR and 3' UTR exons and found much reduced proportions of 3' UTR sequences, consistent with their greater length (table 1). Additional sequences from 3' UTRs for which our scans found no matching intronic sequence in any of the 4 species include: FVLE1, a 25-bp localization sequence in Xenopus (Chan et al. 1999); the 26-bp histone mRNA 3' stem-loop consensus sequence (Williams and Marzluff 1995); the 15-bp 15-LOX-DICE element (Ostareck-Lederer et al. 1994); and the 24-bp Bruno response element (Colegrove-Otero et al. 2005).


Table 1 Presence of conserved sequences important to UTRs within introns (30 bp–100 kbp in length)

 
Intron inclusion may also help to explain the roughly 3 times greater length of 3' UTRs in comparison to 5' UTRs (Pesole et al. 2001). Species like A. thaliana and D. melanogaster that use abnormally long 3' UTRs to trigger NMD may have an intron length–sensitive constraint on this mechanism, though the ubiquity of our threshold indicates that this constraint cannot be absolute. It thus seems likely that gene structural evolution interacts with 3' UTR-dependent NMD as it does with intron-dependent NMD (Lynch and Kewalramani 2003).

Notably, beyond 55 bp, human and mouse had roughly one-third the relative proportion of introns as did the other 2 species (0.2% vs. ~0.55%, χ² = 18.7, degrees of freedom = 1, P = 1 × 10^-5). This may indicate the secondary action of NMD-related selection, and further suggests that intron-related NMD in A. thaliana is not ubiquitous and/or may differ mechanistically from that in mammals.
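As a quick consistency check on the reported statistic, the upper-tail probability of a chi-square value with 1 degree of freedom can be computed from the complementary error function. This sketch uses only the reported test statistic; the underlying counts are not given here and are not reconstructed.

```python
import math


def chi2_upper_tail_df1(x: float) -> float:
    """Upper-tail P value for a chi-square statistic with 1 degree of freedom.

    With 1 degree of freedom, X = Z^2 for a standard normal Z, so
    P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2))


p_value = chi2_upper_tail_df1(18.7)
print(f"P = {p_value:.2e}")
```

The result is of the same order of magnitude as the P value quoted above.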

Alternative explanations for low intron numbers in the 3' UTR do not explain the threshold we observed. The first alternative is that intron loss within the 3' UTR is indeed higher, but occurs via homologous recombination with reverse-transcribed cDNAs; however, this predicts a gradual decline in intron numbers toward the 3' ends of genes (Mourier and Jeffares 2003). A second alternative is that multiple interactions at transcription termination (Maniatis and Reed 2002; Proudfoot 2003) create general spatiotemporal constraints that limit the accurate removal of introns at 3' ends of transcripts. In artificial constructs of 3' UTRs within A. thaliana, splicing fidelity was lowest at the 3'-most position of intron insertion (Rose 2002, 2004), but this is not always the case in plants (Kertesz et al. 2006). The strength of these constraints is likely to be correlated with the distance from the intron position to the 3' end of the UTR and thus to vary by gene and, more broadly, by species depending upon 3' UTR length. A related alternative is suggested by special splicing requirements known for the final exon of transcripts (Cooke et al. 1999), but it is unclear how this could modify the selective and/or adaptive environment to produce a threshold.

Together with our previous study that described a mechanism for intron size evolution within the 5' UTR (Hong et al. 2006), we have now proposed 2 models within which intron-related selection differs dramatically between coding and noncoding regions of the same gene. Any further understanding of intron evolution must incorporate the gene context within which introns occur and the mutational and population genetic processes that support gene and genome evolution (Lynch 2002; Lynch and Conery 2003; Hong et al. 2006).


    Methods
 
To establish locations of 3'-most introns, we aligned publicly available full-length cDNA sequences with genome sequences, using annotations to determine the position of the true stop codon (see Hong et al. 2006).

For each full-length transcript, we noted its total length L, the number of introns within the entire transcript Ni, and the bp position of the 3'-most intron relative to stop Pobs. Pobs may be negative, when upstream of stop, or positive, when downstream. We then calculated Pexp, the expected position of this intron relative to stop, by assuming that expected positional means of randomly distributed introns follow a uniform distribution with equally sized exons of length L/(Ni + 1). From the 5' end of the transcript, the expected position of the 3'-most intron within the full-length transcript is L – [L/(Ni + 1)]; Pexp was calculated from this value.
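The calculation can be sketched as follows. The expression L – [L/(Ni + 1)] is taken from the text; converting it to a stop-relative value by subtracting the stop codon position (measured from the 5' end of the transcript) is an assumption made here for illustration, as are the function and parameter names.

```python
def expected_last_intron_from_5prime(total_length: float, n_introns: int) -> float:
    """Expected position of the 3'-most intron from the 5' end: L - L/(Ni + 1)."""
    if n_introns < 1:
        raise ValueError("transcript must contain at least one intron")
    return total_length - total_length / (n_introns + 1)


def p_exp(total_length: float, n_introns: int, stop_position: float) -> float:
    """Expected position of the 3'-most intron relative to stop.

    stop_position is the stop codon position from the 5' end of the
    transcript (assumed coordinate system). Negative results are upstream
    of stop and positive results are downstream, matching Pobs.
    """
    return expected_last_intron_from_5prime(total_length, n_introns) - stop_position


# Hypothetical transcript: 2,000 bp long, 4 introns, stop codon at position 1,500.
print(p_exp(2000, 4, 1500))  # 100.0, i.e., 100 bp downstream of stop
```

With a single intron in the same hypothetical transcript, the expected position falls at the transcript midpoint, 500 bp upstream of stop.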


    Acknowledgements
 
This work was supported by National Science Foundation grants DBI-0434671 to D.G.S. and MCB-0342431 to M.L.


    Footnotes
 
Kenneth Wolfe, Associate Editor


    References
 

    Arciga-Reyes L, Wootton L, Kieffer M, Davies B. (2006) UPF1 is required for nonsense-mediated mRNA decay (NMD) and RNAi in Arabidopsis. Plant J 47:480–489.

    Behm-Ansmant I and Izaurralde E. (2005) NMD in Drosophila: a snapshot into the evolution of a conserved mRNA surveillance pathway. In Maquat LE (Ed.). Nonsense-mediated mRNA decay (Landes Bioscience, Austin (Texas)).

    Chan AP, Kloc M, Etkin LD. (1999) fatvg encodes a new localized RNA that uses a 25-nucleotide element (FVLE1) to localize to the vegetal cortex of Xenopus oocytes. Development 126:4943–4953.

    Chartrand P, Singer RH, Long RM. (2001) RNP localization and transport in yeast. Annu Rev Cell Dev Biol 17:297–310.

    Cheng J, Belgrader P, Zhou XB, Maquat LE. (1994) Introns are cis effectors of the nonsense-codon-mediated reduction in nuclear mRNA abundance. Mol Cell Biol 14:6317–6325.

    Colegrove-Otero LJ, Minshall N, Standart N. (2005) RNA-binding proteins in early development. Crit Rev Biochem Mol Biol 40:21–73.

    Conti E and Izaurralde E. (2005) Nonsense-mediated mRNA decay: molecular insights and mechanistic variations across species. Curr Opin Cell Biol 17:316–325.

    Cooke C, Hans H, Alwine JC. (1999) Utilization of splicing elements and polyadenylation signal elements in the coupling of polyadenylation and last-intron removal. Mol Cell Biol 19:4971–4979.

    Gatfield D, Unterholzner L, Ciccarelli FD, Bork P, Izaurralde E. (2003) Nonsense-mediated mRNA decay in Drosophila: at the intersection of the yeast and mammalian pathways. EMBO J 22:3960–3970.

    Giacomelli MG, Hancock AS, Masel J. (2007) The conversion of 3' UTRs into coding regions. Mol Biol Evol 24:457–464.

    Hong X, Scofield DG, Lynch M. (2006) Intron size, abundance, and distribution within untranslated regions of genes. Mol Biol Evol 23:2392–2404.

    Kertesz S, Kerenyi Z, Merai Z, Bartos I, Palfy T, Barta E, Silhavy D. (2006) Both introns and long 3'-UTRs operate as cis-acting elements to trigger nonsense-mediated decay in plants. Nucleic Acids Res 34:6147–6157.

    Kim VN, Kataoka N, Dreyfuss G. (2001) Role of the nonsense-mediated decay factor hUpf3 in the splicing-dependent exon-exon junction complex. Science 293:1832–1836.

    Lewis BP, Green RE, Brenner SE. (2003) Evidence for the widespread coupling of alternative splicing and nonsense-mediated mRNA decay in humans. Proc Natl Acad Sci USA 100:189–192.

    Lynch M. (2002) Intron evolution as a population-genetic process. Proc Natl Acad Sci USA 99:6118–6123.

    Lynch M and Conery JS. (2003) The origins of genome complexity. Science 302:1401–1404.

    Lynch M and Kewalramani A. (2003) Messenger RNA surveillance and the evolutionary proliferation of introns. Mol Biol Evol 20:563–571.

    Lynch M, Scofield DG, Hong X. (2005) The evolution of transcription-initiation sites. Mol Biol Evol 22:1137–1146.

    Maniatis T and Reed R. (2002) An extensive network of coupling among gene expression machines. Nature 416:499–506.

    Maquat LE. (2004) Nonsense-mediated mRNA decay: a comparative analysis of different species. Curr Genomics 5:175–190.

    Maquat LE. (2005) Nonsense-mediated mRNA decay (Landes Bioscience, Austin (Texas)).

    Mignone F, Gissi C, Liuni S, Pesole G. (2002) Untranslated regions of mRNAs. Genome Biol 3:reviews0004.0001–reviews0004.0010.

    Mourier T and Jeffares DC. (2003) Eukaryotic intron loss. Science 300:1393.

    Nagy E and Maquat LE. (1998) A rule for termination-codon position within intron-containing genes: when nonsense affects RNA abundance. Trends Biochem Sci 23:198–199.

    Ostareck-Lederer A, Ostareck DH, Standart N, Thiele BJ. (1994) Translation of 15-lipoxygenase mRNA is inhibited by a protein that binds to a repeated sequence in the 3' untranslated region. EMBO J 13:1476–1481.

    Pesole G, Mignone F, Gissi C, Grillo G, Licciulli F, Liuni S. (2001) Structural and functional features of eukaryotic mRNA untranslated regions. Gene 276:73–81.

    Proudfoot NJ. (2003) Dawdling polymerases allow introns time to splice. Nat Struct Biol 10:876–878.

    Rose AB. (2002) Requirements for intron-mediated enhancement of gene expression in Arabidopsis. RNA 8:1444–1453.

    Rose AB. (2004) The effect of intron location on intron-mediated enhancement of gene expression in Arabidopsis. Plant J 40:744–751.

    Williams AS and Marzluff WF. (1995) The sequence of the stem and flanking sequences at the 3' end of histone mRNA are critical determinants for the binding of the stem-loop binding protein. Nucleic Acids Res 23:654–662.

Accepted for publication January 18, 2007.

