

MBE Advance Access originally published online on July 17, 2006
Molecular Biology and Evolution 2006 23(10):1824-1827; doi:10.1093/molbev/msl061

© The Author 2006. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org

Letter

Very Little Intron Gain in Entamoeba histolytica Genes Laterally Transferred from Prokaryotes

Scott William Roy, Manuel Irimia and David Penny

Allan Wilson Centre for Molecular Ecology and Evolution, Massey University, Palmerston North, New Zealand

E-mail: scottwroy{at}gmail.com.


    Abstract
 
The evolution of spliceosomal introns remains intensely debated. We studied 96 Entamoeba histolytica genes previously identified as having been laterally transferred from prokaryotes, which were presumably intronless at the time of transfer. Ninety out of the 96 are also present in the reptile parasite Entamoeba invadens, indicating lateral transfer before the species' divergence ~50 MYA. We find only 2 introns, both shared with E. invadens. Thus, no intron gains have occurred in ~50 Myr, implying a very low rate of intron gain of less than one gain per gene per ~4.5 billion years. Nine other predicted introns are due to annotation errors reflecting apparent mistakes in the E. histolytica genome assembly. These results underscore the massive differences in intron gain rates through evolution.

Key Words: intron gain • genome complexity • genome annotation • lateral gene transfer • parasite evolution

Although common to all eukaryotic species, spliceosomal intron number varies tremendously across eukaryotes, from only 3 characterized introns in Giardia lamblia to more than 8 introns per gene in vertebrates (compiled in Jeffares et al. 2006; Roy and Gilbert 2006). Patterns of intron gain and loss also show striking variations and often perplexing patterns (e.g., Bon et al. 2003; Perumal et al. 2005; Rodríguez-Trelles et al. 2006 provide an excellent recent review). For instance, intron-rich taxa often show very low rates of intron gain and/or high rates of loss (Seo et al. 2001; Rogozin et al. 2003; Roy et al. 2003; Cho et al. 2004; Raible et al. 2005; Roy and Gilbert 2005b; Stajich and Dietrich 2006). Some groups show high degrees of both intron loss and gain; others exhibit almost no loss or gain over very long periods of time (Seo et al. 2001; Roy et al. 2003; Edvardsen et al. 2004; Roy and Hartl 2006; Stajich and Dietrich 2006). However, attempts to estimate rates of intron loss and gain and to infer the relative importance of the 2 processes have been thwarted by lack of consensus over appropriate evolutionary assumptions, with different groups sometimes reaching very different conclusions from the same data set (Rogozin et al. 2003; Babenko et al. 2004; Nielsen et al. 2004; Qiu et al. 2004; Csurös 2005; Nguyen et al. 2005; Roy and Gilbert 2005a, 2005b).

Here we take a novel approach. We studied 96 genes from the moderately intron-dense parasitic amoeba Entamoeba histolytica (0.3 introns per gene on average) that were previously identified by phylogenetic analysis to represent lateral gene transfers (LGTs) from prokaryotes to Entamoeba (Loftus et al. 2005). Such genes were presumably intronless at the time of LGT, allowing confident inferences about intron gain. Sequence searches showed that 90/96 LGTs are present in the reptile parasite Entamoeba invadens and thus predate the E. histolytica–E. invadens divergence ~50.5 ± 13.5 MYA based on conservative assumptions (see Methods).

Strikingly, 11/96 (11.5%) of these LGTs were predicted to have introns (table 1). However, investigation showed that 9/11 predicted introns reflected annotation errors. In 7 cases, comparison between the genome assembly and individual sequence reads identified assembly errors (in each case a single base-pair indel relative to sequence reads). In each case, correction yielded a single long open reading frame (ORF) between the predicted start and stop codons, arguing against intron presence. In 6 out of 7 cases, homologous E. invadens sequences were obtained; in each case, the corresponding sequence also appeared exonic (multiple of 3 bases and no stop codons).
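The "appears exonic" criterion used above (length a multiple of 3 and no stop codons) can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the function names and the toy sequences are hypothetical, and the standard stop codons (TAA, TAG, TGA) are assumed.

```python
# Minimal sketch of the "appears exonic" test described in the text.
# Assumption: standard stop codons; sequences below are hypothetical toys,
# not Entamoeba data.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_stops(seq, frame=0):
    """Return 0-based start positions of stop codons read in the given frame."""
    seq = seq.upper()
    return [i for i in range(frame, len(seq) - 2, 3)
            if seq[i:i + 3] in STOP_CODONS]


def appears_exonic(seq, frame=0):
    """True if the length is a multiple of 3 and no in-frame stop codon occurs."""
    return len(seq) % 3 == 0 and not in_frame_stops(seq, frame)


print(appears_exonic("GCTGAAAAATTGGCA"))  # 15 bases, no in-frame stop
print(appears_exonic("GCTTAAAAATTGGCA"))  # TAA at position 3 in frame 0
print(appears_exonic("GCTGAAAAATTGGC"))   # 14 bases, not a multiple of 3
```

In practice the frame would be taken from the flanking coding sequence, so that a stop codon is only counted when it would interrupt the ORF that runs through the candidate region.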



 
Table 1 Probable LGTs with Annotated Introns and the Conclusions Drawn from the Analyses Reported Here

 
In another case, a homologous E. histolytica mRNA from GenBank (AAA81906.1) had a single base-pair indel relative to the predicted gene (which fell within the predicted intron) and an intronless gene structure spanning the 3' terminus of the predicted intron and the downstream exon (fig. 1A). In yet another case, amino acid–level similarity to homologous sequences from bacteria and E. invadens continues through the intron (fig. 1B), suggesting that the predicted intronic sequence is in fact exonic. Interestingly, the E. histolytica sequence but not the corresponding E. invadens sequence contains a single in-frame stop codon, which is confirmed by individual sequencing reads. Whether this apparent gene truncation occurred in natural populations or in the lab is unknown.


 
FIG. 1. Three examples of LGTs predicted to contain introns. (A) 5' alignment of predicted Entamoeba histolytica gene 328.00056 ("Predict") and an E. histolytica GB mRNA (GenBank accession number AAA81906.1). The GB mRNA contains an extra cytosine (arrow) relative to the predicted gene, uses an alternative start codon (underlined), and does not reflect a splicing event. Upper/lowercase indicates exonic/intronic sequence. (B) E. histolytica gene 13.m00321 and homologs. The supposedly intronic sequence (lower case bold) shows strong coding-level sequence similarity to a bacterial homolog (43% amino acid identity; Moorella thermoacetica gene, GenBank accession number ABC19526.1) and to the apparent Entamoeba invadens homolog (57% identity), suggesting that it is a coding sequence, not an intron. (C) E. histolytica gene 8.m00343 and homologs from D. discoideum (GenBank accession number XP_629020) and E. invadens. Gray boxes indicate intron positions.

 
Thus, only 2 genes showed evidence of intron presence. One has a close homolog in Dictyostelium discoideum, which shares the intron (fig. 1C). The D. discoideum–Entamoeba divergence represents a deep split within amoebozoa, thus this gene is either a very old LGT or is not an LGT at all (fig. 1C). This leaves a single intron in the 2-phosphosulfolactate phosphatase gene. The predicted 226-codon gene contains a single 53-bp intron with 79.2% AT content. The intronic sequence is not a multiple of 3 bases and contains 6 stop codons falling in all 3 frames and is thus almost certainly an intron (fig. 2A). Both upstream and downstream exons show coherent homology to bacterial homologs, suggesting that the intron was inserted into previously contiguous coding sequence (fig. 2B). The gene is absent from D. discoideum, available Acanthamoeba castellanii genomic sequence, and other eukaryotes in GenBank, supporting its lateral transfer. However, the intron is shared with E. invadens (fig. 2A and B), and thus the intron gain predates the E. histolytica–E. invadens divergence. This intron represents the first reported case of intron gain in an amoeba.
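The three lines of evidence cited for the 53-bp intron (AT content, length not a multiple of 3, stop codons in all 3 frames) are simple to compute. The following Python sketch shows one way to do so. The helper names and the 20-base toy sequence are hypothetical and are not the actual intron sequence from fig. 2A; standard stop codons are assumed.

```python
# Sketch of the sequence statistics used to argue that a region is intronic.
# Assumptions: standard stop codons; `toy` is a made-up sequence.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def at_content(seq):
    """Fraction of A and T bases in the sequence."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


def stops_per_frame(seq):
    """Count stop codons in each of the 3 forward reading frames."""
    seq = seq.upper()
    return {frame: sum(seq[i:i + 3] in STOP_CODONS
                       for i in range(frame, len(seq) - 2, 3))
            for frame in range(3)}


def looks_intronic(seq):
    """Length is not a multiple of 3 and every frame holds at least one stop."""
    counts = stops_per_frame(seq)
    return len(seq) % 3 != 0 and all(counts[f] > 0 for f in range(3))


toy = "GTAAGTTAGTTTGATTTTAG"   # hypothetical 20-base example
print(at_content(toy))         # 15 of 20 bases are A or T
print(stops_per_frame(toy))
print(looks_intronic(toy))
```

For reference, an AT content of 79.2% over 53 bp corresponds to 42 A or T bases (42/53 ≈ 0.792).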


 
FIG. 2. Apparent intron insertion in the Entamoeba histolytica 2-phosphosulfolactate phosphatase (87.m00169). (A) Intron and flanking exonic sequence for 4 Entamoeba species. Upper/lowercase indicates exonic/intronic sequence. Stop codons in the frame of the upstream and downstream coding sequences are shown (bold). For Entamoeba terrapinae, only the downstream exon and part of the intron sequence were available. E. hist, E. mosh, E. inva, and E. terr indicate E. histolytica, E. moshkovskii, E. invadens, and E. terrapinae, respectively. (B) Alignment with homologous bacterial genes (ClustalW, default parameters). Asterisks indicate positions at which there is identity between a bacterial gene and an Entamoeba gene. The gray box indicates the intron position. T. teng, T. meri, and C. perf indicate genes from Thermoanaerobacter tengcongensis (GenBank accession number AAM25151.1), Thermotoga maritima (GenBank accession number AAD35879.1), and Clostridium perfringens (GenBank accession number BAB82262.1), respectively.

 
We found no intron gains in 90 LGTs in ~50.5 ± 13.5 Myr, suggesting a rate of intron gain of less than 0.00022 ± 0.00006 intron gains per gene per Myr or one gain per gene per 4.5 ± 1.2 billion years. Importantly, this conclusion holds even if some of the genes are not actual LGTs because regardless of the genes' origin, no intron gains are found in ~50 Myr. It is unlikely that many gained introns have been subsequently lost because even assuming the highest loss rates ever estimated (~2.2 × 10⁻⁹ per year; Roy and Gilbert 2005b) only 10% of introns are expected to be lost over 50 Myr. This low rate of gain is not consistent with high intron numbers in diverse modern eukaryotes (e.g., 37.8 billion years would be required to reach the 8.4 introns per gene found in Homo sapiens) or with the apparently high intron numbers already present relatively early in eukaryotic evolution (Csurös 2005; Nguyen et al. 2005; Roy and Gilbert 2005a), implying that rates of intron creation have varied significantly through evolution (see Roy and Gilbert 2005b for a more thorough discussion).
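The figures in the preceding paragraph follow from short arithmetic, reproduced here as a Python sketch. All inputs are the values quoted in the text; the variable names are ours, and the first-order error propagation and the exponential loss model are assumptions about how the reported uncertainties and the ~10% loss figure could be obtained.

```python
import math

# Inputs quoted in the text.
n_genes = 90                 # LGTs shared with E. invadens
t_myr, t_err = 50.5, 13.5    # divergence time and its uncertainty (Myr)

# Total observation time across genes, in gene-Myr.
gene_myr = n_genes * t_myr                   # 4545

# Upper bound on the gain rate: fewer than one gain over the whole window.
rate_bound = 1 / gene_myr                    # gains per gene per Myr
rate_err = rate_bound * t_err / t_myr        # first-order propagation (assumed)

# Equivalent waiting time for one gain in a single gene, in billions of years.
byr_per_gain = gene_myr / 1000
byr_err = n_genes * t_err / 1000

# Expected fraction of introns lost in 50 Myr at the highest estimated loss
# rate, assuming a constant per-intron rate (exponential decay).
frac_lost = 1 - math.exp(-2.2e-9 * 50e6)

# Time to accumulate 8.4 introns per gene at the bounding rate.
byr_to_human_rounded = 8.4 * 4.5             # with the rounded 4.5 Byr, as in the text
byr_to_human = 8.4 * byr_per_gain            # without intermediate rounding

print(f"rate < {rate_bound:.5f} ± {rate_err:.5f} gains per gene per Myr")
print(f"one gain per {byr_per_gain:.1f} ± {byr_err:.1f} billion years")
print(f"fraction lost in 50 Myr: {frac_lost:.2f}")
print(f"time to 8.4 introns per gene: {byr_to_human_rounded:.1f} Byr "
      f"({byr_to_human:.1f} Byr unrounded)")
```

The 37.8 billion year figure corresponds to multiplying by the rounded value of 4.5 billion years; carrying the unrounded 4.545 gives about 38.2, which does not change the conclusion.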

Genome-wide studies of closely related species indicate very low rates of intron gain of less than one per gene per 1.5 billion years in animals, fungi, plants, apicomplexans (Roy et al. 2003; Coghlan and Wolfe 2004; Nielsen et al. 2004; Lin et al. 2006; Roy and Hartl 2006; Stajich and Dietrich 2006), and now amoebozoa. Only a single genome-wide study, in A. thaliana, shows a higher rate, though, as the authors of that manuscript concede, some reported gains may in fact represent losses, and their data warrant further study (Knowles and McLysaght 2006). These modern rates are too low to explain modern and estimated ancestral intron densities (Fedorov et al. 2002; Csurös 2005; Roy and Gilbert 2005b), implying much higher rates of intron creation during some earlier period(s) of evolution (Fedorov et al. 2003). To explain this pattern, we will need to better understand the evolutionary forces governing intron gain and loss.


    Methods
 
We downloaded the E. histolytica genome gbk files (version 1) from National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/) and extracted exon and intron sequences. For each of the 11 LGT genes that were predicted to contain introns, we performed BlastN searches of the corresponding genomic region against all E. histolytica reads in the NCBI Trace Archive and compared the assembled sequence with the best hit. For cases in which reads and assembly agreed, we performed TBlastN searches at NCBI for corresponding Entamoeba and A. castellanii sequences and searched NCBI and the D. discoideum genome project for corresponding sequences from other amoebae. TBlastN searches against available genome sequence from other Entamoeba species were performed online (http://www.sanger.ac.uk/Projects/Comp_Entamoeba/). A TBlastN search of the E. histolytica 2-phosphosulfolactate phosphatase sequence against all eukaryotic sequences in GenBank yielded no non-Entamoeba sequences. To estimate dS between E. invadens and E. histolytica, we downloaded available E. invadens mRNAs in GenBank and excluded those that did not begin with "ATG" or did not end with a stop codon. Reciprocal BlastP searches against the E. histolytica predicted proteome identified 10 putative ortholog pairs with strong amino acid–sequence identity (>40%). Sequences were aligned in ClustalX using default parameters, and average dS and confidence intervals (CI) across genes were calculated using PAUP*4.0 using a general time reversible substitution model estimated from the data set (Lanave et al. 1984). Although mutation rates for amoebae have not been estimated, conservatively assuming the highest estimates of which we are aware for any unicellular eukaryote (around 5 × 10⁻⁹ per year, in Plasmodium; Castillo-Davis et al. 2004; Tanabe et al. 2004; Neafsey et al. 2005) yields an estimate of 50.5 ± 13.5 Myr.
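Two steps of the dating procedure can be illustrated with a short Python sketch: the filter that keeps only mRNAs with a start and a stop codon, and the conversion of dS into a divergence time. The function names are hypothetical. The relation T = dS / (2μ) is the usual pairwise molecular-clock formula and is an assumption here, since the text does not spell out the conversion. The dS value in the example is likewise hypothetical, chosen only to show the arithmetic; the average dS is not reported in the text.

```python
# Sketch of two Methods steps. Assumptions: standard stop codons, and the
# pairwise molecular-clock relation T = dS / (2 * mu), which the text does
# not state explicitly.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_complete_cds(mrna):
    """Keep sequences that begin with ATG and end with a stop codon."""
    s = mrna.upper()
    return len(s) >= 6 and s.startswith("ATG") and s[-3:] in STOP_CODONS


def divergence_time_myr(ds, mu_per_year):
    """Divergence time in Myr from synonymous divergence and mutation rate."""
    return ds / (2 * mu_per_year) / 1e6


# Hypothetical toy sequences, not GenBank records.
candidates = ["ATGGCTTAA", "GCTGCTTAA", "ATGGCTGCT"]
print([s for s in candidates if is_complete_cds(s)])

# Hypothetical dS, used only to illustrate the conversion at the
# 5e-9 per year rate quoted in the text.
print(divergence_time_myr(ds=0.505, mu_per_year=5e-9))
```

Because the assumed mutation rate is the highest available estimate, a lower true rate would give an older divergence and therefore an even lower bound on the rate of intron gain.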


    Acknowledgements
 
We thank Warwick Allen for help formatting the figures. MI was supported by funds from Fundacion Caixa Galicia.


    Footnotes
 
Martin Embley, Associate Editor


    References
 

    Babenko VN, Rogozin IB, Mekhedov SL, Koonin EV. 2004. Prevalence of intron gain over intron loss in the evolution of paralogous gene families. Nucleic Acids Res 32:3724–33.

    Bon E, Casaregola S, Blandin G, et al. (11 co-authors). 2003. Molecular evolution of eukaryotic genomes: hemiascomycetous yeast spliceosomal introns. Nucleic Acids Res 31:1121–35.

    Castillo-Davis CI, Bedford TB, Hartl DL. 2004. Accelerated rates of intron gain/loss and protein evolution in duplicate genes in human and mouse malaria parasites. Mol Biol Evol 21:1422–7.

    Cho S, Jin SW, Cohen A, Ellis RE. 2004. A phylogeny of Caenorhabditis reveals frequent loss of introns during nematode evolution. Genome Res 14:1207–20.

    Coghlan A, Wolfe KH. 2004. Origins of recently gained introns in Caenorhabditis. Proc Natl Acad Sci USA 101:11362–7.

    Csurös M. 2005. Likely scenarios of intron evolution. Third RECOMB satellite workshop on comparative genomics. Springer LNCS 3678. p 47–60.

    Edvardsen RB, Lerat E, Maeland AD, Flat M, Tewari R, Jensen MF, Lehrach H, Reinhardt R, Seo HC, Chourrout D. 2004. Hypervariable and highly divergent intron/exon organizations in the chordate Oikopleura dioica. J Mol Evol 59:448–57.

    Fedorov A, Merican AF, Gilbert W. 2002. Large-scale comparison of intron positions among animal, plant, and fungal genes. Proc Natl Acad Sci USA 99:16128–33.

    Fedorov A, Roy S, Fedorova L, Gilbert W. 2003. Mystery of intron gain. Genome Res 13:2236–41.

    Jeffares DC, Mourier T, Penny D. 2006. The biology of intron gain and loss. Trends Genet 22:16–22.

    Knowles DG, McLysaght A. 2006. High rate of recent intron gain and loss in simultaneously duplicated Arabidopsis genes. Mol Biol Evol 23:1548–57.

    Lanave C, Preparata G, Saccone C, Serio G. 1984. A new method for calculating evolutionary substitution rates. J Mol Evol 20:86–93.

    Lin H, Zhu W, Silva JC, Gu X, Buell CR. 2006. Intron gain and loss in segmentally duplicated genes in rice. Genome Biol 7:R41.

    Loftus B, Anderson I, Davies R, et al. (54 co-authors). 2005. The genome of the protist parasite Entamoeba histolytica. Nature 433:865–8.

    Neafsey DE, Hartl DL, Berriman M. 2005. Evolution of noncoding and silent coding sites in the Plasmodium falciparum and Plasmodium reichenowi genomes. Mol Biol Evol 22:1621–6.

    Nguyen HD, Yoshihama M, Kenmochi N. 2005. New maximum likelihood estimators for eukaryotic intron evolution. PLoS Comput Biol 1:e79.

    Nielsen CB, Friedman B, Birren B, Burge CB, Galagan JE. 2004. Patterns of intron gain and loss in fungi. PLoS Biol 2:e422.

    Perumal BS, Sakharkar KR, Chow VT, Pandjassarame K, Sakharkar MK. 2005. Intron position conservation across eukaryotic lineages in tubulin genes. Front Biosci 10:2412–9.

    Qiu WG, Schisler N, Stoltzfus A. 2004. The evolutionary gain of spliceosomal introns: sequence and phase preferences. Mol Biol Evol 21:1252–63.

    Raible F, Tessmar-Raible K, Osoegawa K, et al. (12 co-authors). 2005. Vertebrate-type intron-rich genes in the marine annelid Platynereis dumerilii. Science 310:1325–6.

    Rodríguez-Trelles F, Tarrío R, Ayala FJ. 2006. Origin and evolution of spliceosomal introns. Annu Rev Genet 40:47–76.

    Rogozin IB, Wolf YI, Sorokin AV, Mirkin BG, Koonin EV. 2003. Remarkable interkingdom conservation of intron positions and massive, lineage-specific intron loss and gain in eukaryotic evolution. Curr Biol 13:1512–7.

    Roy SW, Fedorov A, Gilbert W. 2003. Large-scale comparison of intron positions in mammalian genes shows intron loss but no gain. Proc Natl Acad Sci USA 100:7158–62.

    Roy SW, Gilbert W. 2005a. Complex early genes. Proc Natl Acad Sci USA 102:1986–91.

    Roy SW, Gilbert W. 2005b. Rates of intron loss and gain: implications for early eukaryotic evolution. Proc Natl Acad Sci USA 102:5773–8.

    Roy SW, Gilbert W. 2006. The evolution of spliceosomal introns: patterns, puzzles and progress. Nat Rev Genet 7:211–21.

    Roy SW, Hartl DL. 2006. Very little intron loss/gain in Plasmodium: intron loss/gain mutation rates and intron number. Genome Res 16:750–6.

    Seo H-C, Kube M, Edvardsen RB, et al. (11 co-authors). 2001. Miniature genome in the marine chordate Oikopleura dioica. Science 294:2506.

    Stajich JE, Dietrich FS. 2006. Evidence of mRNA-mediated intron loss in the human-pathogenic fungus Cryptococcus neoformans. Eukaryotic Cell 5:789–93.

    Tanabe K, Sakihama N, Hattori T, Ranford-Cartwright L, Goldman I, Escalante AA, Lal AA. 2004. Genetic distance in housekeeping genes between Plasmodium falciparum and Plasmodium reichenowi and within P. falciparum. J Mol Evol 59:687–94.

Accepted for publication July 12, 2006.





