MBE Advance Access originally published online on July 17, 2006
Molecular Biology and Evolution 2006 23(10):1824-1827; doi:10.1093/molbev/msl061
Letter
Very Little Intron Gain in Entamoeba histolytica Genes Laterally Transferred from Prokaryotes
Allan Wilson Centre for Molecular Ecology and Evolution, Massey University, Palmerston North, New Zealand
E-mail: scottwroy{at}gmail.com.
| Abstract |
|---|
The evolution of spliceosomal introns remains intensely debated. We studied 96 Entamoeba histolytica genes previously identified as having been laterally transferred from prokaryotes, which were presumably intronless at the time of transfer. Ninety out of the 96 are also present in the reptile parasite Entamoeba invadens, indicating lateral transfer before the species' divergence ~50 MYA. We find only 2 introns, both shared with E. invadens. Thus, no intron gains have occurred in ~50 Myr, implying a very low rate of intron gain of less than one gain per gene per ~4.5 billion years. Nine other predicted introns are due to annotation errors reflecting apparent mistakes in the E. histolytica genome assembly. These results underscore the massive differences in intron gain rates through evolution.
Key Words: intron gain; genome complexity; genome annotation; lateral gene transfer; parasite evolution
Although spliceosomal introns are common to all eukaryotic species, their number varies tremendously across eukaryotes, from only 3 characterized introns in Giardia lamblia to more than 8 introns per gene in vertebrates (compiled in Jeffares et al. 2006; Roy and Gilbert 2006). Intron gain and loss also show striking variation and often perplexing patterns (e.g., Bon et al. 2003; Perumal et al. 2005; Rodríguez-Trelles et al. 2006 provide an excellent recent review). For instance, intron-rich taxa often show very low rates of intron gain and/or high rates of loss (Seo et al. 2001; Rogozin et al. 2003; Roy et al. 2003; Cho et al. 2004; Raible et al. 2005; Roy and Gilbert 2005b; Stajich and Dietrich 2006). Some groups show high degrees of both intron loss and gain; others exhibit almost no loss or gain over very long periods of time (Seo et al. 2001; Roy et al. 2003; Edvardsen et al. 2004; Roy and Hartl 2006; Stajich and Dietrich 2006). However, attempts to estimate rates of intron loss and gain and to infer the relative importance of the 2 processes have been thwarted by lack of consensus over appropriate evolutionary assumptions, with different groups sometimes reaching very different conclusions from the same data set (Rogozin et al. 2003; Babenko et al. 2004; Nielsen et al. 2004; Qiu et al. 2004; Csurös 2005; Nguyen et al. 2005; Roy and Gilbert 2005a, 2005b).
Here we take a novel approach. We studied 96 genes from the moderately intron-dense parasitic amoeba Entamoeba histolytica (0.3 introns per gene on average) that were previously identified by phylogenetic analysis as lateral gene transfers (LGTs) from prokaryotes to Entamoeba (Loftus et al. 2005). Such genes were presumably intronless at the time of LGT, allowing confident inferences about intron gain. Sequence searches showed that 90/96 LGTs are present in the reptile parasite Entamoeba invadens and thus predate the E. histolytica–E. invadens divergence ~50.5 ± 13.5 MYA based on conservative assumptions (see Methods).
Strikingly, 11/90 (11.5%) of these LGTs were predicted to have introns (table 1). However, investigation showed that 9/11 predicted introns reflected annotation errors. In 7 cases, comparison between the genome assembly and individual sequence reads identified assembly errors (in each case, a single base-pair indel relative to sequence reads). In each case, correction yielded a single long open reading frame (ORF) between the predicted start and stop codons, arguing against intron presence. In 6 of the 7 cases, homologous E. invadens sequences were obtained; in each, the corresponding sequence also appeared exonic (a multiple of 3 bases in length and free of stop codons).
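The ORF criterion used for the assembly-corrected genes can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `is_single_orf`, the toy sequences, and the use of the standard genetic code's stop codons are all assumptions made for illustration.

```python
# Stop codons of the standard genetic code (an assumption; the source
# does not state which translation table was used).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_single_orf(cds):
    """Return True if cds reads as one uninterrupted ORF from ATG to a final stop."""
    cds = cds.upper()
    # A complete ORF needs at least a start and a stop codon, in frame.
    if len(cds) < 6 or len(cds) % 3 != 0:
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    starts_ok = codons[0] == "ATG"
    ends_ok = codons[-1] in STOP_CODONS
    no_internal_stop = not any(c in STOP_CODONS for c in codons[1:-1])
    return starts_ok and ends_ok and no_internal_stop


# Toy sequences (invented, not E. histolytica data): the "assembled"
# version carries one extra base relative to the "corrected" version.
assembled = "ATGGCTTGAAACGTAA"
corrected = "ATGGCTGAAACGTAA"

print(is_single_orf(assembled), is_single_orf(corrected))
```

In this toy case a single-base indel is enough to break the reading frame, which is how an assembly error of that kind can lead a gene predictor to call a spurious intron.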
In another case, a homologous E. histolytica mRNA from GenBank (AAA81906.1) had a single base-pair indel relative to the predicted gene (falling within the predicted intron) and an intronless gene structure spanning the 3' terminus of the predicted intron and the downstream exon (fig. 1A). In yet another case, amino acid–level similarity to homologous sequences from bacteria and E. invadens continues through the intron (fig. 1B), suggesting that the predicted intronic sequence is in fact exonic. Interestingly, the E. histolytica sequence, but not the corresponding E. invadens sequence, contains a single in-frame stop codon, which is confirmed by individual sequencing reads. Whether this apparent gene truncation occurred in natural populations or in the lab is unknown.
Thus, only 2 genes showed evidence of intron presence. One has a close homolog in Dictyostelium discoideum, which shares the intron (fig. 1C). The D. discoideum–Entamoeba divergence represents a deep split within Amoebozoa; thus, this gene is either a very old LGT or not an LGT at all (fig. 1C). This leaves a single intron in the 2-phosphosulfolactate phosphatase gene. The predicted 226-codon gene contains a single 53-bp intron with 79.2% AT content. The intronic sequence is not a multiple of 3 bases in length and contains 6 stop codons falling in all 3 frames; it is thus almost certainly an intron (fig. 2A). Both upstream and downstream exons show coherent homology to bacterial homologs, suggesting that the intron was inserted into previously contiguous coding sequence (fig. 2B). The gene is absent from D. discoideum, available Acanthamoeba castellanii genomic sequence, and other eukaryotes in GenBank, supporting its lateral transfer. However, the intron is shared with E. invadens (fig. 2A and B), and thus the intron gain predates the E. histolytica–E. invadens divergence. This intron represents the first reported case of intron gain in an amoeba.
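The three sequence properties cited for the 53-bp intron (length not a multiple of 3, stop codons in every frame, high AT content) are easy to compute. The sketch below is illustrative only: the function names and the toy sequences are hypothetical, the real intron sequence is not reproduced here, and the standard genetic code is assumed.

```python
# Stop codons of the standard genetic code (assumed).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def at_content(seq):
    """Fraction of bases that are A or T."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def stops_by_frame(seq):
    """Count stop codons in each of the 3 forward reading frames."""
    seq = seq.upper()
    counts = []
    for frame in range(3):
        # Step through complete codons only, starting at the frame offset.
        codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
        counts.append(sum(codon in STOP_CODONS for codon in codons))
    return counts


def summarize(seq):
    """Collect the features used to judge whether a segment looks intronic."""
    return {
        "length_multiple_of_3": len(seq) % 3 == 0,
        "stops_by_frame": stops_by_frame(seq),
        "at_content": at_content(seq),
    }


# Toy sequences (invented for illustration).
intron_like = "TAAATAGATGA"
exon_like = "ATGGCTGCA"

print(summarize(intron_like))
print(summarize(exon_like))
```

A segment whose length is a multiple of 3 and which lacks stop codons in the expected frame remains compatible with being exonic, which is the reverse of the argument made for the 2-phosphosulfolactate phosphatase intron.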
We found no intron gains in 90 LGTs in ~50.5 ± 13.5 Myr, suggesting a rate of intron gain of less than 0.00022 ± 0.00006 intron gains per gene per Myr, or one gain per gene per 4.5 ± 1.2 billion years. Importantly, this conclusion holds even if some of the genes are not actual LGTs because, regardless of the genes' origin, no intron gains are found in ~50 Myr. It is unlikely that many gained introns have been subsequently lost because even assuming the highest loss rates ever estimated (~2.2 × 10^-9 per year; Roy and Gilbert 2005b
Genome-wide studies of closely related species indicate very low rates of intron gain of less than one per gene per 1.5 billion years in animals, fungi, plants, apicomplexans (Roy et al. 2003; Coghlan and Wolfe 2004; Nielsen et al. 2004; Lin et al. 2006; Roy and Hartl 2006; Stajich and Dietrich 2006), and now amoebozoa. Only a single genome-wide study, in A. thaliana, shows a higher rate, though, as the authors of that manuscript concede, some reported gains may in fact represent losses, and their data warrant further study (Knowles and McLysaght 2006). These modern rates are too low to explain modern and estimated ancestral intron densities (Fedorov et al. 2002; Csurös 2005; Roy and Gilbert 2005b), implying much higher rates of intron creation during some earlier period(s) of evolution (Fedorov et al. 2003). To explain this pattern, we will need to better understand the evolutionary forces governing intron gain and loss.
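The upper bound on the gain rate quoted above (no gains in 90 LGTs over ~50.5 ± 13.5 Myr) follows from simple arithmetic, sketched here in Python. The variable names are illustrative, and the first-order propagation of the divergence-time uncertainty is an assumption about how the ± values arise; the text does not state the method used.

```python
# Values taken from the text.
n_genes = 90          # LGT genes shared with E. invadens
t_myr = 50.5          # estimated divergence time, in Myr
t_err_myr = 13.5      # uncertainty on the divergence time, in Myr

# Total evolutionary time surveyed, in gene-Myr.
gene_myr = n_genes * t_myr

# With zero observed gains, the rate is below one gain per surveyed gene-Myr.
rate_bound = 1.0 / gene_myr

# Assumption: the relative error on the time carries over unchanged
# (first-order propagation) to the rate and to its reciprocal.
rel_err = t_err_myr / t_myr
rate_err = rate_bound * rel_err

# Reciprocal form: billions of years per gain per gene.
gyr_per_gain = gene_myr / 1000.0
gyr_err = gyr_per_gain * rel_err

print(f"rate < {rate_bound:.5f} +/- {rate_err:.5f} gains per gene per Myr")
print(f"i.e. one gain per gene per {gyr_per_gain:.1f} +/- {gyr_err:.1f} billion years")
```

Under these assumptions the sketch gives a bound of about 0.00022 ± 0.00006 gains per gene per Myr, or one gain per gene per roughly 4.5 ± 1.2 billion years, matching the figures in the text.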
| Methods |
|---|
We downloaded the E. histolytica genome gbk files (version 1) from the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/) and extracted exon and intron sequences. For each of the 11 LGT genes that were predicted to contain introns, we performed BlastN searches of the corresponding genomic region against all E. histolytica reads in the NCBI Trace Archive and compared the assembled sequence with the best hit. For cases in which reads and assembly agreed, we performed TBlastN searches at NCBI for corresponding Entamoeba and A. castellanii sequences and searched NCBI and the D. discoideum genome project for corresponding sequences from other amoebae. TBlastN searches against available genome sequence from other Entamoeba species were performed online (http://www.sanger.ac.uk/Projects/Comp_Entamoeba/). A TBlastN search of the E. histolytica 2-phosphosulfolactate phosphatase sequence against all eukaryotic sequences in GenBank yielded no non-Entamoeba sequences. To estimate dS between E. invadens and E. histolytica, we downloaded available E. invadens mRNAs in GenBank and excluded those not beginning with "ATG" or ending with a stop codon. Reciprocal BlastP searches against the E. histolytica predicted proteome identified 10 putative ortholog pairs with strong amino acid sequence identity (>40%). Sequences were aligned in ClustalX using default parameters, and average dS and confidence intervals (CI) across genes were calculated using PAUP*4.0 using a general time reversible substitution model estimated from the data set (Lanave et al. 1984
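The reciprocal-best-hit step described above can be sketched in a few lines of Python. This is a simplified illustration, not the authors' pipeline: it assumes the best BlastP hit for each query has already been parsed into a dictionary, and the function name, input layout, and sequence identifiers are all hypothetical.

```python
def reciprocal_best_hits(a_to_b, b_to_a, min_identity=40.0):
    """Return (a, b) pairs that are each other's best hit above an identity cutoff.

    a_to_b and b_to_a map a query ID to a tuple of
    (best-hit ID, percent amino acid identity).
    """
    pairs = []
    for a, (b, identity) in a_to_b.items():
        back = b_to_a.get(b)
        # Keep the pair only if b's best hit points back to a
        # and the identity exceeds the cutoff (>40% in the text).
        if back is not None and back[0] == a and identity > min_identity:
            pairs.append((a, b))
    return sorted(pairs)


# Hypothetical best hits (IDs and identities invented for illustration).
ei_to_eh = {
    "Ei_mRNA1": ("Eh_prot1", 62.0),
    "Ei_mRNA2": ("Eh_prot2", 35.0),   # reciprocal, but below the cutoff
    "Ei_mRNA3": ("Eh_prot1", 55.0),   # not reciprocal
}
eh_to_ei = {
    "Eh_prot1": ("Ei_mRNA1", 61.0),
    "Eh_prot2": ("Ei_mRNA2", 36.0),
}

print(reciprocal_best_hits(ei_to_eh, eh_to_ei))
```

Requiring the hit in both directions guards against pairing a gene with a paralog rather than its ortholog, which matters when the pairs are then used to estimate dS.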
| Acknowledgements |
|---|
We thank Warwick Allen for help formatting the figures. MI was supported by funds from Fundacion Caixa Galicia.
| Footnotes |
|---|
Martin Embley, Associate Editor
| References |
|---|
Babenko VN, Rogozin IB, Mekhedov SL, Koonin EV. 2004. Prevalence of intron gain over intron loss in the evolution of paralogous gene families. Nucleic Acids Res 32:3724–33.
Bon E, Casaregola S, Blandin G, et al. (11 co-authors). 2003. Molecular evolution of eukaryotic genomes: hemiascomycetous yeast spliceosomal introns. Nucleic Acids Res 31:1121–35.
Castillo-Davis CI, Bedford TB, Hartl DL. 2004. Accelerated rates of intron gain/loss and protein evolution in duplicate genes in human and mouse malaria parasites. Mol Biol Evol 21:1422–7.
Cho S, Jin SW, Cohen A, Ellis RE. 2004. A phylogeny of Caenorhabditis reveals frequent loss of introns during nematode evolution. Genome Res 14:1207–20.
Coghlan A, Wolfe KH. 2004. Origins of recently gained introns in Caenorhabditis. Proc Natl Acad Sci USA 101:11362–7.
Csurös M. 2005. Likely scenarios of intron evolution. Third RECOMB satellite workshop on comparative genomics. Springer LNCS 3678. p 47–60.
Edvardsen RB, Lerat E, Maeland AD, Flat M, Tewari R, Jensen MF, Lehrach H, Reinhardt R, Seo HC, Chourrout D. 2004. Hypervariable and highly divergent intron/exon organizations in the chordate Oikopleura dioica. J Mol Evol 59:448–57.
Fedorov A, Merican AF, Gilbert W. 2002. Large-scale comparison of intron positions among animal, plant, and fungal genes. Proc Natl Acad Sci USA 99:16128–33.
Fedorov A, Roy S, Fedorova L, Gilbert W. 2003. Mystery of intron gain. Genome Res 13:2236–41.
Jeffares DC, Mourier T, Penny D. 2006. The biology of intron gain and loss. Trends Genet 22:16–22.
Knowles DG, McLysaght A. 2006. High rate of recent intron gain and loss in simultaneously duplicated Arabidopsis genes. Mol Biol Evol 23:1548–57.
Lanave C, Preparata G, Saccone C, Serio G. 1984. A new method for calculating evolutionary substitution rates. J Mol Evol 20:86–93.
Lin H, Zhu W, Silva JC, Gu X, Buell CR. 2006. Intron gain and loss in segmentally duplicated genes in rice. Genome Biol 7:R41.
Loftus B, Anderson I, Davies R, et al. (54 co-authors). 2005. The genome of the protist parasite Entamoeba histolytica. Nature 433:865–8.
Neafsey DE, Hartl DL, Berriman M. 2005. Evolution of noncoding and silent coding sites in the Plasmodium falciparum and Plasmodium reichenowi genomes. Mol Biol Evol 22:1621–6.
Nguyen HD, Yoshihama M, Kenmochi N. 2005. New maximum likelihood estimators for eukaryotic intron evolution. PLoS Comput Biol 1:e79.
Nielsen CB, Friedman B, Birren B, Burge CB, Galagan JE. 2004. Patterns of intron gain and loss in fungi. PLoS Biol 2:e422.
Perumal BS, Sakharhar KR, Chow VT, Pandjassarame K, Sakharkar MK. 2005. Intron position conservation across eukaryotic lineages in tubulin genes. Front Biosci 10:2412–9.
Qiu WG, Schisler N, Stoltzfus A. 2004. The evolutionary gain of spliceosomal introns: sequence and phase preferences. Mol Biol Evol 21:1252–63.
Raible F, Tessmar-Raible K, Osoegawa K, et al. (12 co-authors). 2005. Vertebrate-type intron-rich genes in the marine annelid Platynereis dumerilii. Science 310:1325–6.
Rodríguez-Trelles F, Tarrío R, Ayala FJ. 2006. Origin and evolution of spliceosomal introns. Annu Rev Genet 40:47–76.
Rogozin IB, Wolf YI, Sorokin AV, Mirkin BG, Koonin EV. 2003. Remarkable interkingdom conservation of intron positions and massive, lineage-specific intron loss and gain in eukaryotic evolution. Curr Biol 13:1512–7.
Roy SW, Fedorov A, Gilbert W. 2003. Large-scale comparison of intron positions in mammalian genes shows intron loss but no gain. Proc Natl Acad Sci USA 100:7158–62.
Roy SW, Gilbert W. 2005a. Complex early genes. Proc Natl Acad Sci USA 102:1986–91.
Roy SW, Gilbert W. 2005b. Rates of intron loss and gain: implications for early eukaryotic evolution. Proc Natl Acad Sci USA 102:5773–8.
Roy SW, Gilbert W. 2006. The evolution of spliceosomal introns: patterns, puzzles and progress. Nat Rev Genet 7:211–21.
Roy SW, Hartl DL. 2006. Very little intron loss/gain in Plasmodium: intron loss/gain mutation rates and intron number. Genome Res 16:750–6.
Seo H-C, Kube M, Edvardsen RB, et al. (11 co-authors). 2001. Miniature genome in the marine chordate Oikopleura dioica. Science 294:2506.
Stajich JE, Dietrich FS. 2006. Evidence of mRNA-mediated intron loss in the human-pathogenic fungus Cryptococcus neoformans. Eukaryotic Cell 5:789–93.
Tanabe K, Sakihama N, Hattori T, Ranford-Cartwright L, Goldman I, Escalante AA, Lal AA. 2004. Genetic distance in housekeeping genes between Plasmodium falciparum and Plasmodium reichenowi and within P. falciparum. J Mol Evol 59:687–94.