

MBE Advance Access originally published online on March 1, 2007
Molecular Biology and Evolution 2007 24(5):1093-1096; doi:10.1093/molbev/msm037

© The Author 2007. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org

Letters

The Evolution of Spliceosomal Introns in Alveolates

Hung D. Nguyen, Maki Yoshihama and Naoya Kenmochi

Frontier Science Research Center, University of Miyazaki, Kiyotake, Miyazaki, Japan

E-mail: kenmochi@med.miyazaki-u.ac.jp


    Abstract
Many issues concerning the evolution of spliceosomal introns remain poorly understood. In this respect, reconstructing the evolution of introns in deep-branching species such as alveolates is of special significance. In this study, we inferred intron evolution in alveolates using 3,368 intron positions in 162 orthologs from 10 species (9 alveolates and 1 outgroup, Homo sapiens). We found that although very few intron gains and losses have occurred recently in Theileria and Plasmodium, many have occurred over the course of alveolate evolution. Thus, the rates of intron gain and loss in alveolates have varied greatly across time and lineage. Our results seem to support the notion that massive intron gains and losses have occurred during short episodes, perhaps coinciding with major evolutionary events.

Key Words: intron evolution • intron gain • intron loss • spliceosomal intron • alveolate evolution • apicomplexan evolution


    Introduction
Eukaryotic genes are often interrupted by extra DNA sequences, which must be spliced out of the pre-mRNA by the spliceosome. These extra sequences are called spliceosomal introns (or introns for short) as opposed to exons, the parts of genes that encode proteins. Many issues concerning the evolution of spliceosomal introns, for example, the intron density of the last eukaryotic common ancestor, the evolutionary forces underlying intron evolution, and the evolutionary significance of introns, remain poorly understood. Reconstruction of intron evolution in deep branching species, such as in the alveolate kingdom, is of special interest for the clarification of these issues.

Most studies about intron evolution have so far been limited to the crown group (i.e., containing metazoans, fungi, and plants) (Rogozin et al. 2003; Roy et al. 2003; Nielsen et al. 2004; Qiu et al. 2004; Yoshihama et al. 2007). Two recent studies focused on alveolates, but only at a relatively late stage (Roy and Hartl 2006; Roy and Penny 2006). These studies showed that Plasmodium and Theileria underwent very low rates of intron gain and loss during the last ~100 Myr. The evolution of introns during the early stage of these 2 lineages as well as in other alveolates remains unknown.

This study aims to reconstruct the dynamics of intron gain and loss throughout alveolate evolution. We compiled 162 gene orthologs from 9 alveolates and an outgroup, Homo sapiens. The 9 alveolates are Tetrahymena thermophila, Perkinsus marinus, Cryptosporidium parvum, Toxoplasma gondii, Babesia bigemina, Theileria parva, Theileria annulata, Plasmodium falciparum, and Plasmodium vivax. Figure 1 shows the evolution of 3,368 intron positions in 162,939 bp of alignment regions. The intron density in the last common ancestor of alveolates is about one third of that in humans and is roughly the same as that in T. thermophila. Since then, the evolutionary paths leading to P. marinus, T. gondii, and Theileria were rich in intron gains, whereas those leading to C. parvum, B. bigemina, and Plasmodium were rich in intron losses. Overall, intron losses slightly outnumbered intron gains (2,766 vs. 2,242).


 
FIG. 1.— The evolution of spliceosomal introns in alveolates. Numbers of introns present in modern species (known) are in normal mode. Numbers of introns present in ancestors (estimated) are in italics. Numbers of gains and losses (estimated) are in bold black font and bold gray font, respectively. Branches that experienced >1.5 gains per loss are shown in bold black lines and those that experienced >1.5 losses per gain are in bold gray lines.
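The line-style rule in the caption amounts to a simple threshold test on each branch. The following is a minimal sketch of that rule, not the authors' code: the function name `classify_branch` and the labels are hypothetical, and the treatment of branches with zero gains or zero losses is an assumption, since the caption does not specify it.

```python
def classify_branch(gains: float, losses: float, threshold: float = 1.5) -> str:
    """Label a branch by its estimated numbers of intron gains and losses.

    A branch is "gain-rich" if it has more than `threshold` gains per loss,
    "loss-rich" if it has more than `threshold` losses per gain, and
    "balanced" otherwise. Comparing products instead of ratios avoids
    division by zero (assumption: a branch with gains but no losses
    counts as gain-rich, and vice versa).
    """
    if gains > threshold * losses:
        return "gain-rich"
    if losses > threshold * gains:
        return "loss-rich"
    return "balanced"
```

Applied to the tree-wide totals reported in the text (2,242 gains and 2,766 losses), the rule returns "balanced", because the loss-to-gain ratio is below 1.5.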

 
Introns are distributed differently within genes depending on the intron richness of their genomes. In intron-poor genomes, introns are often located within the 5' portions of genes, whereas they are evenly distributed in intron-rich genomes. The mRNA-mediated intron loss has been suggested to be the cause of this pattern (Mourier and Jeffares 2003). Sverdlov et al. (2004) have found that introns are preferentially acquired in the 3' portions of genes in the 2 intron-rich species, H. sapiens and Arabidopsis thaliana, suggesting that intron gain might also occur via a reverse-transcription mechanism. Our analysis of the evolution of intron position distribution is also consistent with these results: Intron gains in intron-rich species (e.g., P. marinus) and intron losses in intron-poor species (e.g., C. parvum and B. bigemina) occur preferentially within the 3' portions of their genes (Supplementary Material 1, Supplementary Material online).

On the one hand, our results show that very few intron gains and losses have occurred recently in Plasmodium and Theileria. Out of the 3,368 intron positions we investigated, we found no intron gains or losses in either P. vivax or P. falciparum since they diverged from each other. Similarly, we found only one intron gain and one intron loss in T. parva and no intron gains or losses in T. annulata since they diverged from each other. These results are consistent with previous results obtained using a greater number of gene orthologs but fewer species (Roy and Hartl 2006; Roy and Penny 2006), suggesting that trends of intron gain and loss can be revealed by using a moderate number of gene orthologs.

On the other hand, our results also show that many intron gains and losses occurred at early periods during the evolutionary paths leading to Plasmodium and Theileria. For example, many intron gains occurred during the period from the divergence of P. marinus to the apicomplexan ancestor, whereas many intron losses occurred during the following period. In addition, the evolutionary path leading to P. marinus was very rich in intron gains, whereas the evolutionary path leading to C. parvum was very rich in intron losses. Thus, the rates of intron gain and loss in alveolates have varied greatly across time and lineage (fig. 2 and Supplementary Material 2, Supplementary Material online).


 
FIG. 2.— The variations in the rates of intron gain and loss across time and lineage. Rates of intron gain and loss in the evolution of 3 species: Theileria parva, Perkinsus marinus, and Cryptosporidium parvum are shown. P. marinus has a high rate of intron gain but a low rate of intron loss, whereas C. parvum shows the opposite trend. Rates of intron gain and loss in the evolution of T. parva have varied considerably across time. The circles mark the divergent points of P. marinus and C. parvum.

 
Although other considerations, such as saturation of sequence changes and the choice of amino acid substitution model, may affect the estimated times of divergence (Roger and Hug 2006), we believe that they do not significantly affect the conclusions above, for 2 main reasons. First, the use of a large number of sites (~32,000) taken from highly conserved sequence alignments weakens the potential impact of these considerations. Second, the variations in estimated times of divergence are quite small (often less than 2-fold; Roger and Hug 2006) compared with the variations in intron gain and loss rates (several orders of magnitude).

Our method (Nguyen et al. 2005) inferred that 269 (22%) of the 1,217 introns in H. sapiens are shared with the 471 introns of the alveolate ancestor. Thus, using parsimony, at least 22% of the introns in H. sapiens must date back to the last common ancestor of H. sapiens and alveolates. Assuming an intron-free ancestral genome at the time of mitochondrial endosymbiosis, one or several large-scale intron gain events must have occurred during the period between the time of endosymbiosis and the emergence of the last common ancestor of metazoans and alveolates. One such event may have happened right after endosymbiosis and may have been the cause of nucleus–cytosol compartmentalization, a major step in the emergence of the first eukaryote (Koonin 2006; Martin and Koonin 2006). Because none of the sequenced genomes that diverged before the last common ancestor of metazoans and alveolates has so far been shown to be intron rich, other large-scale intron insertion events may have also occurred during the period between the last eukaryotic common ancestor and the last common ancestor of metazoans and alveolates.

The study of intron gains and losses in paralogous gene families from the crown group has suggested that large-scale intron gains and losses may have occurred during transitional evolutionary epochs (Babenko et al. 2004). In this study, we have shown further that the evolution of introns in an early branching kingdom, the alveolates, also follows this pattern. The large-scale episodic occurrence of intron gain and loss can be accounted for by 2 models. The neutralist model proposes that transitional evolutionary episodes were often associated with population bottlenecks, which would have weakened purifying selection and facilitated the fixation of otherwise deleterious mutations, such as intron gains and losses (Lynch and Conery 2003; Babenko et al. 2004). Alternatively, large-scale intron gains and losses may be associated with occasional invasions of transposable elements, whose reverse transcriptases may have determined the rates of intron gain and loss (Roy and Hartl 2006). The rigorous testing of these models will be the subject of future research.


    Methods
Compilation of the Data Set
Supplementary Material 3 (Supplementary Material online) shows the sources of the genome sequences and annotations of the 10 species studied. We used the same method for collecting gene orthologs as published elsewhere (Nguyen et al. 2006; Yoshihama et al. 2006). We first extracted protein sequences of H. sapiens from the 684-ortholog data set (Rogozin et al. 2003). Next, for each of the sequences in H. sapiens, we searched for the reciprocal best hit sequences in T. thermophila, C. parvum, T. gondii, T. annulata, T. parva, P. falciparum, and P. vivax using BLAST (Altschul et al. 1997). The gene structures of the above sequences were constructed based on the annotations. The 684 orthologs from the 8 annotated species were then aligned using MUSCLE (Edgar 2004), and the multiple sequence alignments were manually examined to choose those that were well conserved. This yielded 190 orthologs. Based on their alignments, these orthologs were then manually curated to reduce the effect of annotation mistakes.
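The reciprocal-best-hit criterion used in this step can be illustrated with a short sketch. It is not the pipeline used in the study: the function names are hypothetical, and the input is assumed to be a list of (query, subject, score) tuples already parsed from BLAST output, with a higher score meaning a better hit.

```python
from typing import Dict, Iterable, Tuple

Hit = Tuple[str, str, float]  # (query id, subject id, score); assumed input format


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Return the highest-scoring subject for each query."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward: Iterable[Hit],
                         reverse: Iterable[Hit]) -> Dict[str, str]:
    """Keep pairs (a, b) where b is a's best hit and a is b's best hit.

    `forward` holds hits of species-A queries against species B;
    `reverse` holds hits of species-B queries against species A.
    """
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return {a: b for a, b in fwd.items() if rev.get(b) == a}
```

In this sketch a human protein is paired with a sequence from another species only when each is the other's top-scoring hit; ties are resolved in favor of the first hit encountered, which is an arbitrary choice.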

Because there were no annotations for P. marinus and B. bigemina, the protein sequences of T. annulata were used as inputs for the BLAST program (Altschul et al. 1997) to find the DNA regions in these 2 genomes that best matched those of T. annulata. The regions were then extracted and their gene structures were manually constructed based on sequence similarity with T. annulata and other species. In the end, we were able to obtain 162 (full or partial) orthologs from all 10 species (Supplementary Materials 4 and 5, Supplementary Material online).

Construction of Phylogenetic Trees and Intron Evolution
Multiple sequence alignments of each of these orthologs were built using MUSCLE (Edgar 2004). An ad hoc program was written in C to map the intron positions on these alignments and extract an intron presence/absence matrix of all intron positions in the conserved regions. We also examined each alignment manually to add other intron positions, which were not in the conserved regions and clearly not misaligned, to the intron presence/absence matrix (Supplementary Material 6, Supplementary Material online). All conserved regions having a length ≥ 50 amino acids were extracted and concatenated together, and the SEQBOOT (100 replicates), PROTDIST, NEIGHBOR, and CONSENSE programs of the PHYLIP package (Felsenstein and Churchill 1996) were used to build a phylogenetic tree for the species (Supplementary Material 7, Supplementary Material online). Finally, the intron presence/absence matrix and the phylogenetic tree (with H. sapiens as the outgroup) were used as inputs for our maximum likelihood method (Nguyen et al. 2005) to infer intron evolution.
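The mapping of intron positions onto an alignment can be sketched as follows. This is a simplified Python illustration, not the C program mentioned above: the function names and input format are hypothetical, each intron is assumed to be given as a (residue index, phase) pair in ungapped protein coordinates, and the sketch neither restricts sites to conserved regions nor distinguishes missing data in partial orthologs from true absence.

```python
from typing import Dict, List, Tuple

Site = Tuple[int, int]  # (alignment column, intron phase)


def residue_to_column(aligned_seq: str, gap: str = "-") -> List[int]:
    """Map each ungapped residue index to its alignment column."""
    return [col for col, ch in enumerate(aligned_seq) if ch != gap]


def presence_absence(alignment: Dict[str, str],
                     introns: Dict[str, List[Tuple[int, int]]]):
    """Build an intron presence/absence matrix for one ortholog.

    alignment: species -> aligned protein sequence (equal lengths).
    introns:   species -> list of (0-based residue index, phase).
    Returns (species, sites, matrix), where matrix[i][j] is 1 if
    species[j] has an intron at sites[i] and 0 otherwise.
    """
    species = sorted(alignment)
    mapped: Dict[str, set] = {}
    for sp in species:
        cols = residue_to_column(alignment[sp])
        mapped[sp] = {(cols[res], phase) for res, phase in introns.get(sp, [])}
    sites: List[Site] = sorted(set().union(*mapped.values()))
    matrix = [[1 if site in mapped[sp] else 0 for sp in species]
              for site in sites]
    return species, sites, matrix
```

Two introns are treated as occupying the same position only if they map to the same alignment column with the same phase, which is one common convention and is assumed here.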

Estimation of the Times of Divergence
A simple algorithm was used to construct a linearized tree from the phylogenetic tree. Let h_X be the height of node X and l_XY be the length of branch XY. The tree is traversed in postorder, so that at each internal node T the heights of its 2 child nodes, U and V, are already known. The algorithm first computes h_1 = h_U + l_TU and h_2 = h_V + l_TV. If h_1 ≥ h_2, h_T is assigned the value h_1 and the heights of all nodes in the subtree rooted at V are scaled up by the ratio h_1/h_2. Otherwise, the roles of U and V are exchanged when computing h_T. The advantage of our algorithm over the one proposed by Takezaki et al. (1995) is that none of the branches on the linearized tree will have a length of zero. Finally, we calibrated the times of divergence using the assumption that the ciliate T. thermophila branched off the tree ~800 MYA (Douzery et al. 2004).
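The linearization procedure described above can be written as a short recursive routine. The sketch below follows the description in the text but is not the authors' implementation: the `Node` class and all function names are hypothetical, the tree is assumed to be strictly binary with tips at height 0, and the guard for h_2 = 0 is an added assumption to avoid division by zero.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    length: float = 0.0                      # branch length to the parent
    children: List["Node"] = field(default_factory=list)
    height: float = 0.0                      # set by linearize()


def scale_subtree(node: Node, ratio: float) -> None:
    """Multiply the height of every node in the subtree by ratio."""
    node.height *= ratio
    for child in node.children:
        scale_subtree(child, ratio)


def linearize(node: Node) -> float:
    """Postorder pass that assigns clock-like heights (tips at height 0)."""
    if not node.children:
        node.height = 0.0
        return node.height
    u, v = node.children                     # assumes a strictly binary tree
    linearize(u)
    linearize(v)
    h1 = u.height + u.length
    h2 = v.height + v.length
    if h1 < h2:                              # exchange the roles of U and V
        u, v = v, u
        h1, h2 = h2, h1
    node.height = h1
    if h2 > 0:                               # guard: nothing to scale if h2 == 0
        scale_subtree(v, h1 / h2)
    return node.height


def update_lengths(node: Node) -> None:
    """Rewrite branch lengths so that they agree with the new heights."""
    for child in node.children:
        child.length = node.height - child.height
        update_lengths(child)


def calibrate(root: Node, reference: Node, age: float) -> None:
    """Rescale all heights so that `reference` has the given age."""
    scale_subtree(root, age / reference.height)
    update_lengths(root)
```

After `linearize(root)` and `update_lengths(root)`, every root-to-tip path has the same length. Calling `calibrate` with the node at which T. thermophila branches off and an age of 800 would then convert heights to divergence times in Myr under the calibration stated in the text.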


    Supplementary Materials
Supplementary Materials are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).


    Acknowledgements
The authors would like to thank Dr Tetsuo Hashimoto for useful comments. This study was supported by Grants-in-Aid for Scientific Research (14035103, 15310135, 188093, and 17770207) from the Ministry of Education, Culture, Sports, Science and Technology and the Japan Society for the Promotion of Science (JSPS). H.D.N. is a research fellow of the JSPS (17-05174). Preliminary sequence data of Toxoplasma gondii, Plasmodium vivax, Perkinsus marinus, and Babesia bigemina were obtained from ToxoDB (http://www.toxoDB.org), PlasmoDB (http://www.plasmoDB.org), the Institute for Genomic Research (http://www.tigr.org/tdb/e2k1/pmg/), and the Wellcome Trust Sanger Institute (http://www.sanger.ac.uk/Projects/B_bigemina/), respectively. Sequencing of T. gondii, P. marinus, and B. bigemina is funded by the National Institute of Allergy and Infectious Diseases, the National Science Foundation, and the Wellcome Trust Sanger Institute, respectively. Sequencing of P. vivax is funded by the National Institute of Allergy and Infectious Diseases, the US Department of Defense, and the Burroughs Wellcome Fund.


    Footnotes
 
Takashi Gojobori, Associate Editor


    References

    Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res (1997) 25:3389–3402.

    Babenko VN, Rogozin IB, Mekhedov SL, Koonin EV. Prevalence of intron gain over intron loss in the evolution of paralogous gene families. Nucleic Acids Res (2004) 32:3724–3733.

    Douzery EJP, Snell EA, Bapteste E, Delsuc F, Philippe H. The timing of eukaryotic evolution: does a relaxed molecular clock reconcile proteins and fossils? Proc Natl Acad Sci USA (2004) 101:15386–15391.

    Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res (2004) 32:1792–1797.

    Felsenstein J, Churchill GA. A hidden Markov model approach to variation among sites in rate of evolution. Mol Biol Evol (1996) 13:93–104.

    Koonin EV. The origin of introns and their role in eukaryogenesis: a compromise solution to the introns-early versus introns-late debate? Biol Direct (2006) 1:22.

    Lynch M, Conery JS. The origin of genome complexity. Science (2003) 302:1401–1404.

    Martin W, Koonin EV. Introns and the origin of nucleus-cytosol compartmentalization. Nature (2006) 440:41–45.

    Mourier T, Jeffares DC. Eukaryotic intron loss. Science (2003) 300:1393.

    Nguyen HD, Yoshihama M, Kenmochi N. New maximum likelihood estimators for eukaryotic intron evolution. PLoS Comput Biol (2005) 1:e79.

    Nguyen HD, Yoshihama M, Kenmochi N. Phase distribution of spliceosomal introns: implications for intron origin. BMC Evol Biol (2006) 6:69.

    Nielsen CB, Friedman B, Birren B, Burge CB, Galagan JE. Patterns of intron gain and loss in fungi. PLoS Biol (2004) 2:e422.

    Qiu WG, Schisler N, Stoltzfus A. The evolutionary gain of spliceosomal introns: sequence and phase preferences. Mol Biol Evol (2004) 21:1252–1263.

    Roger AJ, Hug LA. The origin and diversification of eukaryotes: problems with molecular phylogenetics and molecular clock estimation. Philos Trans R Soc Lond B Biol Sci (2006) 361:1039–1054.

    Rogozin IB, Wolf YI, Sorokin AV, Mirkin BG, Koonin EV. Remarkable interkingdom conservation of intron positions and massive, lineage-specific intron loss and gain in eukaryotic evolution. Curr Biol (2003) 13:1512–1517.

    Roy SW, Fedorov A, Gilbert W. Large-scale comparison of intron positions in mammalian genes shows intron loss but no gain. Proc Natl Acad Sci USA (2003) 100:7158–7162.

    Roy SW, Hartl DL. Very little intron loss/gain in Plasmodium: intron loss/gain mutation rates and intron number. Genome Res (2006) 16:750–756.

    Roy SW, Penny D. Large-scale intron conservation and order-of-magnitude variation in intron loss/gain rates in apicomplexan evolution. Genome Res (2006) 16:1270–1275.

    Sverdlov AV, Babenko VN, Rogozin IB, Koonin EV. Preferential loss and gain of intron in 3' portions of genes suggests a reverse-transcription mechanism on intron insertion. Gene (2004) 338:85–91.

    Takezaki N, Rzhetsky A, Nei M. Phylogenetic test of the molecular clock and linearized trees. Mol Biol Evol (1995) 12:823–833.

    Yoshihama M, Nakao A, Nguyen HD, Kenmochi N. Analysis of ribosomal protein gene structures: implications for intron evolution. PLoS Genet (2006) 2:e25.

    Yoshihama M, Nguyen HD, Kenmochi N. Intron dynamics in ribosomal protein genes. PLoS ONE (2007) 2:e141.

Accepted for publication February 26, 2007.




This article has been cited by other articles:

A. R. Omilian, D. G. Scofield, and M. Lynch
Intron Presence-Absence Polymorphisms in Daphnia
Mol. Biol. Evol., October 1, 2008; 25(10): 2129 - 2139.

S. W. Roy and M. Irimia
Origins of Human Malaria: Rare Genomic Changes and Full Mitochondrial Genomes Confirm the Relationship of Plasmodium falciparum to Other Mammalian Parasites but Complicate the Origins of Plasmodium vivax
Mol. Biol. Evol., June 1, 2008; 25(6): 1192 - 1198.

M. Csuros, I. B. Rogozin, and E. V. Koonin
Extremely Intron-Rich Genes in the Alveolate Ancestors Inferred with a Flexible Maximum-Likelihood Approach
Mol. Biol. Evol., May 1, 2008; 25(5): 903 - 911.

S. W. Roy and D. Penny
Widespread Intron Loss Suggests Retrotransposon Activity in Ancient Apicomplexans
Mol. Biol. Evol., September 1, 2007; 24(9): 1926 - 1933.

