MBE Advance Access originally published online on February 21, 2008
Molecular Biology and Evolution 2008 25(5):821-830; doi:10.1093/molbev/msn013
Research Articles
Near Intron Positions Are Reliable Phylogenetic Markers: An Application to Holometabolous Insects

* Department of Genetics, Institute of Biology II, University of Leipzig, Leipzig, Germany
Bioinformatics Group, Department of Computer Science, University of Leipzig, Leipzig, Germany
Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany; RNomics Group, Fraunhofer Institute for Cell Therapy and Immunology, Leipzig; Institute for Theoretical Chemistry, University of Vienna, Wien, Austria; Santa Fe Institute, Santa Fe, NM
E-mail: krauss{at}rz.uni-leipzig.de.
| Abstract |
|---|
Today, the reconstruction of the organismal evolutionary tree is based mainly on molecular sequence data. However, sequence data are sometimes insufficient to reliably resolve deep branches in particular. Thus, it is highly desirable to find novel, more reliable types of phylogenetic markers that can be derived from the wealth of genomic data. Here, we consider the gain of introns close to older preexisting ones. Because correct splicing is impeded by very small exons, nearby pairs of introns very rarely coexist, that is, the gain of the new intron is nearly always associated with the loss of the old intron. Both events may even be directly connected, as in cases of intron migration. Therefore, it should be possible to identify one of the introns as ancient (plesiomorphic) and the other as novel (derived or apomorphic). To test the suitability of such near intron pairs (NIPs) as a marker class for phylogenetic analysis, we undertook an analysis of the evolutionary positions of bees and wasps (Hymenoptera) and beetles (Coleoptera) in relation to moths (Lepidoptera) and dipterans (Diptera) using recently completed genome project data. By scanning 758 putatively orthologous gene structures of Apis mellifera (Hymenoptera) and Tribolium castaneum (Coleoptera), we identified 189 pairs of introns, one from each species, which are located less than 50 nt from each other. A comparison with genes from 5 other holometabolan and 9 metazoan outgroup genomes resulted in 22 shared derived intron positions found in the beetle as well as in butterflies and/or dipterans. This strongly supports a basal position of hymenopterans in the holometabolous insect tree. In addition, we found 31 and 12 intron positions apomorphic for A. mellifera and T. castaneum, respectively, which seem to represent changes inside these branches. Another 12 intron pairs indicate parallel intron gains or extraordinarily small exons.
In conclusion, we show here that the analysis of phylogenetically nested, nearby intron pairs is suitable to identify evolutionarily younger intron positions and to determine their relative age, which should be of equal importance for the understanding of intron evolution and the reconstruction of the eukaryotic tree.
Key Words: intron evolution; molecular phylogenetics; near intron pair (NIP); intron gain; insect phylogeny
| Introduction |
|---|
The overwhelming part of the literature in molecular phylogenetics is based upon the analysis of nucleic acid and amino acid sequences. Despite decades of effort by molecular systematists, evolutionary trees of eukaryotes still remain partly unresolved or inconsistent with each other (for review, see Roger and Hug 2006).
Therefore, so-called genome-level characters (for review, see Boore 2006) may have great potential for resolving crucial relationships for which no other data seem to be promising. The best available example for such characters appears to be the insertion of transposable elements in mammalian introns. Recently, work based on this character type has precisely resolved the relationships of placental mammals (Kriegs et al. 2006; Nishihara et al. 2006). In relation to sequence analysis, the signal-to-noise ratio of these analyses is outstanding. Boore (2006) summarized that only one out of 128 analyzed mammalian retrotranspositions was found to be homoplastic. However, among ecdysozoans, a similar transposon insertion analysis will be severely impeded by the longer time of divergence and the generally higher sequence substitution rate.
Spliceosomal intron positions had been introduced as another promising class of phylogenetic markers for robustly resolving divergence (Venkatesh et al. 1999; Rokas and Holland 2000; Nguyen et al. 2005; Roy and Gilbert 2005; Zheng et al. 2007). Introns have a slow rate of insertion and loss and evolve largely unaffected by the coding sequence (CDS) (Lynch and Richardson 2002; Roy et al. 2003; Yandell et al. 2006). However, some other studies (e.g., Cho et al. 2004; Krauss et al. 2005) found that identical introns were frequently lost independently in separate evolutionary lineages. On the other hand, recent analyses (Sverdlov et al. 2005; Yoshihama et al. 2006) estimated that parallel intron gains at orthologous positions in different evolutionary lineages account for 1.3–3.0% of all intron positions. Under these conditions, methods to improve the reliability of intron markers are highly desirable (Krzywinski and Besansky 2002; Wada et al. 2002).
Therefore, we sought to exclude homoplastic (parallel, reverse, or convergent) changes of intron positions from analysis efficiently. During our study of eIF2γ intron evolution (Krauss et al. 2005), we documented several cases of successive losses and gains of introns at only slightly different positions. When mapped onto the gene tree, this results in a phylogenetically nested distribution of evolutionarily newer introns (fig. 1). Therefore, we introduce the term "near intron pair" for 2 introns which exist in orthologous genes of different genomes at nearby locations less than 50 nt away from each other. Normally, 2 introns cannot coexist that close in 1 gene because exons smaller than about 50 nt are relatively rare (Saeys et al. 2007) and functionally disadvantageous (Hwang and Cohen 1997; Carlo et al. 2000). This is also consistent with a study by Lynch and Kewalramani (2003), which shows that exon sizes are more uniform than expected under a random insertion model. Therefore, changes of intron positions over distances of less than 50 nt should represent reliable synapomorphic character states.
Here, as a proof of concept, we have analyzed the relationships of major holometabolous insect groups. Accounting for more than 50% of all animal species (Kristensen 1999
| Materials and Methods |
|---|
Compilation of the Data Set
We downloaded all 7180 predicted protein sequences of the Apis mellifera genome (Build 2.1, Honeybee Genome Sequencing Consortium 2006
Next, the CDS of all Apis proteins for which we identified at least one such corresponding human gene were downloaded. Again, genes were discarded if the Apis CDS contained no intron. At the end of this selection, we obtained 758 putatively orthologous groups each consisting of 1 gene of A. mellifera, 1 gene of T. castaneum, and at least 1 gene of Homo sapiens. Occasional gene duplications in the human lineage are supposed to have occurred after the divergence of the deuterostomian from the protostomian lineage (Holland 2003; Putnam et al. 2007), which allows more than one orthologous gene in vertebrates.
Subsequently, annotated versions of the identified Tribolium gene sequences were downloaded or, alternatively, generated by manual alignment of open reading frame translations to the Apis protein sequence. To find the exact position of a putative Tribolium intron, which corresponds to alignment gaps in Apis larger than 42 nt, we used the similarity of the coded amino acids together with 5' and 3' splice position weight matrices for U2-type introns of Drosophila (Sheth et al. 2006). These manual alignments and annotations were done using the program MacVector 7.2 (Accelrys, San Diego, CA). The resulting gene structures remained sometimes incomplete because Tribolium intron positions located in unalignable regions of Apis could not be determined. Identified intron positions of Apis and Tribolium were named according to the homologous Apis triplet and the position inside this triplet (e.g., 127-2) and sampled. We assumed that all introns with identical positions in putatively orthologous genes are homologous to each other, although the corresponding sequences were generally too diverged to be aligned. The orthology of each identified Tribolium gene structure containing a near intron to an intron of Apis (see below) was tested by a reciprocal Blast search (BlastX) to the Apis protein set (Tatusov et al. 1997).
Analysis of NIPs
Based on the sampled intron positions of Apis and Tribolium, we plotted the lengths of internal exons, that is, of exons residing between 2 introns inside the CDS. In human cells, the critical exon size seems to be 50 nt, suggested by the finding that the few smaller exons have developed specific mechanisms to increase their inclusion into the mRNA (Hwang and Cohen 1997; Carlo et al. 2000). A similar critical size was revealed by our data: only above a size of 50 nt were small exons more abundant than neighboring intron pairs (each consisting of an Apis-specific intron and a Tribolium-specific intron) (fig. 2B). Accordingly, it is likely that many of these intron pairs with a distance greater than 50 nt have evolved by the differential loss of introns bordering former, smaller exons and will be phylogenetically uninformative. Therefore, we limited our analysis to the 189 intron pairs whose positions differed by between 1 and 49 nt. In the following, these intron pairs will be named NIPs.
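The selection criterion for NIPs can be illustrated with a short Python sketch. It is not the authors' pipeline: the function names `to_nt` and `find_nips` and the example positions are hypothetical, and the conversion of a label such as 127-2 into a nucleotide coordinate assumes that the second number (0, 1, or 2) counts the nucleotides of the triplet that lie upstream of the intron, a convention the text does not spell out.

```python
from typing import List, Tuple

MAX_DISTANCE = 49  # NIP positions differ by between 1 and 49 nt (from the text)


def to_nt(label: str) -> int:
    """Convert a label like '127-2' (Apis triplet, offset) to an nt coordinate.

    Assumption: the offset (0, 1 or 2) counts nucleotides of the triplet
    that precede the intron.
    """
    triplet, offset = (int(part) for part in label.split("-"))
    if triplet < 1 or offset not in (0, 1, 2):
        raise ValueError(f"bad intron label: {label}")
    return (triplet - 1) * 3 + offset


def find_nips(apis: List[str], tribolium: List[str]) -> List[Tuple[str, str, int]]:
    """Return (Apis label, Tribolium label, distance) for species-specific near pairs."""
    shared = set(apis) & set(tribolium)  # identical positions count as homologous
    apis_only = [p for p in apis if p not in shared]
    trib_only = [p for p in tribolium if p not in shared]
    pairs = []
    for a in apis_only:
        for t in trib_only:
            distance = abs(to_nt(a) - to_nt(t))
            if 1 <= distance <= MAX_DISTANCE:
                pairs.append((a, t, distance))
    return pairs


# Hypothetical intron positions for one putatively orthologous gene
apis_introns = ["45-0", "127-2", "300-1"]
tribolium_introns = ["45-0", "131-0", "410-2"]
nips = find_nips(apis_introns, tribolium_introns)
print(nips)
```

In this made-up gene, the shared intron at 45-0 is ignored, and only the pair 127-2 / 131-0 falls inside the 1 to 49 nt window.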
For each NIP, we constructed an alignment of putatively orthologous metazoan proteins using BlastP search results (Altschul et al. 1997
Subsequently, the phylogenetic distribution of each intron of a pair was evaluated using TBlastN and manual annotation of the adjacent, putatively orthologous exons in the Arthropoda genomes of Drosophila melanogaster (or Drosophila pseudoobscura), Anopheles gambiae, Aedes aegypti, Bombyx mori, Nasonia vitripennis, Pediculus humanus, Acyrthosiphon pisum, and Daphnia pulex; the Deuterostomia genomes of Gallus gallus, Danio rerio, Ciona intestinalis (or Ciona savignyi), and Strongylocentrotus purpuratus (all at NCBI, http://www.ncbi.nlm.nih.gov); the Platyhelminthes genome of Schistosoma mansoni (Sanger Institute, http://www.sanger.ac.uk/Projects/S_mansoni); and the Cnidaria genome of Nematostella vectensis (Joint Genome Institute, http://genome.jgi-psf.org). For this purpose, we retrieved the as yet unassembled genome sequences of Acyrthosiphon and Daphnia from the NCBI trace site by discontiguous megablast and assembled the sequences of adjoining exons. For every species, we selected the hit containing the fragments with highest amino acid or nucleotide identity. To find the exact position of a putative intron, we used the similarity of the coded amino acids together with 5' and 3' splice position weight matrices for U2-type introns (Sheth et al. 2006). All genome fragments whose intron positions could not be associated with a corresponding Apis triplet were excluded from analysis. Finally, orthology of all obtained gene fragments was tested by the reciprocal best Blast hit method (Tatusov et al. 1997). If the corresponding BlastX comparison of a gene fragment to Apis proteins failed, it was discarded. Altogether, we excluded 54 of the initial 189 NIPs from further analyses because of insufficient sequence conservation or gene duplications.
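The reciprocal best hit idea used for the orthology test can be sketched as follows. This is a generic illustration in Python, not the procedure actually run for the paper: the score tables, the sequence names, and the helper names `best_hit` and `reciprocal_best_hits` are hypothetical, and real Blast output would first have to be parsed into such tables.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical score tables: query -> {subject: bit score}
Scores = Dict[str, Dict[str, float]]


def best_hit(scores: Scores, query: str) -> Optional[str]:
    """Return the highest-scoring subject for a query, or None without hits."""
    hits = scores.get(query)
    if not hits:
        return None
    return max(hits, key=hits.get)


def reciprocal_best_hits(forward: Scores, reverse: Scores) -> Set[Tuple[str, str]]:
    """Keep (a, b) only if b is the best hit of a and a is the best hit of b."""
    pairs = set()
    for a in forward:
        b = best_hit(forward, a)
        if b is not None and best_hit(reverse, b) == a:
            pairs.add((a, b))
    return pairs


# Made-up scores: Apis proteins searched against Tribolium, and back again
forward = {
    "apisA": {"tribX": 250.0, "tribY": 90.0},
    "apisB": {"tribY": 120.0},
}
reverse = {
    "tribX": {"apisA": 240.0},
    "tribY": {"apisA": 95.0, "apisB": 80.0},
}
print(reciprocal_best_hits(forward, reverse))
```

Here apisB is discarded because its best Tribolium hit points back to a different Apis protein, which mirrors the rule that a failed reverse comparison removes the fragment from the data set.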
Construction of Intron Distribution Matrices
From the 135 remaining intron pairs, an intron position matrix and a corresponding intron pair matrix were manually created and analyzed in MacClade 4.0 (Maddison DR and Maddison WP 2005) (supplementary material 1 and 2, Supplementary Material online). Within the intron position matrix each intron of a pair is coded as "1," each empty position as "0," and no data as "?". Within the intron pair matrix, the upstream intron is coded as "1" and the downstream intron as "2," whereas intronless pair positions and no data are coded as "?".
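The two coding schemes can be written down compactly. The Python sketch below is only an illustration of the coding rules stated above; the data layout, the names `position_codes` and `pair_code`, and the example species states are hypothetical, and the handling of a species that carries both introns of a pair is an assumption because the text does not describe that case.

```python
from typing import Dict, Optional, Tuple

# Observation for one species at one NIP:
#   (upstream_present, downstream_present), or None when no data exist.
Observation = Optional[Tuple[bool, bool]]


def position_codes(obs: Observation) -> Tuple[str, ...]:
    """Code each position as '1' (intron), '0' (empty) or '?' (no data)."""
    if obs is None:
        return ("?", "?")
    return tuple("1" if present else "0" for present in obs)


def pair_code(obs: Observation) -> str:
    """Code a pair as '1' (upstream intron), '2' (downstream intron) or '?'."""
    if obs is None or obs == (False, False):
        return "?"  # no data and intronless pair positions share one code
    if obs == (True, True):
        return "?"  # assumption: coexisting introns are not covered by the text
    return "1" if obs[0] else "2"


# Made-up states for a single NIP
observations: Dict[str, Observation] = {
    "Apis": (True, False),
    "Tribolium": (False, True),
    "Drosophila": (False, False),
    "Daphnia": None,
}
position_matrix = {sp: "".join(position_codes(o)) for sp, o in observations.items()}
pair_matrix = {sp: pair_code(o) for sp, o in observations.items()}
print(position_matrix)
print(pair_matrix)
```

The position matrix keeps absence ("0") distinct from missing data ("?"), whereas the pair matrix merges both, so the two matrices carry slightly different information.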
Automated Analysis
To control and complement the manual analysis described above, we performed the following computational tasks. The identified 135 NIPs, which were obtained from 118 different genes, were automatically evaluated by constructing multiple alignments of the extracted CDS. For this purpose, DIALIGN2 (Morgenstern et al. 2006) was used in translated mode. For some data sets, we also used transAlign (Bininda-Emonds 2005), which translates CDSs to amino acid sequences, creates a ClustalW (Chenna et al. 2003) protein alignment, and back-translates it to aligned DNA sequences. The procedure included a check for correct splice site dinucleotides (GT/AG and GC/AG) and an automatic adaptation of CDS annotations. The intron positions of the different sequences were determined according to the homologous Apis triplets using translated BLAT (Kent 2002). For all 135 NIPs, we found no differences between the results of the automatic and the manual analysis. At the sites of introns, the resulting CDS alignments were supplemented with the first 8 and the last 14 intronic nucleotides. These intron-containing nucleotide alignments were complemented with a schematic picture for each NIP (supplementary material 3, Supplementary Material online).
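Two of the automated steps, the splice site dinucleotide check and the extraction of the intron ends, are simple enough to sketch in Python. This is a minimal illustration and not the software used for the paper: the function names and the example intron sequence are hypothetical, and joining the two ends with "..." is a display choice made here.

```python
# Donor/acceptor dinucleotide pairs named in the text
CANONICAL = {("GT", "AG"), ("GC", "AG")}


def has_valid_splice_sites(intron: str) -> bool:
    """True if the intron starts and ends with an accepted dinucleotide pair."""
    seq = intron.upper()
    if len(seq) < 4:
        return False
    return (seq[:2], seq[-2:]) in CANONICAL


def intron_ends(intron: str, head: int = 8, tail: int = 14) -> str:
    """Return the first `head` and last `tail` intronic nucleotides."""
    seq = intron.upper()
    if len(seq) <= head + tail:
        return seq  # short intron: show it in full
    return seq[:head] + "..." + seq[-tail:]


# Made-up intron sequence (lower case, as often used for introns)
example = "gtaagtat" + "c" * 10 + "ttttctaatttcag"
print(has_valid_splice_sites(example))
print(intron_ends(example))
```

An intron with other terminal dinucleotides, for example AT/AC, would fail this check and would therefore prompt a second look at the CDS annotation.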
| Results |
|---|
Near Intron Positions with Distances Less than 50 nt Are More Abundant than Small Exons
Based on 2 NIPs in the eIF2γ gene, we had preliminarily suggested that Coleoptera and not Hymenoptera is the sister group of the Mecopterida, which contains the Diptera and the Lepidoptera (Krauss et al. 2005).
The distribution of exon sizes shows, for both species, a maximum between 120 and 220 nt, a long tail of relatively few large exons and a few small exons (fig. 2). Only 39 of 4798 Apis exons and 8 of 1863 Tribolium exons are smaller than 50 nt. This distribution caused us to search for phylogenetically informative intron position pairs whose positions differ from each other only by between 1 and 49 nt (Materials and Methods). This search resulted initially in 189 NIPs, each consisting of a specific Apis intron and a specific Tribolium intron. After confirming the corresponding intron positions using metazoan protein and arthropod EST sequences, we evaluated their phylogenetic distributions. This was done by the identification and annotation of the adjacent, putatively orthologous exons in 14 metazoan species (Materials and Methods). We used Drosophila, Anopheles, Aedes, and Bombyx as ingroup taxa; Nasonia as sister group to Apis; and Acyrthosiphon, Pediculus, Daphnia, Danio, Gallus, Ciona, Strongylocentrotus, Schistosoma, and Nematostella as outgroup. We did not include other vertebrate and Drosophila genomes because of the overwhelming similarity of occupied intron positions between the genomes used and those not used.
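How internal exon lengths follow from intron positions, and how rare small exons are according to the counts given above, can be shown with a few lines of Python. The helper names and the example coordinates are hypothetical; only the counts 39 of 4798 and 8 of 1863 come from the text.

```python
from typing import List


def internal_exon_lengths(intron_coords: List[int]) -> List[int]:
    """Lengths of exons lying between consecutive introns of one CDS."""
    ordered = sorted(intron_coords)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def fraction_below(lengths: List[int], cutoff: int = 50) -> float:
    """Fraction of exons shorter than `cutoff` nt (0.0 for an empty list)."""
    if not lengths:
        return 0.0
    return sum(1 for n in lengths if n < cutoff) / len(lengths)


# Made-up intron coordinates (nt from the start of the CDS)
lengths = internal_exon_lengths([135, 380, 412, 898])
print(lengths)

# Counts reported in the text
apis_small = 39 / 4798       # roughly 0.8% of Apis exons
tribolium_small = 8 / 1863   # roughly 0.4% of Tribolium exons
print(f"{apis_small:.2%} {tribolium_small:.2%}")
```

With fewer than 1% of exons below 50 nt in both species, two introns less than 50 nt apart in the same gene are expected to be unusual, which is the premise of the NIP approach.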
Near Intron Positions Are Information-Rich Phylogenetic Markers
For 135 out of 189 identified NIPs, we obtained a gapless alignment of putatively orthologous CDSs around the 2 intron positions in at least some of the outgroup and ingroup species. On average, genomic regions corresponding to the 135 NIPs could be identified in 97.2% of the ingroup and 84.9% of the outgroup species (supplementary material 4, Supplementary Material online). Only 26.1% (ingroup) and 54.9% (outgroup) of these orthologous genomic regions contained at least one intron out of the analyzed pair. The relatively low abundance of introns in the ingroup is due to the intron-poor genomes of the dipterans. For the analyzed intron pairs, an intron position matrix and a corresponding intron pair matrix were manually created and analyzed in MacClade 4.0 (Maddison DR and Maddison WP 2005). The collected intron data were mapped onto a tree (fig. 3B) according to the commonly accepted phylogeny of metazoans and the hypothesis that hymenopterans diverged at the base of radiation of holometabolous insects as supported by this analysis (see below) as well as previous ones (Krauss et al. 2005; Savard et al. 2006).
Using this tree, 102 out of 135 NIPs could be differentiated into a plesiomorphic (outgroup shared) and an apomorphic (derived) intron position relative to the separation of the evolutionary lineages of Apis and Tribolium (fig. 4). In all, 22 of these NIPs contain common derived (synapomorphic) introns of Tribolium with Diptera and/or Lepidoptera species (table 1). In contrast, no common derived intron of Apis with Diptera or Lepidoptera species could be found. This result supports both the suitability of NIP distributions for phylogenetic analysis and the existence of a monophyletic group consisting of Diptera + Lepidoptera + Coleoptera to the exclusion of the Hymenoptera.
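The logic of polarizing a NIP by outgroup comparison can be sketched in Python. This is a simplified reading of the procedure, not the authors' implementation: the function names, the status strings, and the example species sets are hypothetical, and the real analysis was carried out on a tree rather than on plain sets.

```python
from typing import Set

# Ingroup taxa named in the text
INGROUP = {"Drosophila", "Anopheles", "Aedes", "Bombyx"}


def polarize(apis_pos_outgroups: Set[str], trib_pos_outgroups: Set[str]) -> str:
    """Classify a NIP from the outgroup species carrying each intron position."""
    a, t = bool(apis_pos_outgroups), bool(trib_pos_outgroups)
    if a and t:
        return "inconsistent"  # both positions occur among the outgroups
    if a:
        return "Tribolium intron apomorphic"  # Apis position is outgroup shared
    if t:
        return "Apis intron apomorphic"  # Tribolium position is outgroup shared
    return "unresolved"  # neither position found in any outgroup


def supports_grouping(status: str, apomorphic_carriers: Set[str]) -> bool:
    """True if a derived Tribolium intron is shared with Diptera or Lepidoptera."""
    return status == "Tribolium intron apomorphic" and bool(INGROUP & apomorphic_carriers)


# Made-up example: the Apis position occurs in two outgroups, and the
# Tribolium position is shared with Drosophila.
status = polarize({"Daphnia", "Danio"}, set())
print(status, supports_grouping(status, {"Tribolium", "Drosophila"}))
```

Under this scheme, a synapomorphy is counted only when the derived intron of one species is also present in other ingroup taxa, which is the pattern reported for the 22 Tribolium introns.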
Sources of Homoplasy in NIP Characters
The quality (the information content) of a phylogenetic marker is mirrored by the fraction of homoplastic character distributions along the tree. Concerning NIPs, inconsistent distributions of intron positions along the tree are an easily detectable form of homoplasy which may be caused by 1) ancient small exons which have been fused independently with neighboring exons in different lineages through the loss of different bordering introns or by 2) combined intron loss and gain, occurring independently at the same positions, in different lineages. Indeed, whereas 102 NIPs appear phylogenetically informative (fig. 4), 12 NIPs revealed an inconsistent distribution, that is, both intron positions were found in at least one of the outgroup species (table 2). One of these inconsistencies (GI: 66546088) was surely caused by an ancient small exon, which was found, bordered by both introns of this NIP, in Nematostella. Three other cases also implicate an ancient small exon whose bordering introns have been lost differently in the analyzed lineages because otherwise the intron positions of those pairs would have had to change more than twice (GIs: 66512196, 66499842, and 48120807). However, in the remaining 8 cases, the inconsistency was caused by the occurrence of the putative apomorphic intron in only one of the outgroups. Therefore, we wondered whether the apparently apomorphic intron positions of these pairs might have been gained in parallel in holometabolous insects and in those outgroup lineages. We tested this possibility using additional intron positions, which were found during the sampling of NIPs less than 50 nt away from one or both introns of an analyzed pair in out- and ingroup species (supplementary material 3, Supplementary Material online).
We collected these independent NIPs in a separate presence/absence matrix (supplementary material 5, Supplementary Material online) and analyzed all intron position changes, which could be unambiguously attributed to one branch of the tree (fig. 5).
Our result confirmed earlier reports by Raible et al. (2005)
In summary, we found evidence for both suspected types of homoplasy. However, the amount of homoplasy is clearly too small to interfere with the phylogenetic analysis. It is important to add that both the reliability of such analyses and the numbers of inconsistent distributions critically depend on the number and kind of outgroups used. The usage of fewer outgroups would have resulted in fewer inconsistent distributions, but probably also in some contrary, seemingly synapomorphic evidence. In addition, outgroups with a relatively fast intron evolution (such as Ciona and Schistosoma) will boost the amount of homoplasy.
Interestingly, we detected no example for the opposite type of inconsistent intron distribution, that is, we never found both intron positions in at least one of the ingroup species. This might be due to the much smaller evolutionary divergence of the ingroup in relation to the outgroup lineages, which is represented by the sum of corresponding branch lengths (fig. 5).
| Discussion |
|---|
In this study, we show that the determination of a plesiomorphic/apomorphic NIP is suitable 1) to reveal intron gain events, 2) to assign a relative age to apomorphic (gained) introns, and 3) to use combined intron loss and gain events to evaluate phylogenetic hypotheses. Whereas intron losses are common evolutionary events, only relatively few intron gains have been convincingly demonstrated (Rodriguez-Trelles et al. 2006
Inside these novel introns, we have not found any significant sequence conservation to other introns, to transposons, or to exons (data not shown). Consequently, the origin of these introns could not be determined. It remains possible that at least some of these novel intron positions resulted from intron migration (intron sliding) and did not involve the insertion of novel introns. However, because of the gapless conservation of the CDS around the NIPs, this would require some convergent base substitutions and, thus, appears rather unlikely, especially for larger distances between the introns of one pair. Based on the structure of splice sites and available evidence, intron sliding processes might have occurred most probably at positions spaced by 1 (Rogozin et al. 2000) or 3 nt (Krauss et al. 2005; Hiller et al. 2007). Interestingly, whereas NIPs with 1 nt distance are not overrepresented in our data (4 of 189), NIPs with 3 nt distance appear significantly more abundant than those with other spacings (13 of 189, P = 0.0003 by Student's t-test). Thus, intron migration might be a pathway to the emergence of some NIPs; however, in our opinion, successive losses and gains of introns might have contributed more to the data. These hypothetical pathways to NIPs will be very difficult to dissect because of the high abundance of independent intron losses (see Roy and Penny 2007 and references therein), which renders the detection of an intronless NIP site, a potential intermediate of the second pathway, worthless.
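The size of the 3 nt excess can be made tangible with a back-of-the-envelope calculation. The Python sketch below does not reproduce the Student's t-test reported above; it swaps in a plain binomial tail under the assumption, made here only for illustration, that each of the 49 possible spacings (1 to 49 nt) is equally likely. The function name `binomial_tail` is hypothetical.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_pairs = 189        # NIPs found initially (from the text)
observed_3nt = 13    # NIPs with 3 nt spacing (from the text)
observed_1nt = 4     # NIPs with 1 nt spacing (from the text)
p_uniform = 1 / 49   # assumption: spacings of 1 to 49 nt are equally likely

expected = n_pairs * p_uniform  # about 3.9 pairs per spacing
p_3nt = binomial_tail(observed_3nt, n_pairs, p_uniform)
p_1nt = binomial_tail(observed_1nt, n_pairs, p_uniform)
print(f"expected per spacing: {expected:.2f}")
print(f"P(X >= {observed_3nt}) = {p_3nt:.2e}; P(X >= {observed_1nt}) = {p_1nt:.2f}")
```

Under this uniform model, 4 pairs at 1 nt is close to the expectation, whereas 13 pairs at 3 nt is several times higher, in line with the qualitative statement in the text.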
Consistent with other studies (Carmel et al. 2007 and references therein), we observed highly divergent rates of intron evolution. According to a hypothesis of evolution of genomic complexity (Lynch and Conery 2003), intron gain would have resulted in slightly deleterious alleles and thus may have occurred mainly during limited time spans, being driven by decreased population sizes. This is consistent with our data, if we compare 22 apomorphic intron positions that were gained in the ancestral line of Diptera, Lepidoptera, and Coleoptera (but not Hymenoptera) with only 12 intron positions, which emerged later in the evolutionary line of Tribolium (fig. 5). According to the fossil record (Grimaldi and Engel 2005), such a common ancestor may not have existed longer than about 40 Myr (from the start of the late Carboniferous to 285 MYA), whereas the beetle line is at least 280 Myr old. The identified numbers of lineage-specific intron changes are also significantly different from the protein divergence rates between holometabolous insects (Zdobnov and Bork 2007). Specifically, these authors found that the common branch of Diptera, Lepidoptera, and Coleoptera is about 20 times shorter than the Tribolium branch. In addition, we did not find any NIP-based differences between Apis and Nasonia, although the corresponding evolutionary lines separated at the latest 150 MYA (Grimaldi and Engel 2005). Taken together, this is a significant finding as it means that NIPs may offer phylogenetic information complementary to sequence analyses.
Based on altogether 24 synapomorphic NIP distributions from the current analysis and the preliminary investigation on the eIF2γ gene (Krauss et al. 2005), our study supports without any contrary evidence a more basal position of the hymenopterans in relation to the beetles within the tree of holometabolous insects. This result contradicts most former phylogenetic hypotheses (for review, see Beutel and Pohl 2006), but it is backed by paleontologists (Rohdendorf and Rasnitsyn 1980), morphologists (Ross 1965; Kukalová-Peck and Lawrence 2004), sequence analysis of ESTs (Savard et al. 2006), genomic sequences (Zdobnov and Bork 2007), and application of mixed DNA/RNA models to 18S data (Misof et al. 2007). Because this study is based exclusively on the analysis of intron position differences between Apis and Tribolium, we could not test the possibility of a basal position of Lepidoptera or Diptera inside the Holometabola. However, corresponding trees would be less parsimonious than our tree (fig. 5). Specifically, a Holometabola tree placing the Diptera, the Lepidoptera, or the Mecopterida (Diptera + Lepidoptera) in the basal position would result in 8, 20, or 22 additional inconsistent NIP distributions, respectively (table 1), and would resolve none of the 12 other inconsistent distributions (table 2). In addition, to our knowledge, corresponding hypotheses have never been proposed (see e.g., Beutel and Pohl 2006). It appears more important to expand the genome-scale studies of holometabolous insects to include species of smaller groups, such as the Strepsiptera, Neuropterida, and Mecoptera. Without genome projects for such species, it still remains possible to use the apomorphic introns determined in this study to resolve the phylogenetic position of these groups by sequencing the corresponding gene fragments.
Finally, although we set out only to resolve the Coleoptera–Hymenoptera–Mecopterida trifurcation, we found 19 apomorphies on other branches of the tree used (fig. 5). This points to a general usability of near intron positions as novel phylogenetic markers in metazoans and, hopefully, in all eukaryotes.
| Supplementary Material |
|---|
Supplementary materials 1–5 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
| Acknowledgements |
|---|
We gratefully acknowledge the sequencing of the yet unpublished genomes of Danio rerio, Strongylocentrotus purpuratus, Schistosoma mansoni, Daphnia pulex, Acyrthosiphon pisum, Pediculus humanus, Nasonia vitripennis, and Tribolium castaneum. We thank 3 anonymous reviewers for their insightful comments that helped to improve the manuscript. This work was supported by the Deutsche Forschungsgemeinschaft (KR2065/2-1 to V.K. and STA850/6-1 to P.F.S.). P.F.S. holds external affiliations with the Santa Fe Institute, the Institute of Theoretical Chemistry of the University of Vienna, and the Fraunhofer Institute for Cell Therapy and Immunology.
| Footnotes |
|---|
Barbara Ruth Holland, Associate Editor
| References |
|---|
Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res (1997) 25:3389–3402.
Berney C, Pawlowski J. A molecular time-scale for eukaryote evolution recalibrated with the continuous microfossil record. Proc Biol Sci (2006) 273:1867–1872.
Beutel RG, Pohl H. Endopterygote systematics – where do we stand and what is the goal (Hexapoda, Arthropoda)? Syst Entomol (2006) 31:202–219.
Bininda-Emonds ORP. transAlign: using amino acids to facilitate the multiple alignment of protein-coding DNA sequences. BMC Bioinformatics (2005) 6:156.
Boore JL. The use of genome-level characters for phylogenetic reconstruction. Trends Ecol Evol (2006) 21:439–446.
Carlo T, Sierra R, Berget SM. A 5' splice site-proximal enhancer binds SF1 and activates exon bridging of a microexon. Mol Cell Biol (2000) 20:3988–3995.
Carmel L, Wolf YI, Rogozin IB, Koonin EV. Three distinct modes of intron dynamics in the evolution of eukaryotes. Genome Res (2007) 17:1034–1044.
Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD. Multiple sequence alignment with the clustal series of programs. Nucleic Acids Res (2003) 31:3497–3500.
Cho S, Jin SW, Cohen A, Ellis RE. A phylogeny of Caenorhabditis reveals frequent loss of introns during nematode evolution. Genome Res (2004) 14:1207–1220.
Douzery EJ, Snell EA, Bapteste E, Delsuc F, Philippe H. The timing of eukaryotic evolution: does a relaxed molecular clock reconcile proteins and fossils? Proc Natl Acad Sci USA (2004) 101:15386–15391.
Gaunt MW, Miles MA. An insect molecular clock dates the origin of the insects and accords with palaeontological and biogeographic landmarks. Mol Biol Evol (2002) 19:748–761.
Grimaldi D, Engel MS. The evolution of the insects (2005) New York: Cambridge University Press.
Hiller M, Nikolajewa S, Huse K, Szafranski K, Rosenstiel P, Schuster S, Backofen R, Platzer M. TassDB: a database of alternative tandem splice sites. Nucleic Acids Res (2007) 35:D188–D192.
Holland PW. More genes in vertebrates? J Struct Funct Genomics (2003) 3:75–84.
Honeybee Genome Sequencing Consortium. Insights into social insects from the genome of the honeybee Apis mellifera. Nature (2006) 443:931–949.
Hwang DY, Cohen JB. U1 small nuclear RNA-promoted exon selection requires a minimal distance between the position of U1 binding and the 3' splice site across the exon. Mol Cell Biol (1997) 17:7099–7107.
Jeffroy O, Brinkmann H, Delsuc F, Philippe H. Phylogenomics: the beginning of incongruence? Trends Genet (2006) 22:225–231.
Kent WJ. BLAT–the BLAST-like alignment tool. Genome Res (2002) 12:656–664.
Krauss V, Pecyna M, Kurz K, Sass H. Phylogenetic mapping of intron positions: a case study of translation initiation factor eIF2γ. Mol Biol Evol (2005) 22:74–84.
Kriegs JO, Churakov G, Kiefmann M, Jordan U, Brosius J, Schmitz J. Retroposed elements as archives for the evolutionary history of placental mammals. PLoS Biol (2006) 4:e91.
Kristensen NP. Phylogeny of endopterygote insects, the most successful lineage of living organisms. Eur J Entomol (1999) 96:237–253.
Krzywinski J, Besansky NJ. Frequent intron loss in the white gene: a cautionary tale for phylogeneticists. Mol Biol Evol (2002) 19:362–366.
Kukalová-Peck J, Lawrence JF. Relationships among coleopteran suborders and major endoneopteran lineages: evidence from hind wing characters. Eur J Entomol (2004) 101:95–144.
Longhorn SJ, Foster PG, Vogler AP. The nematode–arthropod clade revisited: phylogenomic analyses from ribosomal protein genes misled by shared evolutionary biases. Cladistics (2007) 23:130–144.
Lynch M, Conery JS. The origins of genome complexity. Science (2003) 302:1401–1404.
Lynch M, Kewalramani A. Messenger RNA surveillance and the evolutionary proliferation of introns. Mol Biol Evol (2003) 20:563–571.
Lynch M, Richardson AO. The evolution of spliceosomal introns. Curr Opin Genet Dev (2002) 12:701–710.
Maddison DR, Maddison WP. MacClade 4.08 (2005) Sunderland (MA): Sinauer Associates.
Misof B, Niehuis O, Bischoff I, Rickert A, Erpenbeck D, Staniczek A. Towards an 18S phylogeny of hexapods: accounting for group-specific character covariance in optimized mixed nucleotide/doublet models. Zoology (Jena) (2007) 110:409–429.
Morgenstern B, Prohaska SJ, Pöhler D, Stadler PF. Multiple sequence alignment with user-defined anchor points. Algorithms Mol Biol (2006) 1:6.
Nguyen HD, Yoshihama M, Kenmochi N. New maximum likelihood estimators for eukaryotic intron evolution. PLoS Comput Biol (2005) 1:e79.
Nishihara H, Hasegawa M, Okada N. Pegasoferae, an unexpected mammalian clade revealed by tracking ancient retroposon insertions. Proc Natl Acad Sci USA (2006) 103:9929–9934.
Putnam NH, Srivastava M, Hellsten U. (19 co-authors). Sea anemone genome reveals ancestral eumetazoan gene repertoire and genomic organization. Science (2007) 317:86–94.
Raible F, Tessmar-Raible K, Osoegawa K. (12 co-authors). Vertebrate-type intron-rich genes in the marine annelid Platynereis dumerilii. Science (2005) 310:1325–1326.
Rodriguez-Trelles F, Tarrio R, Ayala FJ. Origins and evolution of spliceosomal introns. Annu Rev Genet (2006) 40:47–76.
Roger AJ, Hug LA. The origin and diversification of eukaryotes: problems with molecular phylogenetics and molecular clock estimation. Phil Trans R Soc B (2006) 361:1039–1054.
Rogozin IB, Lyons-Weiler J, Koonin EV. Intron sliding in conserved gene families. Trends Genet (2000) 16:430–432.
Rohdendorf BB, Rasnitsyn AP. Historical development of the class Insecta (1980) Moscow (Russia): Nauka Press.
Rokas A, Holland PWH. Rare genomic changes as a tool for phylogenetics. Trends Ecol Evol (2000) 15:454–459.
Ross HH. A textbook of entomology (1965) New York: Wiley.
Roy SW, Fedorov A, Gilbert W. Large-scale comparison of intron positions in mammalian genes shows intron loss but no gain. Proc Natl Acad Sci USA (2003) 100:7158–7162.
Roy SW, Gilbert W. Resolution of a deep animal divergence by the pattern of intron conservation. Proc Natl Acad Sci USA (2005) 102:4403–4408.
Roy SW, Penny D. Smoke without fire: most reported cases of intron gain in nematodes instead reflect intron losses. Mol Biol Evol (2006) 23:2259–2262.
Roy SW, Penny D. Patterns of intron loss and gain in plants: intron loss-dominated evolution and genome-wide comparison of O. sativa and A. thaliana. Mol Biol Evol (2007) 24:171–181.
Saeys Y, Rouze P, Van de Peer Y. In search of the small ones: improved prediction of short exons in vertebrates, plants, fungi and protists. Bioinformatics (2007) 23:414–420.
Savard J, Tautz D, Lercher MJ. Genome-wide acceleration of protein evolution in flies (Diptera). BMC Evol Biol (2006) 6:e7.
Savard J, Tautz D, Richards S, Weinstock GM, Gibbs RA, Werren JH, Tettelin H, Lercher MJ. Phylogenomic analysis reveals bees and wasps (Hymenoptera) at the base of the radiation of Holometabolous insects. Genome Res (2006) 16:1334–1338.
Sheth N, Roca X, Hastings ML, Roeder T, Krainer AR, Sachidanandam R. Comprehensive splice-site analysis using comparative genomics. Nucleic Acids Res (2006) 34:3955–3967.
Sverdlov AV, Rogozin IB, Babenko VN, Koonin EV. Conservation versus parallel gains in intron evolution. Nucleic Acids Res (2005) 33:1741–1748.
Tatusov RL, Koonin EV, Lipman DJ. A genomic perspective on protein families. Science (1997) 278:631–637.
Venkatesh B, Ning Y, Brenner S. Late changes in spliceosomal introns define clades in vertebrate evolution. Proc Natl Acad Sci USA (1999) 96:10267–10271.
Wada H, Kobayashi M, Sato R, Satoh N, Miyasaka H, Shirayama Y. Dynamic insertion-deletion of introns in deuterostome EF-1α genes. J Mol Evol (2002) 54:118–128.
Wang L, Wang S, Li Y, Paradesi MSR, Brown SJ. BeetleBase: the model organism database for Tribolium castaneum. Nucleic Acids Res (2007) 35:D476–D479.
Yandell M, Mungall CJ, Smith C, Prochnik S, Kaminker J, Hartzell G, Lewis S, Rubin GM. Large-scale trends in the evolution of gene structures within 11 animal genomes. PLoS Comput Biol (2006) 2:e15.
Yang Z. Among-site rate variation and its impact on phylogenetic analyses. Trends Ecol Evol (1996) 11:367–372.
Yoshihama M, Nakao A, Nguyen HD, Kenmochi N. Analysis of ribosomal protein gene structures: implications for intron evolution. PLoS Genet (2006) 2:e25.
Zdobnov EM, Bork P. Quantification of insect genome divergence. Trends Genet (2007) 23:16–20.
Zheng J, Rogozin IB, Koonin EV, Przytycka TM. Support for the coelomata clade of animals from a rigorous analysis of the pattern of intron conservation. Mol Biol Evol (2007) 24:2583–2592.




