

MBE Advance Access originally published online on July 26, 2006
Molecular Biology and Evolution 2006 23(10):1984-1993; doi:10.1093/molbev/msl067

© The Author 2006. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org

Research Article

Identification of rDNA-Specific Non-LTR Retrotransposons in Cnidaria

Kenji K. Kojima*, Kei-ichi Kuma*, Hiroyuki Toh† and Haruhiko Fujiwara‡

* Institute for Chemical Research, Kyoto University, Uji, Japan; † Medical Institute of Bioregulation, Kyushu University, Fukuoka, Japan; and ‡ Department of Integrated Biosciences, Graduate School of Frontier Sciences, University of Tokyo, Kashiwa, Japan

E-mail: kojimakk@kuicr.kyoto-u.ac.jp.


    Abstract
Ribosomal RNA genes are abundant repetitive sequences in most eukaryotes. Ribosomal DNA (rDNA) contains many insertions derived from mobile elements including non–long terminal repeat (non-LTR) retrotransposons. R2 is a well-characterized 28S rDNA–specific non-LTR retrotransposon family that is distributed over at least 4 bilaterian phyla. R2 is a large family sharing the same insertion specificity and is classified into 4 clades (R2-A, -B, -C, and -D) based on the N-terminal domain structure and the phylogeny. No horizontal transfer of R2 has been observed; therefore, the origin of R2 dates back to before the split between protostomes and deuterostomes. Here, we identified in silico 1 R2 element from the sea anemone Nematostella vectensis and 2 R2-like retrotransposons from the hydrozoan Hydra magnipapillata. R2 from N. vectensis was inserted into the 28S rDNA like other R2 elements, but the R2-like elements from H. magnipapillata were inserted into a specific sequence in the highly conserved region of the 18S rDNA. We designated the Hydra R2–like elements R8. R8 is inserted 37 bp upstream from R7, another 18S rDNA–specific retrotransposon family. There is no obvious sequence similarity between the targets of R2 and R8, probably because they recognize long DNA sequences. Domain structure and phylogeny indicate that R2 from N. vectensis is a member of the R2-D clade and that R8 from H. magnipapillata belongs to the R2-A clade despite its different sequence specificity. These results suggest that R2 had been generated before the split between cnidarians and bilaterians and that R8 is a retrotransposon family that changed its target from the 28S rDNA to the 18S rDNA.

Key Words: non-LTR retrotransposon • rDNA • sequence specificity • R2 • R8 • Cnidaria


    Introduction
Many sequence-specific insertions of mobile elements have accumulated in the ribosomal RNA (rRNA) gene array. Most of the insertions belong to group I self-splicing introns (Haugen et al. 2005) and non–long terminal repeat (non-LTR) retrotransposons (Kojima and Fujiwara 2004). Group I introns can splice themselves out of rRNA, which makes them neutral to the host. In contrast, non-LTR retrotransposons cannot self-splice. Because their insertion tends to be harmful, sequence-specific non-LTR retrotransposons can be inserted only into repetitive sequences, such as ribosomal DNA (rDNA).

Eight rDNA-specific non-LTR retrotransposon families have been identified. Mutsu is inserted into the 5S rDNA (Kojima and Fujiwara 2004), and R7 is inserted into the 18S rDNA (Kojima and Fujiwara 2003; fig. 1). The other 6 retrotransposon families, R1, R2, R4, R5, R6, and RT, are inserted into the 28S rDNA (Burke et al. 1987, 1995, 2003; Xiong and Eickbush 1988; Besansky et al. 1992; Kojima and Fujiwara 2003; fig. 1). R1, R6, RT, and R7 are related to one another both structurally and phylogenetically and are all classified into the R1 clade (Kojima and Fujiwara 2003). They encode an apurinic/apyrimidinic endonuclease–like endonuclease upstream of reverse transcriptase in the same open reading frame (ORF). They are considered to share a common sequence-specific ancestor (Kojima and Fujiwara 2003). In contrast, R2, R4, and R5 encode a restriction enzyme–like endonuclease (RLE) downstream of reverse transcriptase. Non-LTR retrotransposons originally encoded RLE and later exchanged their restriction enzyme–like endonuclease for an apurinic/apyrimidinic endonuclease–like one (Malik et al. 1999; Kojima and Fujiwara 2005a). Thus, the acquisition of sequence specificity by R2, R4, and R5 can be older than that by the R1 clade elements. The insertion sites of R2, R4, and R5 are very close to one another (fig. 1), but their phylogenetic relationships are not so close. R4 is closely related to the TAA repeat–specific retrotransposon and to elements with no obvious specificity (Van Dellen et al. 2002; Kojima and Fujiwara 2004). R5 is related to the spliced leader exon–specific retrotransposon (Burke et al. 2003). R2 constitutes a large group that includes many retrotransposons with the same sequence specificity.


FIG. 1.— Insertion sites of newly identified R8 and all other rDNA-specific non-LTR retrotransposons except the 5S rDNA–specific Mutsu. Retrotransposons encoding RLE are indicated as shaded arrows, whereas retrotransposons encoding apurinic/apyrimidinic endonuclease–like endonuclease are indicated as open arrows. Ribosomal DNA sequences are represented by that of Anopheles albimanus (L78065), and the insertion site of each retrotransposon is shown within parentheses. The cleavage site of R2 is represented by that of R2Bm.

 
R2 is distributed over at least 4 animal phyla: Arthropoda, Chordata, Echinodermata, and Platyhelminthes (Burke et al. 1998; Kojima and Fujiwara 2004, 2005b). Because obvious horizontal transfer of R2 has not been observed, R2 could date back to before the split between protostomes and deuterostomes. At present, R2 is the sequence-specific non-LTR retrotransposon family whose origin can be traced back the furthest. In our previous study (Kojima and Fujiwara 2005b), we proposed the R2 "superclade." The R2 superclade can be classified into 4 "clades" based on the N-terminal zinc-finger motifs and the phylogeny (Kojima and Fujiwara 2005b). The R2-A, -C, and -D clades have 3, 2, and 1 zinc-finger motif(s), respectively. The N-terminal structure of the R2-B clade is undetermined. Members of the R2-A and the R2-D clades have been found in diverse phyla of bilaterian animals. Each clade is further divided into several "subclades" in which the R2 phylogeny agrees with the host phylogeny (Kojima and Fujiwara 2005b). One of the open questions about the evolution of R2 is when R2 appeared. The answer could help us understand the ancestral state of non-LTR retrotransposons. To approach this question, we searched for R2 in organisms belonging to Cnidaria.

In this report, we describe new non-LTR retrotransposons from 2 cnidarian species, Nematostella vectensis and Hydra magnipapillata. We found that an element from Nematostella has the same specificity as R2. It is therefore an authentic R2 element, and it belongs to the R2-D clade. In contrast, 2 elements from Hydra show novel sequence specificity and are integrated into the 18S rDNA. We designated them R8. The sequence around the insertion site of R8 is not similar to that of R2. Although the sequence specificity of R8 is distinct from that of R2, the phylogenetic analyses revealed that R8 belongs to the R2-A clade. R8 is likely to be a retrotransposon family that changed its target from the 28S to the 18S rDNA in the past. Identification of R2 and R8 from Cnidaria suggests that the origin of R2 antedates the appearance of triploblasts.


    Materials and Methods
Database Search and Element Reconstruction
Reconstructed retrotransposon sequences reported in this study are available from the authors' Web site (http://www.biol.s.u-tokyo.ac.jp/users/animal/kojima/sequence.html). Genomic sequence traces of the starlet sea anemone N. vectensis and the hydrozoan H. magnipapillata were downloaded from the National Center for Biotechnology Information (NCBI) Trace Archive (ftp://ftp.ncbi.nih.gov/pub/TraceDB/). We used approximately 2.01 Gb of N. vectensis genomic sequence traces, which is 5.91 times the size of the 340-Mb genome, and 5.01 Gb of H. magnipapillata genomic traces, which cover 3.85 times the 1.30-Gb genome. Searches for R2-like retrotransposons were performed by TBlastN (Altschul et al. 1997) using the amino acid sequences of the reverse transcriptase domains of R2Ci-D, R2Nvit-B (AAD16096), R2Dr, R2Sm-A, R2Amar (AAC34903), R2Bm (AAB59214), R2Ci-A, and R2Dmel (CAA36225) as queries. The sequences of R2Ci-D, R2Dr, R2Sm-A, and R2Ci-A were reported in our previous studies (Kojima and Fujiwara 2004, 2005b) and are available from the authors' Web site. The hit sequence traces of N. vectensis fell into a single group in which the sequence traces have more than 95% identity to one another. The hit sequence traces of H. magnipapillata were grouped into 2 groups. BlastN was performed using one representative sequence trace per group as a query in order to find sequence traces that overlap with the query. The query sequence was assembled with one of the sequence traces that have a 200-bp sequence with more than 95% sequence identity to the query and a 5' or 3' extension of more than 200 bp. BlastN was repeated iteratively using the assembled sequence as a query. The longest assembled sequence in which both ends adjoin ribosomal RNA genes was defined as the full-length retrotransposon. If the assembled retrotransposon sequence contained in-frame stop codons or frameshifts compared with known R2 elements, it was corrected using other sequence traces as references. All 3 assembled elements contained a full ORF with 5' and 3' UTRs. In order to identify 5' and 3' truncated copies of each element, we used BlastN with each assembled retrotransposon sequence as a query and collected sequence traces in which the region unrelated to the retrotransposon occupies more than 10% of the total length. We manually identified the 5' and the 3' boundaries of retrotransposons in these sequence traces.
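The acceptance test applied at each round of the iterative extension can be sketched as follows. This is only an illustration of the thresholds stated above, not the authors' actual pipeline: the `Hit` record, its field names, and the `can_extend` function are hypothetical, and the parsing of BlastN output into such records is assumed to happen elsewhere.

```python
from dataclasses import dataclass

# Thresholds taken from the text; the names are illustrative assumptions.
MIN_OVERLAP_BP = 200      # the overlap with the query must span 200 bp
MIN_IDENTITY = 0.95       # identity in the overlap must exceed 95%
MIN_EXTENSION_BP = 200    # the trace must extend the query by more than 200 bp


@dataclass
class Hit:
    """One BlastN hit between the current query and a candidate trace (hypothetical record)."""
    overlap_len: int    # aligned length in bp
    identity: float     # fraction of identical positions in the overlap (0..1)
    extension_5p: int   # bp of the trace lying beyond the 5' end of the query
    extension_3p: int   # bp of the trace lying beyond the 3' end of the query


def can_extend(hit: Hit) -> bool:
    """Return True if the trace qualifies for assembly with the query."""
    good_overlap = hit.overlap_len >= MIN_OVERLAP_BP and hit.identity > MIN_IDENTITY
    extends = hit.extension_5p > MIN_EXTENSION_BP or hit.extension_3p > MIN_EXTENSION_BP
    return good_overlap and extends


# Made-up example hits (not real trace data):
accepted = can_extend(Hit(overlap_len=250, identity=0.97, extension_5p=0, extension_3p=320))
rejected_identity = can_extend(Hit(overlap_len=250, identity=0.93, extension_5p=0, extension_3p=320))
rejected_extension = can_extend(Hit(overlap_len=250, identity=0.99, extension_5p=50, extension_3p=100))
```

In such a scheme the query would be replaced by the merged sequence after each accepted trace, and the search repeated until both ends reach rRNA gene sequence.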

Sequence Alignment and Phylogenetic Analyses
Amino acid sequences were aligned using MAFFT 5.6.4 (Katoh et al. 2005). The sequence alignments used for the phylogenetic analyses are available as Supplementary Material online. Maximum likelihood trees were constructed using Treefinder (Jobb et al. 2004). Modelgenerator (Keane et al. 2006) was used to obtain the model and parameters for the likelihood analysis of each data set. In all data sets, Modelgenerator selected RtREV + I + G + F as the best model for the maximum likelihood analyses based on the Akaike Information Criterion 1, Akaike Information Criterion 2, and Bayesian Information Criterion. Neighbor-Joining trees were constructed using MEGA3.1 (Kumar et al. 2004). Nonparametric bootstrap analyses for the maximum likelihood and the Neighbor-Joining trees were performed with 1,000 replicates. Bayesian phylogenetic inference trees were constructed using MrBayes 3.1 (Ronquist and Huelsenbeck 2003). The Markov chain Monte Carlo chain length was 1,000,000 generations with trees sampled every 10 generations; the first 10,000 trees were discarded as burn-in.
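As a quick check of what these Bayesian settings imply, the number of sampled and retained trees can be worked out directly. The variable names below are illustrative, and the sketch ignores whether the starting tree at generation 0 is counted as a sample.

```python
# Settings stated in the text.
generations = 1_000_000   # MCMC chain length
sample_every = 10         # one tree sampled every 10 generations
burn_in_trees = 10_000    # trees discarded as burn-in

sampled_trees = generations // sample_every        # trees written by the sampler
retained_trees = sampled_trees - burn_in_trees     # trees left for the consensus
burn_in_fraction = burn_in_trees / sampled_trees   # share of samples discarded

print(sampled_trees, retained_trees, burn_in_fraction)
```

Under these assumptions the burn-in corresponds to the first tenth of the sampled trees.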


    Results and Discussion
Target Sequence Specificity of R2Nvec-A
We identified one non-LTR retrotransposon from the genomic sequence traces of the starlet sea anemone N. vectensis, at the NCBI Trace Archive (ftp://ftp.ncbi.nih.gov/pub/TraceDB/), by Blast search using the reverse transcriptase domain of R2 as a query. We designated it R2Nvec-A. R2Nvec-A is 3,922 bp long. All 5' and 3' flanking sequences of complete R2Nvec-A copies are 28S rDNA (fig. 2A). The insertion site of R2Nvec-A differs slightly from that of other R2 elements. Although the 5' boundaries differ by several base pairs among R2 elements, which is considered to be caused by differences in the top (sense) strand cleavage site (Burke et al. 1999), the 3' boundary is tightly conserved among R2 elements. All R2Nvec-A copies, however, are followed by the 28S rDNA sequence AAGGTAGCCA, which starts 4 bp upstream from the majority of 3' flanking sequences of R2, TAGCCAAATG (Burke et al. 1999; Kojima and Fujiwara 2005b). All the complete R2Nvec-A copies follow the 28S rDNA sequence TGACTCTCTT. Insertion of R2Nvec-A makes neither target-site duplications nor deletions. It is reported that the cleavage sites of the bottom (antisense) and the top strands determine the 5' and the 3' ends of target-site duplications, respectively, if the endonuclease cleaves the DNA generating a 3' overhang (Feng et al. 1998). If the endonuclease generates a 5' overhang, the cleavage sites of the bottom and the top strands are reported to determine the 3' and the 5' ends of target-site deletions (Luan et al. 1993). Therefore, the R2Nvec-A endonuclease is considered to make a blunt-end cleavage (fig. 2B, R2Nvec-A (a)). The boundary of R2Nvec-A can also be explained in another way. It was reported that the R2 element of Drosophila sechellia terminates with a polyA tail or a polyA tail plus GG (Eickbush DG and Eickbush TH 1995). The 3' end of the latter sequence is AAGG, which is identical to the 28S rDNA sequence. Equally, it is possible that the endonuclease of R2Nvec-A cleaves the bottom strand at the same site as other R2 elements (fig. 2B, R2Nvec-A (b)) and that R2Nvec-A terminates with the sequence AAGG. The identical sequences between the R2Nvec-A 3' terminus and the 28S rDNA could hide the true bottom-strand cleavage site. Experimental evidence is needed to determine the cleavage sites of the endonuclease.


FIG. 2. Insertion sites of R2Nvec-A. (A) Boundaries of the full-length R2Nvec-A. Nucleotides identical to the 28S rRNA gene are shaded. Representative TI numbers in the NCBI Trace Archive on 1 September 2005 are shown at the left side of sequences. Numbers of sequence traces are shown at the right. The horizontal solid line shows the region used to make the phylogenetic tree in figure 5, whereas the broken line shows that in figure 6. ZF, zinc-finger motif; RT, reverse transcriptase; RLE, restriction enzyme–like endonuclease. (B) Putative cleavage sites of R2 endonucleases. Vertical lines represent the cleavage sites of bottom and top strands. Only the cleavage site of R2Bm has been characterized experimentally (Luan et al. 1993).

 

FIG. 5. The 50% consensus tree based on maximum likelihood method. The phylogenetic tree was constructed using the amino acid sequences of the reverse transcriptase domains (figs. 2 and 3, horizontal solid line). The root was determined using 5 sequences of the same group II introns as the previous study (Malik et al. 1999). Elements newly identified in this study are in boldface. Elements specific for rDNA are italicized. The left number at each branch represents the percentage of maximum likelihood bootstrap value. The middle and the right numbers indicate the percentages of Neighbor-Joining bootstrap value and Bayesian posterior probability, respectively, for the corresponding cluster. Values below 50% are not shown. Clade names are shown at the right.

 

FIG. 6.— The 50% consensus tree based on maximum likelihood method. The phylogenetic tree was constructed using the amino acid sequences of the latter halves of the proteins (figs. 2 and 3, horizontal broken line). Elements newly identified in this paper are in boldface. The numbers at each branch represent the percentages of maximum likelihood bootstrap value, Neighbor-Joining bootstrap value, and Bayesian posterior probability value, respectively. Values below 50% are not shown. Subclade names are shown at the right.

 
Target Sequence Specificity of R8Hm-A and R8Hm-B
We identified 2 closely related non-LTR retrotransposons from the sequence traces of the hydrozoan H. magnipapillata at the NCBI Trace Archive (ftp://ftp.ncbi.nih.gov/pub/TraceDB/), by Blast search using the reverse transcriptase domain of R2 as a query. We named these retrotransposons R8Hm-A and R8Hm-B. R8Hm-A is 4,327 bp long, whereas R8Hm-B is 4,265 bp long. As their sequences have high similarity to those of R2, we checked the boundaries of these retrotransposons. Contrary to our expectation that they would share the target sequence with R2, these retrotransposons are not inserted into the 28S rDNA. They are not authentic R2 elements but constitute a novel sequence-specific non-LTR retrotransposon family. Because R2 is not only a phylogenetic group but also a group with the same sequence specificity, we propose the new family R8, which includes R8Hm-A and R8Hm-B.

All the 3' ends of R8Hm-A are flanked by 18S rRNA genes (fig. 3A). Most of the 5' flanking sequences of R8Hm-A are also 18S rDNA sequences. As the full-length sequence of the 18S rRNA gene of H. magnipapillata was not found in the public database, we use the numbering of the 18S rRNA gene of Hydra circumcincta (AF358080) below. R8Hm-A follows position 1142 and is followed by position 1134. This observation suggests that R8Hm-A makes 9-bp (1134AAGCTGAAA1142) target-site duplications. The sequence 1134–1142 corresponds to the conserved helix 31 in the 18S rRNA secondary structure (Van de Peer et al. 1997). Even though all the complete R8Hm-A copies are present after position 1142, no 5' truncated copy adjoins the same sequence (table 1, Supplementary Material online). Most of them are flanked by sequence ending several bases upstream compared with the complete copies. It has been indicated that the mechanisms of 5' integration of the full-length and the 5' truncated copies are different (Zingler et al. 2005). The difference in 5' flanking sequences between the complete and the 5' truncated R8Hm-A copies may reflect these different mechanisms of 5' integration. Indeed, 5' truncated R8Hm-A copies and the 18S rDNA share 1 to 4 nucleotides at their junctions (table 1, Supplementary Material online), which is characteristic of the microhomology-mediated end joining that leads to 5' truncation (Zingler et al. 2005). In the sequence traces with TI numbers 649453868 and 734195272, there are additional nucleotides between the 18S rRNA gene and R8Hm-A. Nontemplated nucleotide addition has also been observed in the 5' integration of R2 (Burke et al. 1999). Several copies follow non-rRNA gene sequences, but they are considered to be the products of nonhomologous recombination because all the 3' ends are flanked by the 18S rDNA.


FIG. 3.— Insertion sites of R8Hm-A and R8Hm-B. Numbers above the 18S rRNA gene sequence indicate positions in the 18S rRNA gene of Hydra circumcincta (AF358080). Putative target-site duplication sequences are boxed. Nucleotides identical to the 18S rRNA gene are shaded. Representative TI numbers in the NCBI Trace Archive on 1 September 2005 are shown at the left side of sequences. Numbers of sequence traces are shown at the right. The horizontal solid line shows the region used to make the phylogenetic tree in figure 5, whereas the broken line shows that in figure 6. Abbreviations are the same as figure 2. (A) Boundaries of the full-length R8Hm-A. (B) Boundaries of the full-length R8Hm-B.

 
Compared with those of R8Hm-A, the boundaries of R8Hm-B are complicated. All the 3' flanking sequences of R8Hm-B are also 18S rDNA, but the R8Hm-B insertion site seems to be 15 bp upstream from the R8Hm-A insertion site (fig. 3B). No complete R8Hm-B copies follow the 18S rRNA gene (fig. 3B). The majority of the 5' ends are flanked by fragments of the 18S rRNA gene. The boundary sequences of the complete R8Hm-B indicate that R8Hm-B follows the 18S rDNA sequence upstream from position 1145 and is followed by the 18S rDNA sequence downstream from position 1119. Boundaries of the 5' truncated R8Hm-B copies showed different features from those of the full-length copies (table 2, Supplementary Material online). As with R8Hm-A, all the 5' truncated copies show either 1- to 2-bp microhomologies or additional non-rDNA nucleotides at the boundary. Several 5' truncated copies adjoin 18S rDNA sequences, but the integrated positions are spread much more broadly than those of R8Hm-A. For example, in the sequence trace 647071531, the 5' end of R8Hm-B is flanked by the 18S rDNA upstream from position 1194, which indicates a large target-site duplication (1119–1194). In the sequence trace 691890607, in contrast, the 5' end of R8Hm-B adjoins the 18S rDNA upstream from position 1090, which suggests that R8Hm-B replaced the 28-bp sequence (1091–1118) of the 18S rDNA. We found that 3' truncated copies of R8Hm-B are more abundant than those of R8Hm-A (data not shown). At least three 5' truncated copies are also truncated at the 3' termini, which may be due to recombination after integration (table 2, Supplementary Material online).

A simple explanation for the boundaries of the full-length R8Hm-B is that R8Hm-B makes 27-bp target-site duplications, 1119GGTAGTATGGTTGCAAAGCTGAAACTT1145. It is, however, more likely that R8Hm-B shares its sequence specificity with R8Hm-A because R8Hm-A and R8Hm-B are closely related to each other (discussed below). Namely, R8Hm-B would make 9-bp target-site duplications, 1134AAGCTGAAA1142, like R8Hm-A. If this were true, R8Hm-B would start with CTT and terminate with GGTAGTATGGTTGCA. Because there is no conclusive evidence to decide between these possibilities, we here represent the insertion site of R8 by that of R8Hm-A. Sequence analyses indicated that the endonuclease of R8 cleaves the bottom strand between T1133 and T1134 and cleaves the top strand between A1142 and C1143, generating a 9-bp 3' overhang (fig. 1). There is no sequence similarity between the R2 target site and the R8 target site. The R8 target site is similar to neither the R4 nor the R5 target site.
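The boundary arithmetic used in this section can be made explicit with a small sketch. It assumes the H. circumcincta (AF358080) numbering used above; the function names `classify_insertion` and `rdna_slice` are hypothetical helpers introduced only for illustration. The first function takes the last 18S rDNA position preceding the element and the first 18S rDNA position following it, and reports whether the junctions imply a target-site duplication, a deletion, or neither. The second slices the 27-bp sequence quoted above to show how it decomposes under the alternative 9-bp interpretation.

```python
def classify_insertion(last_5p_flank_pos, first_3p_flank_pos):
    """Infer the target-site change from the rDNA positions flanking an element.

    last_5p_flank_pos:  last rDNA position immediately upstream of the element
    first_3p_flank_pos: first rDNA position immediately downstream of the element
    Returns (kind, length_in_bp).
    """
    overlap = last_5p_flank_pos - first_3p_flank_pos + 1
    if overlap > 0:
        return ("duplication", overlap)   # the same rDNA bases appear on both sides
    if overlap < 0:
        return ("deletion", -overlap)     # rDNA bases are missing between the flanks
    return ("none", 0)                    # flanks are contiguous


# 18S rDNA positions 1119-1145, as quoted in the text.
SEQ_1119_1145 = "GGTAGTATGGTTGCAAAGCTGAAACTT"
SEQ_START = 1119


def rdna_slice(first, last):
    """Return the quoted rDNA sequence from position first to last (inclusive)."""
    return SEQ_1119_1145[first - SEQ_START:last - SEQ_START + 1]


print(classify_insertion(1142, 1134))   # full-length R8Hm-A
print(classify_insertion(1145, 1119))   # full-length R8Hm-B, simple reading
print(classify_insertion(1194, 1119))   # trace 647071531
print(classify_insertion(1090, 1119))   # trace 691890607
print(rdna_slice(1119, 1133), rdna_slice(1134, 1142), rdna_slice(1143, 1145))
```

The three slices correspond to the proposed 3' terminus of R8Hm-B, the shared 9-bp duplication, and the proposed 5' start of R8Hm-B, respectively.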

Domain Structure of R2Nvec-A, R8Hm-A, and R8Hm-B
R2Nvec-A has 1 ORF. The putative protein is 1,138 amino acids long. The protein of R2Nvec-A contains one N-terminal zinc-finger motif, a reverse transcriptase domain, a C-terminal zinc-finger motif, and an RLE (figs. 2A and 4). R8Hm-A and R8Hm-B also have 1 ORF each. The putative protein of R8Hm-A is 1,160 amino acids long, whereas that of R8Hm-B is 1,158 amino acids long. The amino acid sequence identity between the R8Hm-A and R8Hm-B proteins is 67%. The protein of R8Hm-A contains 3 zinc-finger motifs at the N terminus (figs. 3A and 4). The protein of R8Hm-B also contains 3 zinc-finger motifs, but the first motif is defective (figs. 3B and 4). Both proteins also have a reverse transcriptase domain, a C-terminal zinc finger, and an RLE.


FIG. 4. N-terminal zinc-finger motifs of R2 and R8. Conserved residues are shaded. The spacing between zinc-finger motifs is shown in parentheses. The first zinc-finger motif of R8Hm-B is defective.

 
Copy Number Estimation
We roughly estimated the copy number of each retrotransposon. We found 44 sequence traces that were more than 95% identical to the 5' 100-bp sequence of R8Hm-A among the 5.01 Gb of H. magnipapillata genomic sequence traces we used. Because the genome size of H. magnipapillata is 1.30 Gb, the copy number of the 5' ends of R8Hm-A per haploid genome is estimated to be 11 (44 × 1.30/5.01). To check this estimate, we similarly estimated the copy number of single-copy genes. We used 10 genes that had been characterized as single copy in H. magnipapillata or Hydra vulgaris (Y11678, Y11679, AF209200, AF085200, X70840, Y09797, U22380, AF140020, AF183398, and AY522556). The average estimated copy number was 0.94, and the standard deviation was 0.58. Thus, our estimated copy number seems close to the actual one. The copy number of the 3' ends of R8Hm-A was estimated at 31. We estimated that the copy numbers of the 5' and the 3' ends of R8Hm-B are 6 and 12, respectively. R8 seems to be inserted into a small fraction of rRNA genes because the estimated copy number of the 18S rRNA genes in H. magnipapillata was 680. We also estimated the copy numbers of the 5' and the 3' ends of R2Nvec-A at 24 and 50, respectively.
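The scaling used for these estimates is simple enough to restate as a short sketch. The function name and variable names are hypothetical; the numbers are those given in the text. The final ratio is only an illustrative upper-bound style calculation based on the 3' end estimate for R8Hm-A and is not a figure reported by the authors.

```python
def copies_per_haploid_genome(n_traces, genome_size_gb, trace_total_gb):
    """Scale a count of matching traces by the inverse of the sequencing coverage."""
    return n_traces * genome_size_gb / trace_total_gb


HYDRA_GENOME_GB = 1.30   # H. magnipapillata genome size
HYDRA_TRACES_GB = 5.01   # total length of the traces searched

coverage = HYDRA_TRACES_GB / HYDRA_GENOME_GB                      # about 3.85-fold
r8hm_a_5p = copies_per_haploid_genome(44, HYDRA_GENOME_GB, HYDRA_TRACES_GB)

# Illustrative only: share of the estimated 680 18S rRNA genes that would carry
# an R8Hm-A 3' end if each of the estimated 31 copies sat in a different gene.
fraction_occupied = 31 / 680

print(round(coverage, 2), round(r8hm_a_5p, 1), round(fraction_occupied, 3))
```

Rounding the 5' end estimate to the nearest integer gives the value of 11 quoted above.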

Phylogeny of R2 and R8
In order to infer the phylogenetic relationships of 4 rDNA-specific retrotransposons, R2, R4, R5, and R8, we constructed phylogenetic trees based on the amino acid sequences of the reverse transcriptase domain. The length of the reverse transcriptase domain is approximately 410 amino acids. We used 3 methods for inferring the phylogeny: the maximum likelihood method, the Neighbor-Joining method, and Bayesian phylogenetic inference. Because the phylogenies based on the 3 methods were nearly the same, only the consensus maximum likelihood tree is shown (fig. 5). Our phylogenetic analyses did not support any superclade-level phylogeny except the R2 superclade, which includes the R2-A, R2-B, R2-C, and R2-D clades. Perere-9, a transcribed non-LTR retrotransposon of the blood fluke Schistosoma mansoni (DeMarco et al. 2005), and R2Sm-A from our previous study (Kojima and Fujiwara 2005b) were revealed to be the same element. R2Nvec-A is located inside the R2 superclade as expected. R8Hm-A and R8Hm-B are also located inside R2. R8Hm-A and R8Hm-B are closely related to each other and are most closely related to the R2-A clade, which is consistent with the N-terminal domain structure (fig. 4). R8 is closely related to neither R4 nor R5.

Further, to resolve the detailed relationships of R2 and R8, we made other phylogenetic trees using the C-terminal half of the putative proteins. Figure 6 shows the 50% consensus maximum likelihood tree. Because more sequence information is available for the C-terminal half of the R2 protein than for the full reverse transcriptase domain, we could include 40 R2 and 2 R8 elements in figure 6, whereas we could use only 24 R2 and 2 R8 elements in figure 5. The regions used for phylogenetic analyses were approximately 470 amino acids in length. The intervening region between reverse transcriptase and RLE is too poorly conserved to align the sequences of all retrotransposons encoding RLE correctly. Thus, we analyzed the phylogeny only among R2 and R8.

R2Nvec-A is located inside the R2-D4 subclade with high statistical support (fig. 6). This phylogenetic position is consistent with the number of N-terminal zinc-finger motifs (fig. 4). Maximum likelihood and Bayesian phylogenetic inference trees suggest that R8 is closely related to the R2-A1 subclade, whereas the R2-A2 and the R2-A3 subclades are monophyletic (fig. 6). In figure 5, the monophyly of R2-A2/R2-A3 is not supported, but R8 is clustered with the R2-A1 subclade with high statistical support. Our previous analysis (Kojima and Fujiwara 2005b) supported the monophyly of R2-A2 and R2-A3. These results indicate that the R2-A clade can be divided into 2 groups, R2-A2/R2-A3 and R2-A1/R8, and that R8 is likely to have been derived from R2 by changing its target specificity.

The Target Site of R8
The target sites of R1, R2, R4, R5, and R6 are very close to one another (fig. 1). These sites are located in one of the most conserved regions in the 28S rRNA gene (Ben Ali et al. 1999). The region around the RT insertion site is also highly conserved. The insertion site of R8 is 37 bp upstream from that of R7 (fig. 1). The region including the target sites of R7 and R8 is also conserved in the 18S rRNA gene (fig. 7; Van de Peer et al. 1997). The sequence around the R8 target site is basically conserved among all eumetazoans including hydra, arthropods, chordates, and Trichoplax, whereas the basal animal phyla Ctenophora (ctenophores), represented by Mnemiopsis leidyi, and Porifera (sponges), represented by Xestospongia muta, as well as eukaryotes other than animals, have two or more different nucleotides near the R8 insertion site (fig. 7). Thus, it is possible that other eumetazoans contain R8 even if R8 is not distributed among the basal animals and other eukaryotes. We previously suggested that a copy number of at least dozens and high conservation at the nucleotide sequence level are necessary for the target of sequence-specific non-LTR retrotransposons (Kojima and Fujiwara 2004). The insertion site of R8 reinforces this suggestion.


FIG. 7. Sequence conservation near the R7 and R8 insertion sites. Dots show nucleotides identical to those of the 18S rDNA of Hydra. Accession numbers are as follows: Hydra (Cnidaria), AF358080; Homo, U13369; Xenopus, X02995; Drosophila, M21017; Anopheles, L78065; Caenorhabditis, X03680; Dugesia (Platyhelminthes), AF013153; Phoronis (Phoronida), U36271; Trichoplax (Placozoa), L10828; Mnemiopsis (Ctenophora), L10826; Xestospongia (Porifera), AY621510; Monosiga (choanoflagellate), AF084230; Saccharomyces, Z75578; Arabidopsis, X16077; and Entamoeba, X65163.

 
R8 is likely to have been derived from R2 through a change of target site from the 28S rDNA to the 18S rDNA. Target-site changes have also been observed in the evolution of another group of sequence-specific non-LTR retrotransposons, the R1 clade. The 28S rDNA–specific retrotransposon RT and the 18S rDNA–specific retrotransposon R7 are closely related to each other (Kojima and Fujiwara 2003), although the direction of the target change is unknown. Comparing the pairs R2/R8 and R7/RT is expected to clarify the features that result from their distinct mechanisms of sequence specificity. The target sequences of RT and R7 are similar to each other (fig. 1), which suggests that sequence similarity is important for target change in the R1 clade (Kojima and Fujiwara 2003). In contrast, we found no similarity between the sequences around the insertion sites of R2 and R8, even when the distance from the cleavage site was taken into consideration (fig. 1). The R2 protein of the silkworm Bombyx mori is reported to recognize the DNA 10–40 bp upstream and the 18 bp of target DNA downstream of the R2 insertion site (Christensen and Eickbush 2005). When the R2 protein recognizes such a long sequence, each individual nucleotide would have low importance for target specificity. Two R2 protein molecules are involved in recognizing this long DNA: the protein bound to the upstream DNA is responsible for bottom-strand cleavage, whereas the protein bound to the downstream DNA is responsible for top-strand cleavage. The recognition of a long sequence is probably one of the reasons why we could not find similarity between the target sites of R2 and R8. The separate recognition of the upstream and downstream sequences could also cause differences in the distance between the top- and bottom-strand cleavage sites. R2 is likely to cleave DNA with a 0- to 7-bp 5' overhang (fig. 2B), whereas R8 would cleave DNA with a 9-bp 3' overhang (fig. 1). This contrasts with the similar lengths of the 3' overhangs of R7 and RT (Kojima and Fujiwara 2003; fig. 1).
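The relationship between the two cleavage positions and the resulting overhang can be made explicit with a small calculation. The sketch below is an illustration only: the function `overhang`, the coordinate convention, and the example positions are assumptions introduced here, not values read from figs. 1 or 2B. Both nicks are expressed in top-strand coordinates, where a nick at position p lies between base p and base p+1 counted 5' to 3' along the top strand.

```python
from typing import NamedTuple


class Overhang(NamedTuple):
    kind: str    # "5'", "3'", or "blunt"
    length: int  # overhang length in bp


def overhang(top_nick: int, bottom_nick: int) -> Overhang:
    """Classify the staggered cut made by two single-strand nicks.

    Both positions use top-strand coordinates (hypothetical convention):
    a nick at p separates base p from base p+1 of the top strand.
    If the bottom-strand nick lies downstream of the top-strand nick,
    the protruding single strands end in 5' termini (5' overhang);
    if it lies upstream, they end in 3' termini (3' overhang).
    """
    offset = bottom_nick - top_nick
    if offset > 0:
        return Overhang("5'", offset)
    if offset < 0:
        return Overhang("3'", -offset)
    return Overhang("blunt", 0)


# Hypothetical positions chosen only to reproduce the overhang sizes
# discussed in the text; they are not coordinates from the rDNA.
print(overhang(top_nick=0, bottom_nick=7))   # 7-bp 5' overhang
print(overhang(top_nick=9, bottom_nick=0))   # 9-bp 3' overhang
print(overhang(top_nick=4, bottom_nick=4))   # blunt cut
```

Under this convention, a 0- to 7-bp 5' overhang corresponds to a bottom-strand nick 0 to 7 bp downstream of the top-strand nick, and a 9-bp 3' overhang corresponds to a bottom-strand nick 9 bp upstream of it.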

The Origin and Diversification of R2
Although R8 shows a target sequence specificity distinct from that of R2, R8 is still phylogenetically a member of the R2-A clade. R2Nvec-A belongs to the R2-D4 subclade with high statistical support. We identified R2 and R8 in 2 distinct cnidarian classes: Hydra magnipapillata belongs to the class Hydrozoa, and N. vectensis belongs to the class Anthozoa. Their identification in 2 distinct classes indicates that R2 superclade elements are widely distributed in Cnidaria.

The R2-D4 subclade includes R2 elements from Chordata, Echinodermata, and Cnidaria. The lineage constituted by the R2-A1 subclade and R8 also includes elements from Chordata, Arthropoda, and Cnidaria. The common ancestor of R2-A1 and R8 could have been 28S rDNA specific. Frequent extinction and diversification of R2 lineages might have greatly complicated the evolution of R2, but the phylogenetic trees indicate that the origins of 2 R2 lineages date back to before the split between cnidarians and bilaterians. The basal animals other than Cnidaria are ctenophores (Ctenophora), sponges (Porifera), and Trichoplax (Placozoa). Recent phylogenetic analyses suggest the existence of the supergroup Opisthokonta, which includes animals, fungi, and choanoflagellates (Adl et al. 2005). R2 has not been found in any species of fungi, although many fungal genomes have been completely sequenced. One possible explanation is that R2 appeared after the split between animals and fungi; the other is that R2 became extinct in the fungal common ancestor. At present, genome sequencing projects for the placozoan Trichoplax adhaerens, the sponge Reniera, and the choanoflagellate Monosiga ovata are ongoing. Searching for R2 in such opisthokont organisms will provide information on the time of origin and diversification of R2.


    Supplementary Material
Supplementary tables 1 and 2 and the sequence alignments for phylogenetic analyses are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).


    Acknowledgements
This work was supported by Research Fellowships for Young Scientists of the Japan Society for the Promotion of Science and by a Grant-in-Aid for Scientific Research on Priority Areas "Comparative Genomics" from the Ministry of Education, Culture, Sports, Science and Technology of Japan.


    Footnotes
 
Billie Swalla, Associate Editor


    References

    Adl SM, Simpson AG, Farmer MA, et al. (28 co-authors). 2005. The new higher level classification of eukaryotes with emphasis on the taxonomy of protists. J Eukaryot Microbiol 52:399–451.

    Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–402.

    Ben Ali A, Wuyts J, De Wachter R, Meyer A, Van de Peer Y. 1999. Construction of a variability map for eukaryotic large subunit ribosomal RNA. Nucleic Acids Res 27:2825–31.

    Besansky NJ, Paskewitz SM, Hamm DM, Collins FH. 1992. Distinct families of site-specific retrotransposons occupy identical positions in the rRNA genes of Anopheles gambiae. Mol Cell Biol 12:5102–10.

    Burke WD, Calalang CC, Eickbush TH. 1987. The site-specific ribosomal insertion element type II of Bombyx mori (R2Bm) contains the coding sequence for a reverse transcriptase-like enzyme. Mol Cell Biol 7:2221–30.

    Burke WD, Malik HS, Jones JP, Eickbush TH. 1999. The domain structure and retrotransposition mechanism of R2 elements are conserved throughout arthropods. Mol Biol Evol 16:502–11.

    Burke WD, Malik HS, Lathe WC III, Eickbush TH. 1998. Are retrotransposons long-term hitchhikers? Nature 392:141–2.

    Burke WD, Müller F, Eickbush TH. 1995. R4, a non-LTR retrotransposon specific to the large subunit rRNA genes of nematodes. Nucleic Acids Res 23:4628–34.

    Burke WD, Singh D, Eickbush TH. 2003. R5 retrotransposons insert into a family of infrequently transcribed 28S rRNA genes of planaria. Mol Biol Evol 20:1260–70.

    Christensen SM, Eickbush TH. 2005. R2 target-primed reverse transcription: ordered cleavage and polymerization steps by protein subunits asymmetrically bound to the target DNA. Mol Cell Biol 25:6617–28.

    DeMarco R, Machado AA, Bisson-Filho AW, Verjovski-Almeida S. 2005. Identification of 18 new transcribed retrotransposons in Schistosoma mansoni. Biochem Biophys Res Commun 333:230–40.

    Eickbush DG, Eickbush TH. 1995. Vertical transmission of the retrotransposable elements R1 and R2 during the evolution of the Drosophila melanogaster species subgroup. Genetics 139:671–84.

    Feng Q, Schumann G, Boeke JD. 1998. Retrotransposon R1Bm endonuclease cleaves the target sequence. Proc Natl Acad Sci USA 95:2083–8.

    Haugen P, Simon DM, Bhattacharya D. 2005. The natural history of group I introns. Trends Genet 21:111–9.

    Jobb G, von Haeseler A, Strimmer K. 2004. TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics. BMC Evol Biol 4:18.

    Katoh K, Kuma K, Toh H, Miyata T. 2005. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res 33:511–8.

    Keane TM, Creevey CJ, Pentony MM, Naughton TJ, McInerney JO. 2006. Assessment of methods for amino acid matrix selection and their use on empirical data shows that ad hoc assumptions for choice of matrix are not justified. BMC Evol Biol 6:29.

    Kojima KK, Fujiwara H. 2003. Evolution of target specificity in R1 clade non-LTR retrotransposons. Mol Biol Evol 20:351–61.

    Kojima KK, Fujiwara H. 2004. Cross-genome screening of novel sequence-specific non-LTR retrotransposons: various multicopy RNA genes and microsatellites are selected as targets. Mol Biol Evol 21:207–17.

    Kojima KK, Fujiwara H. 2005a. An extraordinary retrotransposon family encoding dual endonucleases. Genome Res 15:1106–17.

    Kojima KK, Fujiwara H. 2005b. Long-term inheritance of the 28S rDNA-specific retrotransposon R2. Mol Biol Evol 22:2157–65.

    Kumar S, Tamura K, Nei M. 2004. MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief Bioinform 5:150–63.

    Luan DD, Korman MH, Jakubczak JL, Eickbush TH. 1993. Reverse transcription of R2Bm RNA is primed by a nick at the chromosomal target site: a mechanism for non-LTR retrotransposition. Cell 72:595–605.

    Malik HS, Burke WD, Eickbush TH. 1999. The age and evolution of non-LTR retrotransposable elements. Mol Biol Evol 16:793–805.

    Ronquist F, Huelsenbeck JP. 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19:1572–4.

    Van Dellen K, Field J, Wang Z, Loftus B, Samuelson J. 2002. Non-LTR retrotransposons and SINE-like elements of the protist Entamoeba histolytica. Gene 297:229–39.

    Van de Peer Y, Jansen J, De Rijk P, De Wachter R. 1997. Database on the structure of small ribosomal subunit RNA. Nucleic Acids Res 25:111–6.

    Xiong YE, Eickbush TH. 1988. The site-specific ribosomal DNA insertion element R1Bm belongs to a class of non-long-terminal-repeat retrotransposons. Mol Cell Biol 8:114–23.

    Zingler N, Willhoeft U, Brose HP, Schoder V, Jahns T, Hanschmann KM, Morrish TA, Lower J, Schumann GG. 2005. Analysis of 5' junctions of human LINE-1 and Alu retrotransposons suggests an alternative model for 5'-end attachment requiring microhomology-mediated end-joining. Genome Res 15:780–9.

Accepted for publication July 18, 2006.

