Molecular Biology and Evolution 18:196-205 (2001)
© 2001 Society for Molecular Biology and Evolution
Identification of Waldo-A and Waldo-B, Two Closely Related Non-LTR Retrotransposons in Drosophila
*Institut de Génétique Humaine, Centre National de la Recherche Scientifique, Montpellier, France; and
Institute of Cytology and Genetics, Novosibirsk, Russia
| Abstract |
|---|
We have identified two novel, closely related subfamilies of non-long-terminal-repeat (non-LTR) retrotransposons in Drosophila melanogaster, the Waldo-A and Waldo-B subfamilies, that are in the same lineage as site-specific non-LTR retrotransposons of the R1 clade. Both contain potentially active copies with two large open reading frames, having coding capacities for a nucleoprotein as well as endonuclease and reverse transcriptase activities. Many copies are truncated at the 5' end, and most are surrounded by target site duplications of variable lengths. Elements of both subfamilies have a nonrandom distribution in the genome, often being inserted within or very close to (CA)n arrays. At the DNA level, the longest elements of Waldo-A and Waldo-B are 69% identical over their entire length, except for the 5' untranslated regions, which have a mosaic organization, suggesting that one arose from the other following new promoter acquisition. This event occurred before the speciation of the D. melanogaster subgroup of species, since both Waldo-A and Waldo-B coexist in other species of this subgroup.
| Introduction |
|---|
Non-long-terminal-repeat (non-LTR) retrotransposons are an almost constant component of eukaryotic genomes. A recent extensive phylogenetic study of the endonuclease and reverse transcriptase domains of many non-LTR retrotransposons allowed the investigators to distinguish 11 distinct clades (Malik, Burke, and Eickbush 1999
I elements and elements from the Jockey clade, except TART, insert at random locations, although they show a marked preference for AT-rich sites. TART elements insert preferentially at the ends of the chromosomes, where they are found associated with HeT-A elements, playing the role of telomeres (Levis et al. 1993). R1 and R2 elements are site-specific and are mostly found inserted at the same positions within the 28S rDNA genes (Xiong and Eickbush 1988b; Jakubczak, Xiong, and Eickbush 1990).
Retrotransposition of these elements is believed to start with the synthesis of a full-length transcript that may serve both as the messenger for protein translation and as the transposition intermediate (Chaboissier et al. 1990). The full-length transcript is produced from an internal promoter located within the 5' untranslated region (UTR) as was shown for Jockey (Mizrokhi, Georgieva, and Ilyin 1988), I (McLean, Bucheton, and Finnegan 1993), Doc, and F elements (Minchiotti and Di Nocera 1991; Contursi, Minchiotti, and Di Nocera 1995). In the case of R2Bm in Bombyx mori, reverse transcription initiates at the site of integration, using as a primer the 3' OH end of the target DNA that was liberated after cleavage by the endonuclease (Luan et al. 1993). This target-primed reverse transcription (TPRT) model is currently suggested for other non-LTR retrotransposons. Non-LTR retrotransposons often tend to lose their 5' ends upon transposition due to incomplete reverse transcription.
Until recently, the only means of recovering new transposons in Drosophila were either to wait until a serendipitous study of a spontaneous mutation revealed a new insertion or to perform PCR experiments with degenerate oligonucleotides designed from conserved sites such as the reverse transcriptase domains of non-LTR retrotransposons. The genome of D. melanogaster is now almost entirely sequenced (Adams et al. 2000), and this, along with the availability of powerful tools allowing very rapid sequence searches and analyses, greatly facilitates the identification of new transposable elements. Here, we describe two closely related new subfamilies of non-LTR retrotransposons in D. melanogaster, the Waldo-A and Waldo-B subfamilies.
| Materials and Methods |
|---|
Sequence Analyses
Searches for matches of amino acid sequences in the nonredundant database were done at the National Center for Biotechnology Information using BLASTP, version 2.0.12 (Altschul et al. 1997
Reverse transcriptase and apurinic/apyrimidinic endonuclease domains of Waldo-A and Waldo-B elements were aligned to previously established alignments DS36752 and DS36736 (Malik, Burke, and Eickbush 1999) by the hmmalign program from the HMMER package, version 2.1.1 (http://hmmer.wustl.edu). Reconstruction of neighbor-joining phylogenetic trees and bootstrap analysis were carried out with MEGA, version 1.02 (Kumar, Tamura, and Nei 1993).
Protein sequences of the regions of ORF1 containing CCHC motifs in non-LTR retrotransposons were aligned with the Multialin program (http://www.toulouse.inra.fr) (Corpet 1988) and shaded with Boxshade, version 3.21 (http://www.ch.embnet.org).
PCR Amplifications
PCR amplifications were performed using standard conditions with Taq DNA polymerase (Promega). The Waldo ORF PCR fragment was amplified from clone J1 DNA using primers ww1 (5'-AGGTGGACAGAAACCACTCGACGGG-3') and ww2 (5'-CCTCCTTAGCTTTTTGGTAACAAGC-3'). The probes used in Southern blot hybridizations were amplified from genomic DNA from the Cha strain using primers Ber1up (5'-CGAGAGACAAAGGGCATAGCTTCC-3') and Ber1do (5'-GTTGCTGATCGATGCCCATAGCCG-3') for PCR Waldo-A ORF1, Ber2up (5'-TGTTATAAAAGCAGTGGCGCTGGG-3') and Ber2do (5'-CGCCACTCCGCATTAGGCTGAGAG-3') for PCR Waldo-A ORF2, 341up (5'-AGCGAGGATAGGGGCGTGCTAGTG-3') and 341do (5'-TAGGGAGCTCGGTGGCCGAATTCG-3') for PCR Waldo-B ORF1, and 342up (5'-GATCGTCAAAGCAGCCGCCACCGC-3') and 342do (5'-ATCCCTCACTCGCAAGGTATTTGC-3') for PCR Waldo-B ORF2.
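As a quick sanity check on oligonucleotides such as those listed above, the following minimal sketch reports primer length, GC fraction, and reverse complement. The four sequences are copied from the text; the function names and the choice of checks are illustrative assumptions and are not part of the original protocol.

```python
# Primer sequences copied from the text (written 5'->3').
# The helper names below are hypothetical, chosen for illustration only.
PRIMERS = {
    "ww1": "AGGTGGACAGAAACCACTCGACGGG",
    "ww2": "CCTCCTTAGCTTTTTGGTAACAAGC",
    "Ber1up": "CGAGAGACAAAGGGCATAGCTTCC",
    "Ber1do": "GTTGCTGATCGATGCCCATAGCCG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.0%}, "
              f"reverse complement = {reverse_complement(seq)}")
```

The reverse complement is useful when locating the binding site of a downstream primer on the strand as deposited in a database.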
Southern Blots
Digestion of genomic DNA, gel electrophoresis, transfer onto Nytran N nylon membranes (Schleicher and Schuell), and hybridization with 32P-labeled DNA probes were performed following standard procedures (Sambrook, Fritsch, and Maniatis 1989) and suppliers' specifications. Hybridizations were carried out overnight at 42°C in 50% formamide. Washes were in 2× SSC, 0.1% SDS, followed by 0.1× SSC, 0.1% SDS at 42°C.
Inverse PCR
After digestion with SspI that did not cut within known Waldo elements, genomic DNA from flies of the Cha strain was self-ligated and amplified (35 cycles of 94°C for 45 s, 50°C for 45 s, and 72°C for 2 min) with Taq DNA polymerase (Promega) using primers Rw172 (5'-ACCTTGACTGGCAGTCCCGGTGAGC-3') and Berdo87 (5'-GCTCTACTGTCGCAACACAACACTG-3') specific for Waldo-A elements or primers Rw172 and 34do76 (5'-TGCAGTTTACGGCTGACCGGACTCG-3') specific for Waldo-B elements. PCR fragments were cloned using the PCR-Script Amp cloning kit (Stratagene) and sequenced using standard procedures by Genome Express S.A.
| Results |
|---|
Identification of an Endonuclease Domain Encoded by a Repeated Mobile Sequence
The starting point of the present work was the identification, within the D. melanogaster genomic clone containing the Jockey J1 element (Priimagi, Mizrokhi, and Ilyin 1988
Two New Subfamilies of Non-LTR Retrotransposons
The DNA sequence of the Waldo ORF was used for a BLAST search in the genome sequences released by the BDGP. Two categories of sequences homologous to the Waldo ORF and with coding capacities came out of this search. The first category of sequences, contained in AC005734, AC006563, and AC007575, were 100% identical to the Waldo ORF; they were designated Waldo-A sequences. The second category of sequences, contained in AC005847 and AC004349, showed 69% similarity to the Waldo ORF; they were designated Waldo-B sequences. Further studies, presented below, showed that these sequences specified two distinct but closely related subfamilies of non-LTR retrotransposons.
The Waldo-A Subfamily
The sequences surrounding the Waldo ORF homology present in AC005734, AC006563, and AC007575 were analyzed using DNA Strider. Their organization was typical of that of non-LTR retrotransposons, with two large ORFs of 1,497 and 2,964 bp (fig. 2). Conceptual translation of these ORFs indicated that the first ORF may encode a protein of 498 amino acids containing three CCHC motifs, and the second ORF may encode a protein of 987 amino acids containing endonuclease and reverse transcriptase domains analogous to those of other non-LTR retrotransposons, as well as one CCHC motif. No RNaseH domain could be identified. The two ORFs overlap by 24 bp. The Waldo-A element in AC006563 has a stop codon in frame at position 1093 of ORF1 and is presumably unable to encode a full-length product. The 3' UTR is 328 bp and terminates with a polyA stretch. Putative target site duplications (TSDs) were identified at the ends of each element. For simplicity, sequences lying between the putative 5' TSD and the first ATG of ORF1 will be referred to as the 5' UTR of the element considered.
The 3'-most 400 bp (excluding the polyA stretch) of the Waldo-A element was used for a BLAST search in the genomic sequences released by the BDGP. This allowed us to recover several other copies of the Waldo-A element that were variously truncated at the 5' end. They all terminated at the 3' end with a polyA stretch and were surrounded by putative TSDs. The sequence organizations of those in AC007818, AC007669, AC007356, AC005430, AC007147, and AC004251 are shown in figure 2. All of these elements were present within clones that were localized at dispersed sites on chromosomal arms (Hartl et al. 1994
Our BLAST searches for Waldo-A-related sequences also identified, in addition to the elements presented in figure 2, a number of variously degenerated short elements with 75%–93% sequence similarity, usually rearranged and devoid of coding capacities (data not shown). Some of them were located in region 41–43 of chromosome II and were associated with other defective, rearranged copies of other known retrotransposons, either LTR or non-LTR (data not shown).
The Waldo-B Subfamily
Analyses of the Waldo-B elements were conducted in the same way as analyses for the Waldo-A elements. Their sequence organizations were very similar (fig. 2). Two long Waldo-B elements were identified within AC005847 and AC004349. They contained two ORFs of 1,467 and 2,970 bp that overlapped by 20 bp. These two ORFs may encode proteins of 488 and 989 amino acids with the same domain organization as in Waldo-A elements. BLAST searches of BDGP sequences with the 3'-most 400 bp of the Waldo-B element in AC005847 allowed us to recover one 5' truncated copy of Waldo-B, present in AC007851. Other copies of Waldo-B were identified in this search but were not included in this study because their complete sequences were not available, and we did not know whether they were complete or truncated at the 5' end. The Waldo-B element present in AC004349 has a deletion of the 3' end starting 6 bp before the polyA sequence. It is not flanked by TSDs, suggesting that the 3' end of the element was deleted after insertion. The 3' UTRs of the Waldo-B elements in AC005847 and AC007851 are 302 bp long and terminate with a polyA stretch, and putative short TSDs could be identified. All of these elements were present within clones that were localized at dispersed sites on chromosomal arms (Hartl et al. 1994; Hoskins et al. 2000). As in the case of Waldo-A, we also identified variously degenerated Waldo-B elements associated with other defective, rearranged copies of other known retrotransposons (data not shown).
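The ORF lengths and predicted protein sizes reported for the two subfamilies can be cross-checked with simple arithmetic. The sketch below assumes that each reported ORF length includes the terminal stop codon, which is an interpretation rather than something the text states; under that assumption all four reported values agree. The function name is hypothetical.

```python
def orf_protein_length(orf_bp: int) -> int:
    """Amino acids encoded by an ORF whose length (in bp) includes the stop codon.

    Assumption: the reported ORF length runs from the first ATG through the
    stop codon, so one codon is subtracted.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return orf_bp // 3 - 1  # drop the stop codon


# (ORF length in bp, reported protein length in amino acids), from the text.
REPORTED = {
    ("Waldo-A", "ORF1"): (1497, 498),
    ("Waldo-A", "ORF2"): (2964, 987),
    ("Waldo-B", "ORF1"): (1467, 488),
    ("Waldo-B", "ORF2"): (2970, 989),
}

if __name__ == "__main__":
    for (element, orf), (bp, aa) in REPORTED.items():
        print(f"{element} {orf}: {bp} bp -> {orf_protein_length(bp)} aa "
              f"(reported {aa})")
```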
The Waldo elements described in this work were all retrieved from the sequences released by the BDGP before the publication of the complete sequence of the Drosophila genome (Release 1) by Adams et al. (2000). We also searched for Waldo elements in the sequences from Release 1. As expected, we found in Release 1 all of the Waldo copies that we had identified in the sequences released by the BDGP, along with a few more copies of Waldo-A and Waldo-B elements. However, none of the Waldo-A or Waldo-B elements found in the sequences of Release 1 are capable of encoding complete products of ORF1 or ORF2. In fact, all of the sequences that could be found in both the BDGP and the Release 1 databases contain several differences in the bodies of the Waldo elements, whereas the surrounding sequences are identical. This is assumed to be due to the high level of errors within repetitive sequences in Release 1 (Myers et al. 2000; see also http://www.celera.com/genomeanalysis/ and http://www.fruitfly.org/sequence/faq.html). Therefore, Waldo-A and Waldo-B elements that were found in sequences from Release 1 were not included in the present study.
The 5' UTRs of Waldo Elements
The sequence organization of the 5' UTRs of the three longer Waldo-A elements (in AC006563, AC005734, and AC007575) and of the Waldo-B element in AC005847 is shown in figure 3a.
The 5' UTR of the longest Waldo-A element (in AC006563) is 615 bp long. The 5' UTRs of the other two Waldo-A elements are truncated at the 5' ends, and the Waldo-A element in AC007575 also has an internal deletion between nucleotides -222 and -440. The putative 5' UTR of the Waldo-B element in AC005847 is 434 bp long and contains, between nucleotides -129 and -246, a short region of similarity (71%) with the sequences lying between nucleotides -287 and -404 in the Waldo-A element in AC006563. However, outside this short region there are no similarities between the putative 5' UTRs of Waldo-A and Waldo-B elements. Minchiotti, Contursi, and Di Nocera (1997) have identified an 18-bp-long consensus sequence that is located around 20 nt from the transcription start in the 5' UTR of several non-LTR retrotransposons and that is required for proper transcription initiation. We searched within the 5' UTR of Waldo-A and Waldo-B for the presence of this consensus sequence. It appeared that the longest Waldo-A element (AC006563) contained, starting 20 nt from its 5' end (position -595), sequences that matched this consensus well (fig. 3b). It is therefore likely that this Waldo-A element is full length. No such sequences were identified in Waldo-B.
The putative 5' UTRs of Waldo-A in AC006563 and of Waldo-B in AC005847 were used in BLAST searches of the expressed sequence tags (ESTs) of the BDGP. No EST similar to the Waldo-A 5' UTR was found. Three ESTs similar to the Waldo-B 5' UTR were found, corresponding to two cDNAs obtained with RNAs extracted from larvae and early pupae (clone LP01280) and from head, brain, and sensory organs (clone HL01331). These two cDNAs contain both Waldo-B sequences and 5' adjacent sequences and therefore probably correspond to read-through transcripts.
Genomic Organization of Waldo-A and Waldo-B Elements
We designed four PCR probes using relevant oligonucleotides, each of them specific to ORF1 or ORF2 of each element (fig. 2). These probes were hybridized under high stringency to Southern blots of genomic DNA from four strains of D. melanogaster digested with either SmaI or SalI/NcoI (fig. 4). SmaI cuts both elements once, while SalI/NcoI digests should release two internal fragments containing either ORF1 or ORF2 from both elements. Under these conditions, each probe revealed a specific set of fragments, indicating that the Waldo-A and Waldo-B elements can be distinguished by hybridization. When the DNAs were digested with SmaI, many fragments hybridized with all probes, confirming that several copies of Waldo-A and Waldo-B elements are present in all four tested strains. The patterns of hybridization with each probe were largely similar, but some bands differed from strain to strain, as expected for mobile elements. When the DNAs were digested with SalI/NcoI, bands corresponding to internal fragments of the expected sizes were revealed with intense signals in all cases, in addition to several other bands giving a weaker signal and corresponding to higher- and lower-molecular-weight fragments. These results indicate that both the Waldo-A and the Waldo-B subfamilies comprise several potentially full-length elements containing both ORF1 and ORF2, along with a number of defective elements. This is in agreement with the findings of our BLAST searches.
Nonrandom Distribution of Waldo-A and Waldo-B Elements
Strikingly, the 5' ends of many copies of Waldo-A (4/9) and Waldo-B (2/3) that we found in the BDGP sequences were located near (CA)n repeats or inserted within such sequences (fig. 5). Moreover, the original Waldo ORF was found very close to a long stretch of (CA)n. Other known non-LTR retrotransposons (I, F, Jockey, Doc, FB elements) are seldom located near such sequences. In order to verify whether this observation reflects a property of Waldo-A and Waldo-B elements, we identified by inverse PCR and sequenced the ends and adjacent DNA of some long copies of these elements present in the Cha strain of D. melanogaster. We thus recovered sequences from the ends of four Waldo-A elements and two Waldo-B elements (fig. 5). Some of these elements indeed appeared to be located close to (CA)n sequences, as we expected. Some others were not associated with (CA)n sequences, but with other kinds of repeats: (TTTACACA)n in the case of CBE2 and (CAACA)n in the case of C21. Finally, three elements, CBA4, CBE2, and C16C24, did not seem to be associated with any kind of repeated sequences. Therefore, the Waldo-A and Waldo-B elements certainly show a strong tendency to insert very close to microsatellite sequences mostly of the kind (CA)n, but this is not a mandatory rule.
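Flanking sequences can be screened for (CA)n arrays with a simple pattern search. The sketch below is an illustration only: the minimum repeat number is an arbitrary assumption (the text does not define a threshold), the function name is hypothetical, and the test sequence is a toy, not a real Waldo flank.

```python
import re


def find_repeat_arrays(seq: str, unit: str = "CA", min_repeats: int = 5):
    """Locate tandem arrays of `unit` repeated at least `min_repeats` times.

    Returns a list of (start, end, repeat_count) tuples using 0-based,
    end-exclusive coordinates. The threshold is an illustrative assumption.
    """
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_repeats},}}")
    return [
        (m.start(), m.end(), (m.end() - m.start()) // len(unit))
        for m in pattern.finditer(seq.upper())
    ]


if __name__ == "__main__":
    # Toy flanking sequence containing a (CA)8 array.
    flank = "GGTTAG" + "CA" * 8 + "TTGACC"
    print(find_repeat_arrays(flank))             # (CA)n on the given strand
    print(find_repeat_arrays(flank, unit="TG"))  # (CA)n on the opposite strand
```

Because a (CA)n array reads as (TG)n on the complementary strand, both units need to be searched when the orientation of the flank is unknown. Other motifs mentioned above, such as (CAACA)n, can be searched by changing `unit`.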
Sequences adjacent to the Waldo elements recovered by inverse PCR were used for BLAST searches of the Drosophila genome sequence (Adams et al. 2000). This allowed us to identify the empty sites for the copies of Waldo-A in CBA4, C8, and C21 (not shown), indicating that these are recently transposed copies.
Relationship Between Waldo-A, Waldo-B, and Other Non-LTR Retrotransposons
Waldo-A and Waldo-B are 69% similar to each other at the DNA level for their entire length except for their 5' UTRs. The putative protein products of ORF1 and ORF2 are 62.3% and 66.8% similar at the amino acid level, respectively, between Waldo-A and Waldo-B.
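For readers who wish to reproduce this kind of figure, a pairwise percentage can be computed from two aligned sequences as sketched below. The text does not state how the reported values were calculated, so the treatment of gaps here (columns with a gap in either sequence are skipped) is an assumption, and the function counts strictly identical positions rather than conservative substitutions. The function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical positions between two aligned sequences.

    Assumption: columns with a gap ('-') in either sequence are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment: 9 ungapped columns, 8 of them identical.
    print(f"{percent_identity('ACGT-ACGTA', 'ACGTTACCTA'):.1f}%")
```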
The reverse transcriptase and endonuclease domains of Waldo-A and Waldo-B were compared with the alignments of Malik, Burke, and Eickbush (1999). Both domains were found to be related to those of elements of the R1 clade (fig. 6). This clade comprises site-specific non-LTR retrotransposons RT1 and RT2 in Anopheles gambiae (Besansky et al. 1992), SART1 (Takahashi, Okazaki, and Fujiwara 1997) and TRAS1 (Okazaki, Ishikawa, and Fujiwara 1995) in Bombyx mori, and R1 in insects. On the reverse transcriptase tree (fig. 6a) Waldo and RT elements are grouped together, whereas SART1 is an external branch. On the endonuclease tree (fig. 6b), RT elements group with SART1 with high confidence, but the position of Waldo elements is uncertain and the Waldo branch appears equally distant from all other site-specific elements in the clade.
Proteins encoded by ORF1 of non-LTR retrotransposons are usually much more divergent than those encoded by ORF2. The only domain that can be recognized within many ORF1 products is a short region containing three zinc finger-like motifs of the CCHC type. Comparison of these domains between several non-LTR retrotransposons emphasized the close relationship between Waldo and other elements from the R1 clade (fig. 7). The CCHC domain of the I element from D. melanogaster added to the study is more divergent.
Coexistence of the Waldo-A and Waldo-B Subfamilies in Other Species of the D. melanogaster Subgroup
Given the strong similarity of Waldo-A and Waldo-B, they appear to be closely related, and they obviously define two subfamilies deriving from the same original non-LTR retrotransposon. As a preliminary attempt to trace their history and date the divergence, we looked for their presence in other Drosophila species from the D. melanogaster subgroup: three strains of D. melanogaster, two strains of D. simulans, and one strain each of D. mauritiana, D. teissieri, and D. yakuba. One strain of D. virilis belonging to a distant group was also added to the study. Genomic DNAs were digested with NcoI, which should release an internal fragment of 2,236 bp hybridizing with probe Waldo-A ORF2 and an internal fragment of 1,930 bp hybridizing with probe Waldo-B ORF2. One of the two NcoI sites lies within a 26-bp sequence perfectly conserved between the Waldo-A and Waldo-B elements. Southern blots are shown in figure 8. With Waldo-A, a band of 2.2 kb was revealed with an intense hybridization signal in D. melanogaster, D. simulans, and D. mauritiana, in addition to many other bands with weaker signal. The overall intensities of the signal were very similar in D. melanogaster and the sibling species D. simulans and D. mauritiana. Sequences similar to the Waldo-A probe were also distinguishable in D. teissieri and D. yakuba as bands of a weaker intensity, but not in D. virilis. With Waldo-B, a band of 1.9 kb with an intense hybridization signal was revealed only in D. melanogaster. Several bands heterogeneous in size were revealed in D. simulans and D. mauritiana, but their intensity was much weaker than that of those seen in D. melanogaster. In D. teissieri and D. yakuba, the signal was even weaker. There were no detectable signals in D. virilis. The patterns of hybridization with the Waldo-A ORF2 probe and with the Waldo-B ORF2 probe never overlapped, indicating that both subfamilies coexist in all species tested. Therefore, the divergence between the Waldo-A and the Waldo-B elements occurred before the emergence of these species.
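The expected size of an internal restriction fragment follows directly from the positions of the enzyme sites in the element sequence. The sketch below performs an in silico NcoI digest (recognition site CCATGG, cut after the first C) on a linear sequence. The toy sequence is hypothetical and stands in for a real Waldo sequence, which is not reproduced here; the function name is likewise illustrative.

```python
NCOI_SITE = "CCATGG"   # NcoI recognition sequence (palindromic)
NCOI_CUT_OFFSET = 1    # NcoI cuts after the first C: C^CATGG


def digest_fragment_lengths(seq: str, site: str = NCOI_SITE,
                            cut_offset: int = NCOI_CUT_OFFSET) -> list[int]:
    """Return fragment lengths, 5' to 3', after complete digestion of a linear DNA."""
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Toy element with two NcoI sites; the middle value is the internal fragment.
    toy = "A" * 10 + "CCATGG" + "T" * 20 + "CCATGG" + "G" * 5
    print(digest_fragment_lengths(toy))
```

Applied to a full-length element, the internal fragment is the one bounded by the two NcoI sites, and its length equals the distance between the two site positions.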
| Discussion |
|---|
Waldo-A and Waldo-B are two previously undescribed transposable elements found in D. melanogaster. Sequence comparisons showed that they are very similar and represent two closely related subfamilies. Phylogenetic analyses of the endonuclease and reverse transcriptase domains of the product of ORF2 indicated that they belong to the R1 clade defined by Malik, Burke, and Eickbush (1999)
Copies of Waldo-A and Waldo-B are often inserted near or within repeats such as (CA)n or related sequences. This might be the reason why they were not discovered before, since these repeats are rarely found within genes. The frequent association between Waldo-A and Waldo-B elements and microsatellite-like sequences indicates a nonrandom distribution of Waldo elements in the genome. This might reflect a preference of integration of Waldo elements, not at the sequence level but possibly by interaction with some higher-order chromatin structures determined by microsatellite regions. Alternatively, this could result from a better conservation of the elements that integrated in these types of regions than of those that integrated elsewhere. However, in this case, one would expect degenerated elements to be associated with microsatellite-like sequences as well, and this does not appear to be the case. Noticeably, other non-LTR retrotransposons of Drosophila are not found preferentially associated with repeated DNA, so the distribution pattern of Waldo elements appears specific to this family.
The available data bring very little insight into the frequency of retrotransposition of Waldo-A and Waldo-B elements. Southern blots revealed that some genomic restriction fragments containing Waldo sequences are variable from strain to strain, suggesting that Waldo elements have recently transposed. In addition, inverse PCR analyses of Waldo elements in the Cha strain identified copies inserted within genomic sequences that are found empty in the strain that was used for sequencing by Adams et al. (2000). These observations indicate that the Waldo elements are capable of transposition. However, the intensity of their transpositional activity is difficult to estimate, although it is probably not very high. Since the production of a full-length transcript is a prerequisite for mobility of non-LTR retrotransposons, it would be of interest to determine whether Waldo-A and Waldo-B are transcriptionally active. Searches for ESTs did not allow us to identify such a candidate, but this might be because their transcription is restricted to particular tissues. In general, non-LTR retrotransposons are transcribed from an internal promoter located within their 5' UTRs. We were able to identify within the Waldo-A 5' UTR some sequences matching a consensus found in the promoter of other non-LTR retrotransposons (Minchiotti, Contursi, and Di Nocera 1997). It therefore seems reasonable to speculate that Waldo also uses an internal promoter located within the 5' UTR. Further work, including Northern and RT-PCR analyses, will be necessary to address the question of the transcriptional activity of Waldo. However, such studies might not be very informative with regard to retrotranspositional activity. Among D. melanogaster non-LTR retrotransposons, the I factor is the only one for which a strong correlation between transcription and retrotransposition has been established (Chaboissier et al. 1990; McLean, Bucheton, and Finnegan 1993). By contrast, Jockey, F, and Doc are actively transcribed in various tissues (Mizrokhi, Georgieva, and Ilyin 1988; Minchiotti et al. 1994; Zhao and Bownes 1998) but undergo extremely low levels of retrotransposition.
Waldo-A and Waldo-B represent two closely related subfamilies that coexist within the same species. This situation is reminiscent of that of L1 elements in some mammals. In the mouse, several subfamilies of L1 elements coexist. Two of them, the A and TF subfamilies, contain retrotranspositionally active members (DeBernardinis et al. 1998; Naas et al. 1998). Full-length copies of the A and TF subfamilies are very similar, except within their 5' UTRs, which consist of several monomeric repeats retaining promoter activity (Severynse, Hutchison, and Edgell 1992; DeBernardinis and Kazazian 1999). Monomeric sequences of the TF-type L1 5' UTR are different from those of the A type, and therefore the two subfamilies are under distinct transcriptional control. It is believed that a new subfamily of mouse L1 may be formed following occasional capture of a new 5' UTR with promoter activity (Adey et al. 1994). The same could be true for the Waldo-A and Waldo-B subfamilies, with one having derived from the other after accidental acquisition of a new promoter. This probably resulted from complex events, given the mosaic structures of the 5' UTRs, which contain a short (about 100 bp) region of similarity, surrounded by unrelated blocks. Possibly, the region that is conserved between the two 5' UTRs might contain some sequences that are important for the activity or regulation of the elements. This event can be dated to before the formation of the species of the D. melanogaster subgroup; since Waldo-A and Waldo-B coexist in all tested species from this subgroup, it is likely that they were both present within their common ancestor.
All non-LTR retrotransposon families that are currently known in D. melanogaster are old components of the genome that are also found in sibling species. Some of them, like R1 and R2, are common to all dipterans (Jakubczak, Burke, and Eickbush 1991) and are also found outside (Malik, Burke, and Eickbush 1999). At least one case of loss of a functional element followed by reinvasion has been documented: the I factor, which existed in the common ancestor of the D. melanogaster subgroup, was apparently lost in D. melanogaster and very efficiently reinvaded the species in the middle of the twentieth century (Bucheton et al. 1986, 1992; Sezutsu, Nitasaka, and Yamazaki 1995). Studies of the I factor family and, to a lesser extent, of other families of non-site-specific, non-LTR retrotransposons have revealed that recently integrated copies (full-length or 5'-truncated) are found mostly in euchromatic sites, whereas defective, rearranged, inactive copies, corresponding to old components of the genome, have accumulated in the pericentromeric heterochromatic regions (Crozatier et al. 1988; Simonelig et al. 1988; Vaury, Bucheton, and Pélisson 1989; Pimpinelli et al. 1995). All Waldo-A and Waldo-B elements shown in figure 2 map to euchromatic sites. They are more than 99.5% similar within each subfamily, variably truncated at the 5' end, and, except for Waldo-B in AC004349, surrounded by target site duplications. Therefore, they most likely correspond to recently transposed copies. Our BLAST searches also identified more divergent elements, variously mutated and deleted, with many of them being present in clones for which no specific chromosomal location could be assigned by the BDGP. These degenerated elements could very well correspond to pericentromeric copies. It is interesting to note that these degenerated elements also fall into two subfamilies, one more related to Waldo-A and the other more related to Waldo-B. Determination of the sequences of the elements located in pericentromeric heterochromatin would allow thorough studies of these elements and bring insight into the evolutionary history of the Waldo-A and Waldo-B subfamilies.
| Supplementary Material |
|---|
The sequences reported in this paper have been deposited in the GenBank database (accession numbers AF281636–AF281649).
| Note Added in Proof |
|---|
The pilger non-LTR retrotransposon (GenBank accession number AJ278684) corresponds to the Waldo-B element in AC005847.
| Acknowledgements |
|---|
This paper is dedicated to the memory of Laurent "Doone" Caron, who inspired the name "Waldo." We thank Matthieu Seveau for his help in the study of Waldo elements in the Cha strain, and Christophe Terzian for critical reading of the manuscript. This work was supported by grants from the Centre National de la Recherche Scientifique (CNRS) and from the Association pour la Recherche sur le Cancer (ARC).
| Footnotes |
|---|
Pierre Capy, Reviewing Editor
1 Abbreviations: ORF, open reading frame; TSD, target site duplication; UTR, untranslated region.
2 Keywords: Drosophila, non-LTR retrotransposon, microsatellite, phylogeny
3 Address for correspondence and reprints: Isabelle Busseau, Institut de Génétique Humaine, Centre National de la Recherche Scientifique, 141 rue de la Cardonille, 34396 Montpellier cedex 05, France. E-mail: busseau@igh.cnrs.fr
| Literature Cited |
|---|
Abad, P., C. Vaury, A. Pélisson, M.-C. Chaboissier, I. Busseau, and A. Bucheton. 1989. A long interspersed repetitive element--the I factor of Drosophila teissieri--is able to transpose in different Drosophila species. Proc. Natl. Acad. Sci. USA 86:8887–8891.
Adams, M. D., S. E. Celniker, R. A. Holt et al. (195 co-authors). 2000. The genome sequence of Drosophila melanogaster. Science 287:2185–2195.
Adey, N. B., T. O. Tollefsbol, A. B. Sparks, M. H. Edgell, and C. A. Hutchison III. 1994. Molecular resurrection of an extinct ancestral promoter for mouse L1. Proc. Natl. Acad. Sci. USA 91:1569–1573.
Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403–410.
Altschul, S. F., T. L. Madden, A. A. Schäffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389–3402.
Besansky, N. J., S. M. Paskewitz, D. M. Mills-Hamm, and F. H. Collins. 1992. Distinct families of site-specific retroposons occupy identical positions in the rRNA genes of Anopheles gambiae. Mol. Cell. Biol. 12:5102–5110.
Bucheton, A., M. Simonelig, C. Vaury, and M. Crozatier. 1986. Sequences similar to the I transposable element involved in I-R hybrid dysgenesis in D. melanogaster occur in other Drosophila species. Nature 322:650–652.
Bucheton, A., C. Vaury, M.-C. Chaboissier, P. Abad, A. Pélisson, and M. Simonelig. 1992. I elements and the Drosophila genome. Genetica 86:175–190.
Chaboissier, M. C., I. Busseau, J. Prosser, D. J. Finnegan, and A. Bucheton. 1990. Identification of a potential RNA intermediate for transposition of the LINE-like element I factor in Drosophila melanogaster. EMBO J. 9:3557–3563.
Contursi, C., G. Minchiotti, and P. P. Di Nocera. 1995. Identification of sequences which regulate the expression of Drosophila melanogaster Doc elements. J. Biol. Chem. 270:26570–26576.
Corpet, F. 1988. Multiple sequence alignment with hierarchical clustering. Nucleic Acids Res. 16:10881–10890.
Crozatier, M., C. Vaury, I. Busseau, A. Pélisson, and A. Bucheton. 1988. Structure and genomic organization of I elements involved in I-R hybrid dysgenesis in Drosophila melanogaster. Nucleic Acids Res. 16:9199–9213.
Dawson, A., E. Hartswood, T. Paterson, and D. J. Finnegan. 1997. A LINE-like transposable element of Drosophila, the I factor, encodes a protein with properties similar to those of retroviral nucleocapsids. EMBO J. 16:4448–4455.
DeBerardinis, R. J., J. L. Goodier, E. M. Ostertag, and H. H. Kazazian. 1998. Rapid amplification of a retrotransposon subfamily is evolving the mouse genome. Nat. Genet. 20:288–290.
DeBerardinis, R. J., and H. H. Kazazian. 1999. Analysis of the promoter from an expanding mouse retrotransposon subfamily. Genomics 56:317–323.
Fawcett, D. H., C. K. Lister, E. Kellett, and D. J. Finnegan. 1986. Transposable elements controlling I-R hybrid dysgenesis in D. melanogaster are similar to mammalian LINEs. Cell 47:1007–1015.
Feng, Q., J. V. Moran, H. H. Kazazian, and J. D. Boeke. 1996. Human L1 retrotransposon encodes a conserved endonuclease required for retrotransposition. Cell 87:905–916.
Feng, Q., G. Schumann, and J. D. Boeke. 1998. Retrotransposon R1Bm endonuclease cleaves the target sequence. Proc. Natl. Acad. Sci. USA 95:2083–2088.
Hartl, D. L., D. I. Nurminsky, R. W. Jones, and E. R. Lozovskaya. 1994. Genome structure and evolution in Drosophila: applications of the framework P1 map. Proc. Natl. Acad. Sci. USA 91:6824–6829.
Hoskins, R. A., C. R. Nelson, B. P. Berman et al. (21 co-authors). 2000. A BAC-based physical map of the major autosomes of Drosophila melanogaster. Science 287:2271–2274.
Jakubczak, J. L., W. D. Burke, and T. H. Eickbush. 1991. Retrotransposable elements R1 and R2 interrupt the rRNA genes of most insects. Proc. Natl. Acad. Sci. USA 88:3295–3299.
Jakubczak, J. L., Y. Xiong, and T. H. Eickbush. 1990. Type I (R1) and type II (R2) ribosomal DNA insertions of Drosophila melanogaster are retrotransposable elements closely related to those of Bombyx mori. J. Mol. Biol. 212:37–52.
Kumar, S., K. Tamura, and M. Nei. 1993. MEGA: molecular evolutionary genetics analysis. Version 1.02. Pennsylvania State University, University Park.
Levis, R. W., R. Ganesan, K. Houtchens, L. A. Tolar, and F. M. Sheen. 1993. Transposons in place of telomeric repeats at a Drosophila telomere. Cell 75:1083–1093.
Luan, D. D., M. H. Korman, J. L. Jakubczak, and T. H. Eickbush. 1993. Reverse transcription of R2Bm RNA is primed by a nick at the chromosomal target site: a mechanism for non-LTR retrotransposition. Cell 72:595–605.
McLean, C., A. Bucheton, and D. J. Finnegan. 1993. The 5' untranslated region of the I factor, a long interspersed nuclear element-like retrotransposon of Drosophila melanogaster, contains an internal promoter and sequences that regulate expression. Mol. Cell. Biol. 13:1042–1050.
Malik, H. S., W. D. Burke, and T. H. Eickbush. 1999. The age and evolution of non-LTR retrotransposable elements. Mol. Biol. Evol. 16:793–805.
Martin, F., C. Maranon, M. Olivares, C. Alonso, and M. C. Lopez. 1995. Characterization of a non-long terminal repeat retrotransposon cDNA (L1Tc) from Trypanosoma cruzi: homology of the first ORF with the Ape family of DNA repair enzymes. J. Mol. Biol. 247:49–59.
Minchiotti, G., C. Contursi, and P. P. Di Nocera. 1997. Multiple downstream promoter modules regulate the transcription of the Drosophila melanogaster I, Doc and F elements. J. Mol. Biol. 267:37–46.
Minchiotti, G., C. Contursi, F. Graziani, G. Gargiulo, and P. P. Di Nocera. 1994. Expression of Drosophila melanogaster F elements in vivo. Mol. Gen. Genet. 245:152–159.
Minchiotti, G., and P. P. Di Nocera. 1991. Convergent transcription initiates from oppositely oriented promoters within the 5' end regions of Drosophila melanogaster F elements. Mol. Cell. Biol. 11:5171–5180.
Mizrokhi, L. J., S. G. Georgieva, and Y. V. Ilyin. 1988. Jockey, a mobile Drosophila element similar to mammalian LINEs, is transcribed from the internal promoter by RNA polymerase II. Cell 54:685–691.
Myers, E. W., G. G. Sutton, A. L. Delcher et al. (29 co-authors). 2000. A whole-genome assembly of Drosophila. Science 287:2196–2204.
Naas, T. P., R. J. DeBerardinis, J. V. Moran, E. M. Ostertag, S. F. Kingsmore, M. F. Seldin, Y. Hayashizaki, S. L. Martin, and H. H. Kazazian. 1998. An actively retrotransposing, novel subfamily of mouse L1 elements. EMBO J. 17:590–597.
Okazaki, S., H. Ishikawa, and H. Fujiwara. 1995. Structural analysis of TRAS1, a novel family of telomeric repeat-associated retrotransposons in the silkworm, Bombyx mori. Mol. Cell. Biol. 15:4545–4552.
Pimpinelli, S., M. Berloco, L. Fanti, P. Dimitri, S. Bonaccorsi, E. Marchetti, R. Caizzi, C. Caggese, and M. Gatti. 1995. Transposable elements are stable structural components of Drosophila melanogaster heterochromatin. Proc. Natl. Acad. Sci. USA 92:3804–3808.
Priimagi, A. F., L. J. Mizrokhi, and Y. V. Ilyin. 1988. The Drosophila mobile element Jockey belongs to LINEs and contains coding sequences homologous to some retroviral proteins. Gene 70:253–262.
Sambrook, J., E. F. Fritsch, and T. Maniatis. 1989. Molecular cloning, a laboratory manual. Cold Spring Harbor Laboratory Press, New York.
Severynse, D. M., C. A. Hutchison III, and M. H. Edgell. 1992. Identification of transcriptional activity within the 5' A-type monomer sequence of the mouse LINE-1 retroposon. Mamm. Genome 2:41–50.
Sezutsu, H., E. Nitasaka, and T. Yamazaki. 1995. Evolution of the LINE-like I element in the Drosophila melanogaster species subgroup. Mol. Gen. Genet. 249:168–178.
Simonelig, M., C. Bazin, A. Pélisson, and A. Bucheton. 1988. Transposable and nontransposable elements similar to the I factor involved in Inducer-Reactive (IR) hybrid dysgenesis in Drosophila melanogaster coexist in various Drosophila species. Proc. Natl. Acad. Sci. USA 85:1141–1145.
Takahashi, H., S. Okazaki, and H. Fujiwara. 1997. A new family of site-specific retrotransposons, SART1, is inserted into telomeric repeats of the silkworm, Bombyx mori. Nucleic Acids Res. 25:1578–1584.
Vaury, C., A. Bucheton, and A. Pélisson. 1989. The beta heterochromatic sequences flanking the I elements are themselves defective transposable elements. Chromosoma 98:215–224.
Xiong, Y., and T. H. Eickbush. 1988a. Functional expression of a sequence-specific endonuclease encoded by the retrotransposon R2Bm. Cell 55:235–246.
Xiong, Y., and T. H. Eickbush. 1988b. The site specific ribosomal DNA insertion element R1Bm belongs to a class of non-terminal repeat retrotransposons. Mol. Cell. Biol. 8:114–123.
Zhao, D., and M. Bownes. 1998. The RNA product of the Doc retrotransposon is localized on the Drosophila oocyte cytoskeleton. Mol. Gen. Genet. 257:497–504.