Molecular Biology and Evolution 17:908-914 (2000)
© 2000 Society for Molecular Biology and Evolution
Article
Evolution of the Gypsy Endogenous Retrovirus in the Drosophila melanogaster Subgroup
Institut de Genetique Humaine, Montpellier, France
| Abstract |
|---|
We conducted a phylogenetic survey of the endogenous retrovirus Gypsy in the eight species of the Drosophila melanogaster subgroup. A 362-bp fragment from the integrase gene (int) was amplified, cloned, and sequenced. Phylogenetic relationships of the elements isolated from independent clones were compared with the host phylogeny. Our results indicate that two main lineages of Gypsy exist in the melanogaster subgroup and that vertical and horizontal transmission have played a crucial role in the evolution of this insect endogenous retrovirus.
| Introduction |
|---|
Retroelements form a large and diverse family of mobile elements found in all eukaryotes. They share a common mechanism of replication: they propagate by reverse transcription of RNA intermediates and integrate their genetic information into the genome of the host cell. Retroelements include endogenous retroviruses (EnRVs), which are long terminal repeat (LTR)-containing elements with the cis-regulatory and coding sequences necessary for the production of potentially infectious particles. These coding sequences are the gag, pol, and env genes (fig. 1), which are also present in the "true" exogenous retroviruses (ExRVs). However, several major differences exist between EnRVs and ExRVs: (1) the EnRVs are distributed in vertebrate (Lower, Lower, and Kurth 1996
Insertion of an EnRV DNA copy (provirus) into the germ line genome is potentially mutagenic for the host. EnRVs are therefore subject to selective pressures that keep their maintenance compatible with the survival of their hosts. Understanding the evolutionary histories of EnRVs in eukaryotic genomes remains a challenge. Better insight into patterns of EnRV evolution is likely to emerge from detailed studies of a set of closely related organisms whose phylogenetic relationships are well understood. The Gypsy retrovirus of Drosophila is useful for the study of EnRV evolution and the host-retrovirus interaction because Gypsy and its control by the host genome have been extensively studied (Prud'homme et al. 1995
Sampling procedures concerning the number of retroelements and the number of species to be analyzed must be carefully considered. Indeed, mistakes in the interpretation of the relationships between elements may result from the mixture of (1) orthologous sequences (from the same element lineage) when one or a few randomly sampled elements from a broad spectrum of taxa are analyzed or (2) paralogous sequences (from different element lineages), which are not phylogenetically comparable, when a large sample of elements from a single species is examined (Capy, Anxolabehere, and Langin 1994).
Phylogenetic relationships for partial Gypsy sequences from the eight species of the melanogaster subgroup (Lachaise et al. 1988) were examined in the context of the phylogeny of their hosts. These partial sequences cover the integrase domain (int) of Gypsy (fig. 1). The integrase protein is required for the integration of the provirus into the host genome. Hence, integrase is crucial for the replication of Gypsy and its interaction with the host genome because it specifies the DNA target site preferences.
Our results demonstrate that two main Gypsy lineages exist within the melanogaster subgroup. The distribution of these lineages among the species suggests a complex pattern of vertical and horizontal transfers.
| Materials and Methods |
|---|
Strains and DNA Extraction
Drosophila erecta, Drosophila mauritiana, Drosophila orena, Drosophila sechellia, Drosophila simulans, Drosophila teissieri, and Drosophila yakuba were kindly provided by Dr. Françoise Lemeunier (Centre National de la Recherche Scientifique, Gif-sur-Yvette, France). The D. melanogaster strain used in this work was MSn1, which is permissive for Gypsy transposition and contains a high copy number of Gypsy proviruses (Kim et al. 1994
PCR Amplification, Cloning, and Sequencing
Three different pairs of primers were designed from conserved regions based on alignments of published sequences for the Gypsy elements from D. melanogaster (M12927; Marlor, Parkhurst, and Corces 1986), D. subobscura (X72390; Alberola and de Frutos 1996), and D. virilis (M38438; Mizrokhi and Mazo 1991). Primers were as follows (using the D. melanogaster Gypsy position numbers): primer A (plus strand), positions 489–513; primer B (minus strand), positions 6691–6667; primer C (plus strand), positions 1005–1029; primer D (minus strand), positions 6533–6509; primer E (plus strand), positions 4718–4740; and primer F (minus strand), positions 5122–5103. A first round of amplification was performed using primers A and B in order to amplify an extra large Gypsy fragment (XL PCR; fig. 1). Reaction volumes were 50 µl and contained 2.5 U Taq Plus polymerase (Stratagene, La Jolla, Calif.), 5 µl Taq Plus low-salt buffer, 200 nM each dNTP, 400 nM each primer, and approximately 50 ng template DNA. Reaction parameters were as follows: 95°C for 5 min, followed by 20 cycles of 93°C for 30 s, 60°C for 30 s, and 72°C for 7 min, followed by 72°C for 10 min. One tenth of the reaction product was assayed on 1% agarose. A second round of amplification (int PCR) was performed from 1 µl of the XL PCR products using primers E and F in order to amplify the integrase domain. Reaction volumes were the same as for XL PCR, except that 2.5 U of Pfu polymerase (Stratagene) and Pfu buffer were used. Reaction parameters were as follows: 95°C for 5 min, followed by 25 cycles of 93°C for 30 s, 55°C for 30 s, and 72°C for 1 min, followed by 72°C for 20 min. This product was cloned into the pCR-Script Amp SK(+) cloning vector according to the manufacturer's instructions (Stratagene).
The XL PCR and the subsequent int PCR did not give any positive result for D. simulans, D. mauritiana, or D. sechellia. New PCRs were therefore performed using primers C and D in order to amplify a large Gypsy fragment (L PCR; fig. 1). Reaction volumes were the same as for XL PCR. Reaction parameters were as follows: 95°C for 5 min, followed by 2 cycles of 93°C for 1 min, 58°C for 30 s, and 72°C for 6 min, followed by 20 cycles of 93°C for 30 s, 55°C for 30 s, and 72°C for 7 min, followed by 72°C for 10 min. A second round of amplification was then performed from 1 µl of each L PCR product using primers E and F in order to amplify the int domain. These products were then purified and cloned into the pCR-Script Amp SK(+) vector.
At least four clones from each species were sequenced in both directions using the reverse and forward M13 primers with an Applied Biosystems 373A automated sequencer according to the manufacturer's protocols.
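The primer coordinates given above can be used to check the expected fragment sizes. The following is a minimal sketch, not part of the original protocol. It assumes that each primer's position string is an inclusive coordinate range on the D. melanogaster Gypsy sequence (for example 4718 to 4740 for primer E) and that minus-strand primers are listed 5' to 3', so their first coordinate is the larger one. The function names are hypothetical.

```python
# Primer coordinates on the D. melanogaster Gypsy sequence (M12927),
# read as inclusive ranges (an assumption about the printed positions).
# Minus-strand primers are listed 5'->3', so start > end.
PRIMERS = {
    "A": (489, 513),
    "B": (6691, 6667),
    "C": (1005, 1029),
    "D": (6533, 6509),
    "E": (4718, 4740),
    "F": (5122, 5103),
}


def primer_length(name):
    """Number of nucleotides covered by one primer."""
    start, end = PRIMERS[name]
    return abs(end - start) + 1


def amplicon_length(plus, minus):
    """Length of the PCR product, primers included."""
    return PRIMERS[minus][0] - PRIMERS[plus][0] + 1


def insert_length(plus, minus):
    """Length of the amplified region lying between the two primers."""
    return amplicon_length(plus, minus) - primer_length(plus) - primer_length(minus)


for pair in (("A", "B"), ("C", "D"), ("E", "F")):
    print(pair, amplicon_length(*pair), insert_length(*pair))
```

Under these assumptions the E/F product is 405 bp with primers and 362 bp without them, which agrees with the 362-bp int fragment reported in the abstract.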
Data Analyses
A multiple alignment of the nucleic acid sequences was performed using CLUSTAL X (Thompson et al. 1997). Jukes-Cantor nucleotide distances were estimated using Distances (GCG, version 10.0). Phylogenetic analyses were done using the bootstrap neighbor-joining tree method (1,000 replicates) as implemented in CLUSTAL X, the maximum-likelihood method as implemented in the program PUZZLE, version 4.0 (Strimmer and von Haeseler 1997), and the maximum-parsimony method using the PAUPSearch and PAUPDisplay programs as implemented in GCG, version 10.0 (Swofford 1991). Diverge (GCG, version 10.0) was used to estimate the numbers of synonymous (Ks) and nonsynonymous (Ka) substitutions per site between two sequences coding for proteins (Li 1993). Trees based on sequence data were obtained using Treeview (Rod Page, http://taxonomy.zoology.gla.ac.uk/rod). The sequences reported in this paper have been deposited in the EMBL database (accession numbers AJ279868–AJ279900).
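For readers unfamiliar with the Jukes-Cantor correction, the sketch below shows the standard formula, d = -(3/4) ln(1 - 4p/3), where p is the proportion of differing sites between two aligned sequences. It is an illustration only and is not the GCG Distances program; in particular, skipping any column that contains a gap or ambiguous base is an assumption made here, and the function name is hypothetical.

```python
import math


def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor distance between two aligned nucleotide sequences.

    Columns containing anything other than A, C, G, or T in either
    sequence are skipped (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    # p: observed proportion of sites that differ
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p == 0:
        return 0.0
    if p >= 0.75:
        # The correction is undefined at or beyond saturation
        return math.inf
    return -0.75 * math.log(1 - 4 * p / 3)


# Toy sequences for illustration only, not data from this study
print(jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAA"))
```

The corrected distance is always at least as large as the raw proportion p, because the formula accounts for multiple substitutions at the same site.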
| Results and Discussion |
|---|
Phylogenetic Distribution of Gypsy Is Incongruent with Host Species Phylogeny
In order to study the evolution of Gypsy, we employed a PCR-based assay using genomic DNAs from the eight species of the D. melanogaster subgroup. Because many deleted Gypsy elements are present in the genome of Drosophila (Lambertsson, Andersson, and Johansson 1989
Eight clones from D. melanogaster were obtained: four came from XL PCR and four came from L PCR. They were all identical, which is why only one sequence (me) appears in the subsequent analyses. Only two clones have deletions: one from D. mauritiana (ma1), which contains two deletions of 21 bp and 22 bp, and one from D. sechellia (se3), which contains a 20-bp deletion. These deletions inactivate the ma1 and se3 copies. The other sequences encode a polypeptide of 120 amino acids (fig. 2).
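A quick arithmetic check of the figures above: a 362-bp fragment holds 120 complete codons, and a deletion whose total length is not a multiple of 3 shifts the reading frame. The sketch below applies that rule to the reported deletion lengths. Treating a frameshift as the reason these copies are inactive is an assumption of this example, since the text does not state the mechanism, and the function name is hypothetical.

```python
FRAGMENT_BP = 362  # length of the int fragment reported in the abstract


def shifts_frame(deletion_lengths):
    """True if the combined deletions are not a multiple of 3 nucleotides."""
    return sum(deletion_lengths) % 3 != 0


# Deletion lengths (bp) as reported for the two defective clones
clones = {"ma1": [21, 22], "se3": [20]}

print("complete codons in fragment:", FRAGMENT_BP // 3)
for name, deletions in clones.items():
    print(name, sum(deletions), "bp deleted, frameshift:", shifts_frame(deletions))
```

Note that the 21-bp deletion in ma1 would be in frame on its own; it is the additional 22-bp deletion that makes the total (43 bp) a non-multiple of 3.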
Figure 3 shows the neighbor-joining unrooted tree, including Gypsy sequences from D. subobscura (X72390) and D. virilis (M38438). This tree is clearly incongruent with the host species phylogeny. The main result is that three major clusters of sequences are fully supported by maximal bootstrap values: the simulans (si)/sechellia (se)/mauritiana (ma), the melanogaster (me)/teissieri (t)/yakuba (y)/orena (o)/erecta (e), and the subobscura (sub)/virilis (vir) clusters.
Within the melanogaster/teissieri/yakuba/orena/erecta cluster, we note that (1) the e3, e4, and e5 sequences from D. erecta form a distinct group (e345); (2) e1 and e2 (e12) group together with the orena sequences, whereas the t and y sequences form a separate group; and (3) the clustering of the me sequence with the teissieri/yakuba group is supported by a lower but still significant bootstrap value (707 of 1,000 replicates). To resolve the branching point of the D. melanogaster int sequence, maximum-parsimony and maximum-likelihood analyses were performed. The maximum-parsimony analysis gave the same topology for the melanogaster/teissieri/yakuba cluster (bootstrap value = 870, 100 replicates), whereas the maximum-likelihood analysis built a tree in which me clusters with the o and e12 sequences (quartet puzzling reliability = 92%).
Two Gypsy Lineages Exist in the melanogaster Subgroup Species
The fact that me clusters with [t, y, o, e12] is the first hint of disagreement between the Drosophila and Gypsy phylogenies. Such inconsistency suggests horizontal transfer events. However, disagreements between phylogenies can also be explained by other evolutionary processes, such as faster evolution in certain lineages and/or ancestral polymorphism (Capy, Anxolabehere, and Langin 1994). In order to test the possibility of horizontal transfers of Gypsy, we estimated the Jukes-Cantor nucleotide distances between int sequences from the eight species and compared them with the Jukes-Cantor distances between the 3' untranslated regions of R1 sequences (Eickbush et al. 1995). R1 is a non-LTR retroelement found in many insects (Eickbush et al. 1995). Comparison with R1 is useful for two reasons: (1) R1 evolves at rates similar to those of nuclear genes in the melanogaster species subgroup (Eickbush et al. 1995); (2) like that of Gypsy, its replication involves a reverse transcriptase, an enzyme known to be error-prone. By comparing the distance values of int and R1, we can test horizontal versus vertical transfer: if int is vertically transmitted and evolves at the same rate as R1, the nucleotide distances for all pairwise comparisons of int and R1 from the eight species should be equal.
Results are presented in figure 4. Pairwise distances of int and R1 from D. simulans, D. sechellia, and D. mauritiana are roughly equal. However, int distances between the [si, se, ma] and [me, t, y, o, e12] clusters are larger than the R1 distances, whereas int distances between species from the [me, t, y, o, e12] cluster are smaller than R1 distances. This result is the second hint of disagreement between the Drosophila and Gypsy phylogenies and rules out the presence of an ancestral polymorphism in the common ancestor of the melanogaster subgroup species. This strongly suggests horizontal transfers between D. erecta, D. orena, D. teissieri, D. yakuba, and D. melanogaster.
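The logic of the int versus R1 comparison can be sketched as a simple three-way classification of each species pair. This is an illustration of the reasoning only. The text does not give a numerical threshold, so the `tolerance` parameter, the function name, and the example values are all hypothetical and are not data from this study.

```python
def compare_to_reference(int_distance, r1_distance, tolerance=0.25):
    """Classify an int distance relative to the R1 distance for the same species pair.

    Returns "smaller", "similar", or "larger". `tolerance` is a
    hypothetical relative margin within which the two are called similar.
    """
    if r1_distance <= 0:
        raise ValueError("R1 distance must be positive")
    ratio = int_distance / r1_distance
    if ratio < 1 - tolerance:
        # int less diverged than expected: consistent with horizontal transfer
        return "smaller"
    if ratio > 1 + tolerance:
        # int more diverged than expected: e.g. distinct element lineages
        return "larger"
    # int and R1 diverged alike: consistent with vertical transmission
    return "similar"


# Placeholder distances for illustration only (not values from figure 4)
examples = {
    "pair_1": (0.02, 0.10),
    "pair_2": (0.10, 0.10),
    "pair_3": (0.30, 0.10),
}
for pair, (d_int, d_r1) in examples.items():
    print(pair, compare_to_reference(d_int, d_r1))
```

A ratio-based margin is used here so that the same tolerance applies to closely and distantly related species pairs; an absolute difference would be an equally defensible choice.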
The three e345 sequences have a special status in the int phylogeny: the int e345/[si, se, ma] Jukes-Cantor distances are equivalent to the R1 D. erecta/[D. simulans, D. sechellia, D. mauritiana] distances, suggesting that these sequences were transmitted like R1 in these species, i.e., vertically from a common ancestor (fig. 4). In order to further study this point, we compared Ks values between species estimated from the int sequences and from the R1 coding sequences (table 2). R1 and int Ks values are comparable between D. erecta and [D. simulans, D. sechellia, D. mauritiana], suggesting that si, se, ma, and e345 may have evolved like R1 in D. simulans, D. sechellia, D. mauritiana, and D. erecta.
We also note the high Ks/Ka ratios between [si, se, ma] sequences and the other sequences (table 1). This result indicates that int is subjected to purifying selection, which is indicative of frequent replications of Gypsy.
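To make the distinction between synonymous and nonsynonymous changes concrete, the sketch below counts codons that differ between two in-frame, aligned coding sequences and sorts them by whether the encoded amino acid changes. This is a crude count for illustration, not the Li (1993) method implemented in Diverge: it does not normalize by the number of synonymous and nonsynonymous sites and does not correct for multiple hits. It assumes the standard genetic code, and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_changes(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Both sequences must be in frame and of equal length. Codons with
    gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and equal in length")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Toy sequences for illustration only: TTT->TTC keeps Phe, GCT->GAT changes Ala to Asp
print(count_codon_changes("TTTGCT", "TTCGAT"))
```

When synonymous changes greatly outnumber nonsynonymous ones, as a high Ks/Ka ratio indicates, amino acid replacements are being removed by selection, which is the pattern the text interprets as purifying selection.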
Thus, we have shown the presence of two major lineages in the melanogaster subgroup: one lineage (GypA) is present in the D. simulans, D. sechellia, and D. mauritiana species, whereas the second lineage (GypB) is present in the other species. We have also shown that GypB sequences [me, t, y, o, e12] result clearly from multiple horizontal transfer events. Because the number of sequences analyzed may not be representative of the entire population of Gypsy in the different genomes, we validated our sampling procedure by designing a set of primers specific for GypA and GypB in order to detect the presence or absence of these two lineages in the L PCR products from these four species (table 3). The results indicate unambiguously that GypA is not present in D. melanogaster and that GypB is absent in D. simulans, D. sechellia, and D. mauritiana.
However, the e345 sequences bring complexity into the phylogenetic pattern of Gypsy. We should first note the high level of polymorphism of Gypsy sequences within D. erecta, given that the D. erecta strain was founded by a couple of females more than 20 years ago (D. Lachaise, personal communication). Although e345 sequences are closer to [me, t, y, o, e12] than to [si, se, ma], the R1/int comparisons between Jukes-Cantor distances and Ks values suggest that e345 sequences are representative of GypA, which evolved in D. simulans, D. sechellia, D. mauritiana, and D. erecta through vertical transmission of an ancestral form (fig. 5A). This ancestral form disappeared in D. melanogaster, D. teissieri, D. yakuba, and D. orena by stochastic loss or inactivation after the split of the melanogaster and simulans lineages 2.5 MYA and prior to the split of the three sibling species lineages. This assumption is consistent with the biogeographic hypothesis of Lachaise et al. (1988)
However, one can argue that Jukes-Cantor distances and Ks comparisons are not sufficient to classify e345 in the GypA lineage. Thus, an alternative hypothesis (fig. 5B) is that Gypsy was absent in the ancestral species of the melanogaster subgroup and that two major events of horizontal transfers of GypA and GypB occurred after the split of the melanogaster and simulans lineages, when these two species were not yet sympatric (Lachaise et al. 1988
| Conclusions |
|---|
Significant evidence of horizontal transmission of transposable elements was restricted to class II or DNA-to-DNA transposons (Daniels et al. 1990
Interestingly, the absence of GypB and the presence of GypA within the simulans complex species correlate with its biogeographic history. Moreover, the fact that we found only two defective sequences, ma1 and se3, in this screening for Gypsy elements in the melanogaster subgroup and that Ks/Ka values between [si, se, ma] sequences and the other sequences are relatively high suggests that these elements are active within the melanogaster subgroup. It would be worth knowing whether the Gypsy elements from other species are potentially infectious like their counterpart in D. melanogaster (Kim et al. 1994; Song et al. 1994; Teysset et al. 1998). Gypsy is normally repressed in D. melanogaster by a host gene called flamenco, which controls the transposition and infective properties of Gypsy (Bucheton 1995). We do not yet know the status of flamenco in the other species, but it would be useful to determine whether the alleles of genes homologous to flamenco are permissive or restrictive for Gypsy expression in species other than D. melanogaster in order to gain further insights into the evolutionary history of the relationship between endogenous retroviruses and their hosts.
| Acknowledgements |
|---|
We are grateful to Alain Pelisson for his valuable suggestions and comments. We would like to thank Daniel Lachaise, Françoise Lemeunier, and Michel Veuille for helpful discussions and comments on the manuscript. This work was supported by the Programme "Génome" of the Centre National de la Recherche Scientifique and by the Association pour la Recherche contre le Cancer.
| Footnotes |
|---|
Pierre Capy, Reviewing Editor
1 Abbreviations: EnRV, endogenous retroviruses; ExRV, exogenous retroviruses; int, integrase; LTR, long terminal repeat.
2 Keywords: Gypsy, endogenous retrovirus, Drosophila, evolution, phylogeny
3 Address for correspondence and reprints: Christophe Terzian, Institut de Genetique Humaine, Centre National de la Recherche Scientifique, 141 rue de la Cardonille, F-34396 Montpellier cedex, France. E-mail: christophe.terzian{at}igh.cnrs.fr
| Literature Cited |
|---|
Alberola, T. M., and R. de Frutos. 1996. Molecular structure of a Gypsy element of Drosophila subobscura (GypsyDs) constituting a degenerate form of insect retroviruses. Nucleic Acids Res. 24:914–923.
Bucheton, A. 1995. The relationship between the flamenco gene and Gypsy in Drosophila: how to tame a retrovirus. Trends Genet. 11:349–353.
Bucheton, A., M. Simonelig, C. Vaury, and M. Crozatier. 1986. Sequences similar to the I transposable element involved in I-R hybrid dysgenesis in Drosophila melanogaster occur in other Drosophila species. Nature 322:650–652.
Capy, P., D. Anxolabehere, and T. Langin. 1994. The strange phylogenies of transposable elements: are horizontal transfers the only explanation? Trends Genet. 10:7–12.
Capy, P., R. Vitalis, T. Langin, D. Higuet, and C. Bazin. 1996. Relationships between transposable elements based upon the integrase-transposase domains: is there a common ancestor? J. Mol. Evol. 42:359–368.
Chalvet, F., L. Teysset, C. Terzian, N. Prud'homme, P. Santamaria, A. Bucheton, and A. Pelisson. 1999. Proviral amplification of the Gypsy endogenous retrovirus of Drosophila melanogaster involves env-independent invasion of the female germline. EMBO J. 18:2659–2669.
Daniels, S. B., K. R. Peterson, L. D. Strausbaugh, M. G. Kidwell, and A. Chovnick. 1990. Evidence for horizontal transmission of the P transposable element between Drosophila species. Genetics 124:339–355.
Eickbush, D. G., W. R. Lathe, M. P. Francino, and T. H. Eickbush. 1995. R1 and R2 retrotransposable elements of Drosophila evolve at rates similar to those of nuclear genes. Genetics 139:685–695.
Jordan, I. K., L. V. Matyunina, and J. F. McDonald. 1999. Evidence for the recent horizontal transfer of long terminal repeat retrotransposon. Proc. Natl. Acad. Sci. USA 96:12621–12625.
Kim, A., C. Terzian, P. Santamaria, A. Pelisson, N. Prud'homme, and A. Bucheton. 1994. Retroviruses in invertebrates: the Gypsy retrotransposon is apparently an infectious retrovirus of Drosophila melanogaster. Proc. Natl. Acad. Sci. USA 91:1285–1289.
Lachaise, D., M. L. Cariou, J. R. David, F. Lemeunier, L. Tsacas, and M. Ashburner. 1988. Historical biogeography of the Drosophila melanogaster species subgroup. Pp. 152–225 in M. K. Hecht, B. Wallace, and G. T. Prance, eds. Evolutionary biology. Vol. 22. Plenum, NY.
Lambertsson, A., S. Andersson, and T. Johansson. 1989. Cloning and characterization of variable-sized Gypsy mobile elements in Drosophila melanogaster. Plasmid 22:22–31.
Li, W. H. 1993. Unbiased estimation of the rates of synonymous and nonsynonymous substitution. J. Mol. Evol. 36:96–99 [letter].
Lower, R., J. Lower, and R. Kurth. 1996. The viruses in all of us: characteristics and biological significance of human endogenous retrovirus sequences. Proc. Natl. Acad. Sci. USA 93:5177–5184.
Marlor, R. L., S. M. Parkhurst, and V. G. Corces. 1986. The Drosophila melanogaster Gypsy transposable element encodes putative gene products homologous to retroviral proteins. Mol. Cell. Biol. 6:1129–1134.
Maruyama, K., and D. L. Hartl. 1991. Evidence for interspecific transfer of the transposable element mariner between Drosophila and Zaprionus. J. Mol. Evol. 33:514–524.
Mizrokhi, L. J., and A. M. Mazo. 1991. Cloning and analysis of the mobile element Gypsy from D. virilis. Nucleic Acids Res. 19:913–916.
Pelisson, A., L. Teysset, F. Chalvet, A. Kim, N. Prud'homme, C. Terzian, and A. Bucheton. 1997. About the origin of retroviruses and the co-evolution of the Gypsy retrovirus with the Drosophila flamenco host gene. Genetica 100:29–37.
Prud'homme, N., M. Gans, M. Masson, C. Terzian, and A. Bucheton. 1995. Flamenco, a gene controlling the Gypsy retrovirus of Drosophila melanogaster. Genetics 139:697–711.
Schlotterer, C., M. T. Hauser, A. von Haeseler, and D. Tautz. 1994. Comparative evolutionary analysis of rDNA ITS regions in Drosophila. Mol. Biol. Evol. 11:513–522.
Song, S. U., T. Gerasimova, M. Kurkulos, J. D. Boeke, and V. G. Corces. 1994. An env-like protein encoded by a Drosophila retroelement: evidence that Gypsy is an infectious retrovirus. Genes Dev. 8:2046–2057.
Strimmer, K., and A. von Haeseler. 1997. Likelihood-mapping: a simple method to visualize phylogenetic content of a sequence alignment. Proc. Natl. Acad. Sci. USA 94:6815–6819.
Swofford, D. L. 1991. PAUP: phylogenetic analysis using parsimony. Illinois Natural History Survey, Champaign.
Teysset, L., J. C. Burns, H. Shike, B. L. Sullivan, A. Bucheton, and C. Terzian. 1998. A Moloney murine leukemia virus-based retroviral vector pseudotyped by the insect retroviral Gypsy envelope can infect Drosophila cells. J. Virol. 72:853–856.
Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and D. G. Higgins. 1997. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25:4876–4882.