Molecular Biology and Evolution 19:884-890 (2002)
© 2002 Society for Molecular Biology and Evolution
Evolution of a Polymorphic Regulatory Element in Interferon-γ Through Transposition and Mutation
*Wellcome Trust Centre for Human Genetics, Oxford, United Kingdom;
University Department of Paediatrics, Oxford, United Kingdom
| Abstract |
|---|
Mammalian transposable elements have intrinsic regulatory elements that can activate neighboring genes, and it is speculated that they can also carry extrinsic transactivating DNA sequences to new genomic locations. We have identified a polymorphic segment of the human interferon-γ promoter region where two adjacent binding sites for NF-κB and NFAT originated from the insertion of an Alu element approximately 22–34 MYA. Both binding sites lie outside the Alu consensus sequence but within the boundaries of the insertion, suggesting that this segment of DNA was comobilized when the Alu element moved from another part of the genome. Sequence comparisons and examination of DNA-protein interactions across nine different primate species indicate that the inserted sequence contained the intact NFAT binding site, whereas the ability to bind NF-κB evolved through a series of mutations after the insertion. These observations are consistent with the notion that retropseudogenes can comobilize intact regulatory sequences to new locations and thereby influence the evolution of gene regulatory networks; however, the extent to which such events have shaped the evolution of gene regulation remains unknown.
| Introduction |
|---|
Transposable elements, which comprise about 42% of the human genome (Jurka 1998
There are three known mechanisms by which a transposition event might affect gene regulation: (1) the insertion may disrupt existing regulatory elements (Wallace et al. 1991); (2) the regulatory sequences of the transposable element itself may act on genes that are close to the site of insertion (Willoughby, Vilalta, and Oshima 2000); or (3) the insertion may simply provide an additional sequence within which gene regulatory elements can subsequently evolve (Hambor et al. 1993; Britten 1996).
This study considers a further proposed mechanism, whereby a transposable element comobilizes an adjacent regulatory element to a new genomic location (Britten and Davidson 1969, 1971; Moran, DeBerardinis, and Kazazian 1999). If such a mechanism existed, it would be of considerable evolutionary significance because, unlike sequences that have arisen through the accumulation of mutations over millions of years, a regulatory element introduced in this manner would have immediate functional potential. When coupled with natural selection, this might favor the evolution of complex gene regulatory networks.
Specifically, we examine the evolution of a complex regulatory element in the human interferon-γ promoter region. Interferon-γ is a cytokine secreted by activated lymphocytes, acting as a first line of host defense against infectious agents by activating macrophages as well as playing a critical role in antibody production and cytotoxic T-cell function (Thèze 1999). In humans, the expression of interferon-γ is regulated by a 20-bp segment of DNA containing binding sites for the transcription factors NFAT and NF-κB, both of which have been shown to contribute to transcriptional activation of the gene (Sica et al. 1997). The present study was prompted by the observation that this segment of DNA lies close to the boundary of an Alu insertion.
| Materials and Methods |
|---|
Identification of Repetitive Elements
Using the Perkin-Elmer dye-primer method, 1.5 kb of the 5' interferon-γ promoter was sequenced from human genomic DNA. The mouse interferon-γ 5' promoter sequence (3.5 kb) was acquired from GenBank (M28381). Repetitive DNA elements were identified using RepeatMasker, a program that uses Repbase Update to identify interspersed repetitive elements (Jurka 2000).
Detection of Alu Insertion by PCR
Forward and reverse primers (forward primer 5'-ACT CAC AAT CAT ATA GCT AG-3', reverse primer 5'-AAG TCT CCT GAG GAT TAC GT-3') were designed to amplify across both the MER33 and the AluSg repetitive elements. Amplification was performed in a 15-µl reaction with 4.0 mM MgCl2, 200 µM of each dNTP, 1 x Opti buffer (Bioline), 3.1 µM of each primer, 0.75 units of BIO-X-ACT Taq polymerase (Bioline), and 1 ng of genomic DNA. These primers amplify a specific 1.5-kb fragment from human genomic DNA using the following cycling conditions: 94°C 2 min; then 10 cycles of 94°C 15 s, 57°C 30 s, 68°C 2 min, followed by 20 cycles of 94°C 15 s, 57°C 30 s, 68°C 2 min plus 10 s/cycle; and a final 5 min extension at 68°C. Species that do not have the AluSg insertion amplify a 1.2-kb fragment. Genomic DNA samples from one human, two Common Chimpanzees (Pan troglodytes), one Gorilla (Gorilla gorilla), two Orangutans (Pongo pygmaea), one Lar Gibbon (Hylobates lar), two Sulawesi Macaques (Macaca nigra), one Patas (Erythrocebus patas), two Black-handed Spider Monkeys (Ateles paniscus), and one Golden-headed Lion Tamarin (Leontopithecus chrysomelas) were amplified and visualized under ultraviolet light after electrophoresis through a 1% agarose gel with ethidium bromide staining.
Sequencing the Interferon-γ Promoter in Nonhuman Primates
Samples
All 13 of the individuals from the PCR experiment plus one Hanuman Langur (Presbytis entellus) and one Abyssinian Colobus (Colobus guereza) were sequenced.
Sequencing Catarrhines
A second round PCR using forward primer 5'-ACT CAC AAT CAT ATA GCT AG-3' and reverse primer 5'-AAT GAC CAG AAA GCA AGG AAA G-3' amplified a 1,053-bp fragment under the following reaction conditions: a 15-µl reaction with 2.5 mM Mg, 20 µM of each dNTP, 15 mM Tris-HCl, 50 mM KCl, pH 8.0, 1.0 µM of each primer, 0.5 units of Taq Gold polymerase (Perkin-Elmer), and 1 µl of a 1:20 dilution of the 1.5-kb PCR product described earlier. Thermocycling conditions were: 94°C 10 min followed by 30 cycles of 94°C 30 s, 54°C 30 s, 72°C 30 s, and a final extension for 5 min at 72°C. Dye-terminator sequencing was performed as prescribed by Perkin-Elmer using a nested forward primer: 5'-TGA GAC GGA ATC TAC TCT GT-3'.
Sequencing Platyrrhines
The second round PCR was the same as for the catarrhines but produced a 750-bp fragment. A nested primer (5'-GTC CTT CAT CAG AGT TGG TTA G-3') was used in a dye-terminator sequencing reaction under conditions prescribed by Perkin-Elmer, except for the annealing temperature which was 58°C.
Confirmatory Resequencing
All samples were reamplified using M13-tailed primers; PCR products were purified using QiaQuick spin columns (Qiagen) and sequenced using the Perkin-Elmer dye-primer chemistry. No discrepancies from the original sequence were found.
Sequence Analysis
Sixteen different sequences (GenBank AF323472–AF323487, alignment available from PopSet 13447766) representing nine catarrhine species and two platyrrhine species were aligned using ClustalX (Thompson et al. 1997). To allow for the Alu insertion in the catarrhines, the gap extension penalty was reduced when aligning the platyrrhine sequences. An alignment of the nine catarrhine species provided 300 bp of sequence that spans the NF-κB–NFAT regulatory element. Variable sites were counted through a 10-bp sliding window and presented graphically to illustrate the clustering of mutations between positions -778 and -795.
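The sliding-window count described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names and the toy alignment are hypothetical, a column is called variable if the aligned sequences show more than one character there, and gap characters are treated like any other character.

```python
def variable_sites(alignment):
    """Return a 0/1 list marking alignment columns where the sequences differ."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [1 if len({seq[i] for seq in alignment}) > 1 else 0
            for i in range(length)]


def sliding_window_counts(flags, window=10):
    """Number of variable sites in each window, stepping one column at a time."""
    return [sum(flags[start:start + window])
            for start in range(len(flags) - window + 1)]


# Hypothetical toy alignment (not data from this study): columns 6 and 11 vary.
toy_alignment = [
    "ACGTACGTACGTACGT",
    "ACGTACCTACGAACGT",
    "ACGTACGTACGTACGT",
]
flags = variable_sites(toy_alignment)
counts = sliding_window_counts(flags, window=4)
```

Plotting `counts` against window start position gives the kind of profile in which a cluster of substitutions shows up as a peak.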
Electrophoretic Mobility Shift Assay
Oligonucleotide probes were radiolabeled with [α-32P]-dCTP (Amersham Pharmacia Biotech, see table 1 for sequences of probes). COS-7 cells were transfected with CMV-p50–expressing and CMV-p65–expressing constructs, and total protein extracts were prepared by lysing cells in lysis buffer (20 mM Tris-Cl, pH 8.0, 300 mM NaCl, 0.1% NP-40, 10% glycerol) supplemented with protease inhibitors (Boehringer Mannheim). The binding reaction contained 12 mM HEPES, pH 7.8, 80–100 mM KCl, 1 mM EDTA, 1 mM EGTA, 12% glycerol, and 0.5 mg of poly dI-dC (Amersham Pharmacia Biotech). Protein extracts (1–4 µg) were mixed in an 8-µl reaction with 0.2–0.5 ng of radiolabeled probe (1–5 × 10^4 CPM) and incubated at room temperature for 10 min. The reaction was analyzed by electrophoresis in a nondenaturing 5% polyacrylamide gel at 4°C in 0.5× TBE buffer. Where indicated, gels were quantified using the Phosphorimager (Molecular Dynamics).
| Results |
|---|
Insertion of AluSg into the IFN-γ Promoter Region in Primates
Inspection of the 5' flanking sequence of the human interferon-γ gene (GenBank AF330164) revealed a block of repetitive elements located from -542 to -1207 nt relative to the human transcription start site. It comprises a MER33 element (Jurka 1990
To obtain a more precise estimate of when the Alu insertion occurred, we performed PCR amplification in various primate species, using primers based on human sequence flanking the MER element (fig. 1). Apes (gibbon, orangutan, gorilla, chimpanzee, human) and Old World monkeys (patas, macaque) gave a 1.5-kb PCR product, corresponding to the predicted size if both MER and AluSg elements were present. In contrast, New World monkeys (spider monkey and lion tamarin) gave a 1.2-kb PCR product corresponding to the predicted size if the AluSg element was missing. Sequencing of the PCR products confirmed that this AluSg insertion occurred in the catarrhine lineage after the divergence of catarrhines and platyrrhines, approximately 22–34 MYA (fig. 2) (GenBank: AF323472–AF323487).
Alu is inserted into the genome by an enzymatic mechanism which duplicates the MER target sequence, producing direct repeats at the boundaries of the insertion (Jagadeeswaran, Forget, and Weissman 1981
Alu insertion are defined by the 13-bp target site duplication TACACTGTATTTC (corresponding to bases 97–109 of the MER33 consensus). Within these boundaries is the AluSg element, 10% divergent from the consensus AluSg sequence, plus a 20-bp segment with no similarity to either MER33 or AluSg (fig. 3). This latter segment contains the NF-κB and NFAT binding sites that have been shown to enhance transcriptional activity of the human interferon-γ gene (Sica et al. 1997
Origin of NFAT and NF-κB Sites Within the Inserted Sequence
The above observations raise the question of whether the NF-κB and NFAT binding sites were brought with the Alu element to the interferon-γ promoter region or whether they evolved from sequence provided by the insertion event. We sequenced this region in nine catarrhine species: the apes and Old World monkeys described above plus two additional Old World monkeys (langur and colobus) (fig. 3) (Catarrhine sequences: GenBank AF323475–AF323487). All 10 nt comprising the NFAT binding site were identical across the nine catarrhine species, indicating that this intact segment was introduced at the time of insertion. In contrast, 6 out of 10 nt at the NF-κB binding site varied between species (fig. 3).
To determine how this compares with interspecific nucleotide diversity in the surrounding region, we counted variable sites between -886 and -593 nt through a 10-bp sliding window. A distinct peak of variation was observed around -790 nt, i.e., the point where the AluSg sequence abuts the NF-κB binding site (fig. 4). Across the 295-bp sequence analyzed, there were 50 variable sites of which 11 lay within a 16-bp segment (-793 to -778) which includes the 10-bp NF-κB binding site (-788 to -779). To determine whether this degree of clustering was likely to happen by chance, a computer simulation was designed to uniquely place 50 mutations randomly across 295 bases. The observed clustering of variable sites occurred nine times in 200,000 simulations. Thus, it appears that the NFAT binding site was inserted with the Alu element and has subsequently been preserved, whereas the adjacent NF-κB binding site has arisen through base substitutions subsequent to insertion.
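A simulation of this kind can be sketched in Python as follows. This is a hedged illustration, not the C program used in the study: the function names are hypothetical, and it assumes that "the observed clustering" means at least 11 of the 50 sites falling inside some 16-bp window anywhere in the 295-bp region, which may differ from the criterion the authors applied.

```python
import random


def max_sites_in_window(positions, window):
    """Largest number of positions falling inside any window of the given width."""
    ordered = sorted(positions)
    best = 0
    left = 0
    for right, pos in enumerate(ordered):
        # Advance the left edge until the span fits inside one window.
        while pos - ordered[left] >= window:
            left += 1
        best = max(best, right - left + 1)
    return best


def clustering_p_value(n_bases=295, n_sites=50, window=16, threshold=11,
                       n_trials=20_000, seed=0):
    """Fraction of random placements showing clustering at least as strong."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # Sampling without replacement places each mutation at a unique base.
        positions = rng.sample(range(n_bases), n_sites)
        if max_sites_in_window(positions, window) >= threshold:
            hits += 1
    return hits / n_trials
```

Calling `clustering_p_value(n_trials=200_000)` mirrors the number of simulations reported above; the default here is smaller only to keep the run time short. Because the clustering criterion is an assumption, the estimate need not reproduce the reported nine in 200,000.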
Effect of Nucleotide Variation on NF-κB Binding
We investigated how the observed nucleotide variation might affect the ability of this site to bind NF-κB. Because the consensus binding motif of NF-κB and Dorsal (its functional homologue in Drosophila) is known to be remarkably well conserved across the vertebrate and invertebrate kingdoms (Gonzalez-Crespo and Levine 1994
NF-κB should provide a reasonable estimate of the functional binding properties of nucleotide sequences from different primate species. Electrophoretic mobility shift assay (EMSA) was used to test 26-bp oligoduplexes (table 1) corresponding to the different primate sequences for binding to NF-κB p65-p50 and p50-p50 (fig. 5).
Both forms of NF-κB can be found at different time points in activated immune cells. The p65-p50 heterodimer is the classical form of NF-κB and is a potent transcriptional activator, whereas p50-p50 lacks an activation domain and may, in some circumstances, inhibit transcription (Udalova et al. 2000).
Comparison of the human and macaque sequences with the NF-κB consensus motif suggests that a critical evolutionary event may have been a double transversion at the nucleotides corresponding to positions 3 and 4 of the NF-κB site. The clade containing macaque, patas, colobus, and langur has CC at this position, whereas human, gorilla, chimpanzee, orangutan, and gibbon all have AA. To examine the specific effect of this double transversion on binding affinity, we synthesized an oligoduplex that was identical to the human sequence apart from these two nucleotides, which were changed from AA to CC. This resulted in almost complete loss of p65-p50 and p50-p50 binding (fig. 5).
Analysis of the human interferon-γ promoter region has revealed that three single nucleotide polymorphisms lie extremely close to the NFAT and NF-κB binding sites (J. Hull et al., unpublished data). They are a C to G substitution at -765 nt (flanking the 3' end of the NFAT binding site), a C to T substitution at -778 nt (flanking the 3' end of the NF-κB binding site), and a C to T substitution at -793 nt (5 positions 5' of the NF-κB binding site) (fig. 3B). Table 2 summarizes the variation detected at these three positions by dideoxy sequencing of genomic DNA from 108 individuals drawn from two human populations (36 Europeans and 72 West Africans). The -793T allele was found in seven West Africans and no Europeans; the -778T allele was found in one West African and no Europeans; and the -765G allele was found in five Europeans and no West Africans.
All of these human polymorphisms correspond to variable sites identified in an analysis of nine catarrhine species (table 2). At both -765 and -793 nt the common human allele corresponds to the sequence observed in close evolutionary relatives (chimpanzee, gorilla, and orangutan), whereas more distant relatives (colobus, langur, and patas) have the rare human allele. A strikingly different pattern was observed at -778 nt. Here, three different variants were observed: C was the common allele in humans (shared with chimpanzee, gorilla, and orangutan); A was the common allele in macaques (shared with patas and langur); and T, the rare human allele, was not observed in the other species.
| Discussion |
|---|
The Alu element which integrated into the interferon-γ promoter region of primates approximately 22–34 MYA has had complex effects on gene regulation. It appears to have brought with it an intact binding site for the transcription factor NFAT, presumably copied from an Alu flanking region elsewhere in the genome. The molecular mechanism of this extension to the Alu element is open to speculation. A 5' extension would imply some rearrangement of the internal promoter of the Alu element, whereas a 3' extension may have arisen during the process of reverse transcription through switching onto an RNA template containing the novel regulatory sites. The NFAT binding sequence has been conserved, but the adjacent sequence has undergone a high rate of nucleotide variation during primate evolution and remains polymorphic in human populations. One consequence of this variation has been the evolution of a binding site for NF-κB, separated from the NFAT binding site by only three nucleotides. The NFAT and NF-κB binding sites both participate in the regulation of interferon-γ expression in human lymphocytes. An important conclusion is that Alu, the most ubiquitous family of transposable elements, can modify the regulation of a gene by carrying with it an intact transcription factor binding site from another part of the genome. Although such a mechanism has been proposed, it has not previously been observed in nature. This finding lends weight to the hypothesis that transposable elements have served to disperse transcription factor binding sites across the genome, thereby facilitating the evolution of regulatory networks.
After the NFAT binding site was inserted into the interferon-γ promoter region, the adjacent sequence evolved into a binding site for NF-κB. Different primate species show a remarkable degree of nucleotide diversity in and around the NF-κB binding site. On the basis of sequence comparisons and analysis of DNA-protein interactions, it appears that the critical event leading to NF-κB binding was a CC to AA double transversion that occurred in the ancestor of the apes (fig. 2). Our evolutionary model (fig. 6) suggests that the first NF-κB binding sequence has been preserved in gorilla and human, whereas subsequent mutations in orangutan and chimpanzee have led to decreased binding affinity. This raises the possibility that the capacity to bind NF-κB at this site may have been subjected to evolutionary pressures.
Three human polymorphisms (table 2) are located sufficiently close to the NFAT and NF-κB binding sites to be of potential functional relevance (fig. 3B). At positions -793 and -765 nt, the common human allele is fixed in our close evolutionary relatives, as is commonly observed (Hacia et al. 1999
It has been suggested that repetitive DNA may facilitate the dispersion of regulatory elements, allowing natural selection to fashion new networks from existing regulatory sequences (Britten and Davidson 1971). Previous examples have shown that transposable elements may (1) disrupt existing regulatory mechanisms, (2) introduce new regulatory elements intrinsic to their consensus sequence, and (3) provide relatively unconstrained sequence for the evolution of transcription factor binding sites through base substitution. Our findings indicate that Alu, the most ubiquitous family of transposable elements, is capable of comobilizing regulatory elements extrinsic to the Alu sequence itself. It remains to be seen whether this mechanism is widespread. If so, it may have acted together with natural selection as a force for the evolution of complex regulatory networks.
| Acknowledgements |
|---|
We thank Dr. Nick Mundy for providing the primate DNA samples and valuable discussions and Ben Kremer for the C program to simulate clustering of mutational events. This work was funded by Medical Research Council grant G9505090 to D.K. H.C.A. was supported by the Rhodes Trust.
| Footnotes |
|---|
Thomas Eickbush, Reviewing Editor
Address for correspondence and reprints: Hans Ackerman, 64 Linnaean Street, #572, Harvard University, Cambridge, Massachusetts 02138. ackerman@fas.harvard.edu.
| References |
|---|
Britten R. J., 1996 DNA sequence insertion and evolutionary variation in gene regulation Proc. Natl. Acad. Sci. USA 93:9374-9377
Britten R. J., E. H. Davidson, 1969 Gene regulation for higher cells: a theory Science 165:349-357
Britten R. J., E. H. Davidson, 1971 Repetitive and non-repetitive DNA sequences and a speculation on the origins of evolutionary novelty Q. Rev. Biol 46:111-138
Cohen S. N., 1976 Transposable genetic elements and plasmid evolution Nature 263:731-738
Doolittle W. F., C. Sapienza, 1980 Selfish genes, the phenotype paradigm and genome evolution Nature 284:601-603
Gonzalez-Crespo S., M. Levine, 1994 Related target enhancers for dorsal and NF-kappa B signaling pathways Science 264:255-258
Hacia J. G., J. B. Fan, O. Ryder, et al. (16 co-authors) 1999 Determination of ancestral alleles for human single-nucleotide polymorphisms using high-density oligonucleotide arrays Nat. Genet 22:164-167
Hambor J. E., J. Mennone, M. E. Coon, J. H. Hanke, P. Kavathas, 1993 Identification and characterization of an Alu-containing, T-cell–specific enhancer located in the last intron of the human CD8 alpha gene Mol. Cell. Biol 13:7056-7070
Hamdi H. K., H. Nishio, J. Tavis, R. Zielinski, A. Dugaiczyk, 2000 Alu-mediated phylogenetic novelties in gene regulation and development J. Mol. Biol 299:931-939
Jagadeeswaran P., B. G. Forget, S. M. Weissman, 1981 Short interspersed repetitive DNA elements in eucaryotes: transposable DNA elements generated by reverse transcription of RNA pol III transcripts? Cell 26:141-142
Jurka J., 1990 Novel families of interspersed repetitive elements from the human genome Nucleic Acids Res 18:137-141
Jurka J., 1997 Sequence patterns indicate an enzymatic involvement in integration of mammalian retroposons Proc. Natl. Acad. Sci. USA 94:1872-1877
Jurka J., 1998 Repeats in genomic DNA: mining and meaning Curr. Opin. Struct. Biol 8:333-337
Jurka J., 2000 Repbase update: a database and an electronic journal of repetitive elements Trends Genet 16:418-420
Jurka J., P. Klonowski, 1996 Integration of retroposable elements in mammals: selection of target sites J. Mol. Evol 43:685-689
Kawashima I., K. Mita-Honjo, Y. Takiguchi, 1992 Characterization of the primate-specific repetitive DNA element MER1 DNA Seq 2:313-318
Moran J. V., R. J. DeBerardinis, H. H. Kazazian Jr, 1999 Exon shuffling by L1 retrotransposition Science 283:1530-1534
Nevers P., H. Saedler, 1977 Transposable genetic elements as agents of gene instability and chromosomal rearrangements Nature 268:109-115
Orgel L. E., F. H. Crick, 1980 Selfish DNA: the ultimate parasite Nature 284:604-607
Rozmahel R., H. H. Heng, A. M. Duncan, X. M. Shi, J. M. Rommens, L. C. Tsui, 1997 Amplification of CFTR exon 9 sequences to multiple locations in the human genome Genomics 45:554-561
Sica A., L. Dorman, V. Viggiano, M. Cippitelli, P. Ghosh, N. Rice, H. A. Young, 1997 Interaction of NF-kappaB and NFAT with the interferon-gamma promoter J. Biol. Chem 272:30412-30420
Singer M. F., 1982 SINEs and LINEs: highly repeated short and long interspersed sequences in mammalian genomes Cell 28:433-434
Smit A. F., 1999 Interspersed repeats and other mementos of transposable elements in mammalian genomes Curr. Opin. Genet. Dev 9:657-663
Thèze J., 1999 The cytokine network and immune functions Oxford University Press, Oxford
Thompson J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, D. G. Higgins, 1997 The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools Nucleic Acids Res 25:4876-4882
Udalova I. A., A. Richardson, A. Denys, C. Smith, H. Ackerman, B. Foxwell, D. Kwiatkowski, 2000 Functional consequences of a polymorphism affecting NF-kappaB p50-p50 binding to the TNF promoter region Mol. Cell Biol 20:9113-9119
Van Arsdell S. W., R. A. Denison, L. B. Bernstein, A. M. Weiner, T. Manser, R. F. Gesteland, 1981 Direct repeats flank three small nuclear RNA pseudogenes in the human genome Cell 26:11-17
Wallace M. R., L. B. Anderson, A. M. Saulino, P. E. Gregory, T. W. Glover, F. S. Collins, 1991 A de novo Alu insertion results in neurofibromatosis type 1 Nature 353:864-866
Willoughby D. A., A. Vilalta, R. G. Oshima, 2000 An Alu element from the K18 gene confers position-independent expression in transgenic mice J. Biol. Chem 275:759-768