MBE Advance Access originally published online on December 20, 2005
Molecular Biology and Evolution 2006 23(3):479-481; doi:10.1093/molbev/msj076
Letter
The Contribution of LTR Retrotransposon Sequences to Gene Evolution in Mus musculus
Department of Genetics, University of Georgia
E-mail: john.mcdonald{at}biology.gatech.edu.
| Abstract |
|---|
Approximately 1.5% of mouse genes (Mus musculus) contain long terminal repeat retrotransposon sequences (LRS). Consistent with earlier findings in Caenorhabditis elegans, Drosophila melanogaster, and Homo sapiens, LRS are more likely to be associated with newly evolved genes. Evidence is presented that LRS are often recruited as novel exons or as spliced additions to existing exons. These novel gene configurations may be expressed initially as alternative transcripts, providing an opportunity for the evolution of new gene function.
Key Words: LTR retrotransposon; gene evolution; Mus musculus; mouse
| Background |
|---|
Once considered parasitic sequences of little or no functional significance (Doolittle and Sapienza 1980
4% of protein-coding regions (Nekrutenko and Li 2001
27% of untranslated regions (van de Lagemaat et al. 2003
25% of promoter regions (Jordan et al. 2003
| LRS Are Components of Many Mouse Genes |
|---|
Recently, 21 families of long terminal repeat (LTR) retrotransposons have been identified in the mouse genome, including 13 not previously described (McCarthy and McDonald 2004
1.5%) were found to contain LRS. It has been reported that
0.9% of genes in humans contain LRS (Nekrutenko and Li 2001
As a rule, genes involved in basic cellular functions are relatively conserved across taxa, while more recently evolved, specialized genes are taxa specific (e.g., van de Lagemaat et al. 2003; Castillo-Davis et al. 2004). To determine whether mouse LRS are more likely to be associated with newly evolved genes, we queried the ortholog information provided by the Homologene data set (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=homologene). Homologenes are genes associated with functions that are generally conserved among even phylogenetically diverse groups of species (Wheeler et al. 2003). Of the 18,374 Unigenes examined, 11,341 had a homolog assigned by the Homologene data set; 88 (0.8%) of these contained an LRS. Thus, LRS are associated with homologenes about half as frequently as they are with all mouse genes.
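The comparison above reduces to two proportions. A minimal sketch in Python, using only the counts reported in the text; the variable names are illustrative, and the overall rate is taken as the rounded figure of about 1.5% quoted in the abstract rather than recomputed from raw counts:

```python
# Counts reported in the text for Unigenes with a Homologene assignment.
homologene_genes = 11_341
homologene_genes_with_lrs = 88

# Rounded overall rate of LRS-containing mouse genes (about 1.5%, as reported).
overall_lrs_rate = 0.015

# Fraction of homologenes that contain an LRS.
homologene_lrs_rate = homologene_genes_with_lrs / homologene_genes

# How often LRS occur in homologenes relative to all mouse genes.
relative_rate = homologene_lrs_rate / overall_lrs_rate

print(f"LRS rate among homologenes: {homologene_lrs_rate:.2%}")
print(f"Rate relative to all mouse genes: {relative_rate:.2f}")
```

A relative rate near 0.5 corresponds to the statement that LRS are found in homologenes about half as often as in mouse genes overall.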
| Many LRS Are Located in Mouse Exons |
|---|
Our results indicate LRS are located within the coding regions of many mouse genes. Such associations are believed to have arisen directly by insertion of an LRS into an existing exon or indirectly by exon recruitment of an LRS from an adjacent intron or untranslated leader region (ULR) (Nekrutenko and Li 2001
95% LRS. Exons composed almost exclusively of LRS are considered to have been recruited as novel exons from LRS located in adjacent introns or ULRs (e.g., Nekrutenko and Li 2001
The addition of LRS to preexisting exons may be expected to disrupt gene function and be eliminated by natural selection. One possible hypothesis to explain the maintenance of the relatively high number of LRS associated with exons is that they are tolerated by natural selection at loci encoding multiple alternative transcripts. Under such a scenario, an inserted sequence would, due to the presence of appropriate splice acceptor/donor sites, be associated with the generation of one or more novel alternative transcripts while the native transcript maintains the original gene function. Over evolutionary time, a novel transcript containing the LRS may evolve to encode a function favored by natural selection and thus be selectively maintained in conjunction with, or in place of, the original transcript. Under this hypothesis, alternative transcripts generated by TE insertions may provide an opportunity for the evolution of new gene functions in a manner similar to what has been proposed for gene duplications (Ohno 1970). Consistent with this view, we found that those mouse genes confirmed to encode alternative transcripts rarely contained LRS in all transcripts (table 1).
| LRS Are Preferentially Associated with Genes Encoding Metabolic Functions |
|---|
Functional information from the Gene Ontology (GO) Consortium (http://www.geneontology.org/) was used to investigate possible functional trends among genes associated with LRS. GO networks are composed of three main functional classifications: molecular function, cellular component, and biological process. A number of subclasses are listed under each of these classifications. Based on the observed frequency of all experimentally verified genes in the Unigene database that group under each of the GO classifications, we computed the number of LRS-associated genes expected to group under each GO classification. This expected number was compared with the observed number to identify significant differences within the subclasses of each main classification. Only subclasses within the "biological process" classification demonstrated a significant difference between observed and expected numbers of associations (chi-square = 30.05, df = 6, P < 0.025).
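The reported chi-square statistic can be checked against the stated significance threshold. The sketch below is an illustration only: it does not reproduce the authors' observed and expected counts (which are in table 2), and the function name is hypothetical. It relies on the closed-form upper-tail probability of a chi-square distribution with an even number of degrees of freedom.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable for even df.

    For df = 2m the tail is exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    terms = sum(half**i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms

# Statistic and degrees of freedom as reported in the text.
p_value = chi2_sf_even_df(30.05, 6)
print(f"P = {p_value:.2e}")
```

The resulting probability falls well below 0.025, in line with the threshold stated in the text.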
The results presented in table 2 show that mouse genes grouped under the biological process classification that encode physiological functions are associated with LRS more frequently than expected (251 observed, 0.68 success probability, 307 trials, P < 0.025), while genes encoding cellular processes are associated with LRS less frequently than expected (39 observed, 0.24 success probability, 307 trials, P < 0.025). Significant deviations from expected numbers were also observed in two additional subclasses of physiological function. LRS-associated genes that encode cell growth and maintenance functions are less frequent than expected, while LRS-associated genes encoding metabolic functions are more frequent than expected.
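The binomial figures quoted above (observed count, success probability, number of trials) are enough to recompute tail probabilities. A minimal sketch, assuming one-sided exact binomial tails; the text does not say which form of the test was used, and the function names are illustrative:

```python
import math

def binom_pmf(k, n, p):
    """Binomial probability mass, computed in log space for numerical stability."""
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_coeff + k * math.log(p) + (n - k) * math.log(1 - p))

def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

def lower_tail(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))

# Physiological functions: 251 observed of 307, expected success probability 0.68.
p_physiological = upper_tail(251, 307, 0.68)

# Cellular processes: 39 observed of 307, expected success probability 0.24.
p_cellular = lower_tail(39, 307, 0.24)

print(f"physiological (excess): P = {p_physiological:.2e}")
print(f"cellular (deficit):     P = {p_cellular:.2e}")
```

Under these assumptions both tail probabilities fall below 0.025, matching the direction of the reported excess and deficit.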
| Summary and Conclusions |
|---|
Consistent with earlier studies of Homo sapiens (Nekrutenko and Li 2001
| Footnotes |
|---|
1 Present address: Department of Biology, University of North Carolina.
2 Present address: School of Biology, Georgia Institute of Technology.
| References |
|---|
Bowen, N. J., I. K. Jordan, J. A. Epstein, V. Wood, and H. L. Levin. 2003. Retrotransposons and their recognition of pol II promoters: a comprehensive survey of the transposable elements from the complete genome sequence of Schizosaccharomyces pombe. Genome Res. 13:1984-1997.
Brosius, J. 1999. Genomes were forged by massive bombardments with retroelements and retrosequences. Genetica 107:209-238.
Castillo-Davis, C. I., F. A. Kondrashov, D. L. Hartl, and R. J. Kulathinal. 2004. The functional genomic distribution of protein divergence in two animal phyla: coevolution, genomic conflict, and constraint. Genome Res. 14:802-811.
Charlesworth, B., P. Sniegowski, and W. Stephan. 1994. The evolutionary dynamics of repetitive DNA in eukaryotes. Nature 371:215-220.
Deininger, P. L., J. V. Moran, M. A. Batzer, and H. H. Kazazian Jr. 2003. Mobile elements and mammalian genome evolution. Curr. Opin. Genet. Dev. 13:651-658.
Doolittle, W. F., and C. Sapienza. 1980. Selfish genes, the phenotype paradigm and genome evolution. Nature 284:601-603.
Ganko, E. W., V. Bhattacharjee, P. Schliekelman, and J. F. McDonald. 2003. Evidence for the contribution of LTR retrotransposons to C. elegans gene evolution. Mol. Biol. Evol. 20:1925-1931.
Ganko, E. W., C. S. Greene, J. A. Lewis, V. Bhattacharjee, and J. F. McDonald. 2006. LTR retrotransposon-gene associations in Drosophila melanogaster. J. Mol. Evol. (in press).
Jordan, I. K., I. B. Rogozin, G. V. Glazko, and E. V. Koonin. 2003. Origin of a substantial fraction of human regulatory sequences from transposable elements. Trends Genet. 19:68-72.
Kazazian, H. H. Jr. 2004. Mobile elements: drivers of genome evolution. Science 303:1626-1632.
Kidwell, M. G., and D. R. Lisch. 2001. Perspective: transposable elements, parasitic DNA, and genome evolution. Evol. Int. J. Org. Evol. 55:1-24.
Makalowski, W. 2003. Genomics. Not junk after all. Science 300:1246-1247.
McCarthy, E. M., and J. F. McDonald. 2004. Long terminal repeat retrotransposons of Mus musculus. Genome Biol. 5:R14.
McDonald, J. F. 1993. Evolution and consequences of transposable elements. Curr. Opin. Genet. Dev. 3:855-864.
Nekrutenko, A., and W. H. Li. 2001. Transposable elements are found in a large number of human protein-coding genes. Trends Genet. 17:619-621.
Ohno, S. 1970. Evolution by gene duplication. Springer-Verlag, New York.
Orgel, L. E., and F. H. C. Crick. 1980. Selfish DNA: the ultimate parasite. Nature 284:604-607.
Smit, A. F. 1999. Interspersed repeats and other mementos of transposable elements in mammalian genomes. Curr. Opin. Genet. Dev. 9:657-663.
van de Lagemaat, L. N., J. R. Landry, D. L. Mager, and P. Medstrand. 2003. Transposable elements in mammals promote regulatory variation and diversification of genes with specialized functions. Trends Genet. 19:530-536.
Waterston, R. H., K. Lindblad-Toh, E. Birney et al. (222 co-authors). 2002. Initial sequencing and comparative analysis of the mouse genome. Nature 420:520-562.
Wheeler, D. L., D. M. Church, S. Federhen et al. (11 co-authors). 2003. Database resources of the National Center for Biotechnology. Nucleic Acids Res. 31:28-33.