MBE Advance Access originally published online on March 10, 2006
Molecular Biology and Evolution 2006 23(6):1097-1100; doi:10.1093/molbev/msj122

© The Author 2006. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org

Letter

Two Families of Rep-Like Genes That Probably Originated by Interspecies Recombination Are Represented in Viral, Plasmid, Bacterial, and Parasitic Protozoan Genomes

Mark J. Gibbs*, Vladimir V. Smeianov†, James L. Steele†, Peter Upcroft‡ and Boris A. Efimov§

* School of Botany and Zoology, The Australian National University, Canberra, Australia; † Food Science, University of Wisconsin at Madison; ‡ The Australian Centre for International and Tropical Health and Nutrition, Queensland Institute of Medical Research, Brisbane, Queensland, Australia; and § Department of Microbiology, Russian State Medical University, Moscow, Russia

E-mail: mark.gibbs@anu.edu.au.


    Abstract
 
Two families of genes related to, and including, rolling circle replication initiator protein (Rep) genes were defined by sequence similarity and by evidence of recombination between gene families. The Rep genes of circoviruses were the best characterized members of the "RecRep1 family." Other members of the RecRep1 family were Rep-like genes found in the genomes of the Canarypox virus, Entamoeba histolytica, and Giardia duodenalis and in a plasmid, p4M, from the Gram-positive bacterium, Bifidobacterium pseudocatenulatum. The "RecRep2 family" comprised some previously identified Rep-like genes from plasmids of phytoplasmas and similar Rep-like genes from the genomes of Lactobacillus acidophilus, Lactococcus lactis, and Phytoplasma asteris. Both RecRep1 and RecRep2 proteins have a nucleotide-binding domain significantly similar to the helicases (2C proteins) of picorna-like viruses. On the N-terminal side of the nucleotide-binding domain, RecRep1 proteins have a domain significantly similar to one found in nanovirus Reps, whereas RecRep2 proteins have a domain significantly similar to one in the Reps of pLS1 plasmids. We speculate that RecRep genes have been transferred from viruses or plasmids to parasitic protozoan and bacterial genomes and that Rep proteins were themselves involved in the original recombination events that generated the ancestral RecRep genes.

Key Words: interspecies recombination • gene family • circovirus • RecRep1 • pLS1 plasmid • parasitic protozoan

Protein families often have complex origins and histories. Genes are created, duplicated, and lost, and similarly, gene subsequences that encode protein domains are duplicated, reordered, and lost. Operons and complete genes are also exchanged between different species by horizontal gene transfer or interspecies recombination. Until recently, there was no evidence that sequences that encoded only part of a protein, rather than a complete protein, could be exchanged between organisms to create new genes and, hence, proteins. Gibbs and Weiller (1999) reported that the replication initiator proteins (Reps) of circoviruses contained evidence of this process. These protein sequences had an N-terminal part (115 aa out of 312 aa) related to the Reps of nanoviruses and a C-terminal part (125 aa) related to the 2C proteins of picorna-like viruses. Nanoviruses and circoviruses have similar circular single-stranded DNA genomes (Meehan et al. 1997), but picorna-like viruses have linear single-stranded RNA genomes and do not produce a DNA form at any stage of their replication. Picorna-like viruses are not related in any way to nanoviruses and are related to circoviruses only through the 2C protein sequence. Hence, the circovirus Rep genes appeared to have originated through recombination, combining gene segments from unrelated viruses. In another report, Oshima et al. (2001) characterized a plasmid, pOYW, from the onion yellows phytoplasma, that had a Rep-like gene. The N-terminal half of the protein sequence (192 aa out of 377 aa) was related to the Rep proteins of pLS1 family plasmids and a C-terminal region (100 aa) was related to the Rep proteins of circoviruses and the helicases of picorna-like viruses.

The Reps of circoviruses, nanoviruses, and the pLS1 family plasmids are probably functionally similar (Gruss and Ehrlich 1989; Hafner et al. 1997; del Solar et al. 1998; Cheung 2004). They initiate rolling circle replication by binding at an origin of replication sequence, catalyzing a break (nick) in the plus strand, from which a host-encoded DNA polymerase extends to copy the complementary circle. The Rep probably becomes linked to the nicked DNA and cuts and ligates the copied DNA to reform single-stranded circles. Since the initial analyses of the circovirus and pOYW Reps, the genomes of many more organisms have been sequenced. Here we report the presence of related Rep-like genes in the genomes of a disparate set of organisms and plasmids. We have analyzed the relationships of the newly discovered genes and found that they belong to two families with unusual recombinant relationships; we have also identified conserved motifs in the proteins they encode.

We identified Rep-like genes by searching the GenBank databases (releases from 2000 to mid-2005) using the BlastP and PSI-Blast programs to detect protein sequences translated from the genes (Altschul et al. 1997). The circovirus and pOYW Rep proteins were used as the query sequences, as were the protein sequences translated from each of the new Rep-like genes as they were identified. From these searches, 11 new Rep-like genes were identified (table 1). Related sequences detected in these searches included the 174 known circovirus Reps and four known Rep-like genes in phytoplasma plasmids closely similar to pOYW. The new Rep-like genes included three apparently complete genes in each of the genomes of Entamoeba histolytica and Lactobacillus acidophilus. Single, apparently complete, Rep-like genes were also found in the genomes of Giardia duodenalis, Lactococcus lactis, Phytoplasma asteris, and the Canarypox virus and in a small double-stranded DNA plasmid, p4M, from B. pseudocatenulatum. Five other open reading frames were also detected that were most closely related to the complete new Rep-like genes but that lacked long regions of sequence at the 5' or 3' end. Such apparently truncated genes were found in all the listed genomes and one of the plasmids from a phytoplasma.
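
The search strategy described above, in which each newly identified sequence is reused as a query, amounts to a breadth-first traversal over search hits. The following is a minimal sketch of that idea only, not the authors' actual procedure: the `search` callable, the `TOY_HITS` table, and all sequence IDs are hypothetical stand-ins for a real BlastP or PSI-Blast run.

```python
from collections import deque
from typing import Callable, Iterable, Set


def iterative_search(seeds: Iterable[str],
                     search: Callable[[str], Iterable[str]]) -> Set[str]:
    """Collect every sequence ID reachable by re-querying with each new hit."""
    found = set(seeds)
    queue = deque(found)
    while queue:
        query = queue.popleft()
        for hit in search(query):
            if hit not in found:      # only unseen hits become new queries
                found.add(hit)
                queue.append(hit)
    return found


# Hypothetical hit table standing in for a database search (made-up IDs).
TOY_HITS = {
    "seed_rep": ["hit_a", "hit_b"],
    "hit_a": ["seed_rep", "hit_c"],
    "hit_b": [],
    "hit_c": ["hit_a"],
}


def toy_search(query: str) -> Iterable[str]:
    """Return the toy hits for a query; a real version would call BLAST."""
    return TOY_HITS.get(query, [])


result = iterative_search(["seed_rep"], toy_search)
print(sorted(result))
```

In practice the `search` function would also need a significance cutoff so that weak, spurious hits are not propagated as new queries.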


Table 1 Rep and Rep-Like Genes and Proteins and the Organisms in Which They Were Identified

 
Expect values (E values) obtained from BlastP database searches of the complete "nr" database (January 2005) were used to estimate the significance of similarities between the protein sequences translated from the Rep-like genes. This process was used to assess if the proteins, and therefore the genes, were homologues (Brenner, Chothia, and Hubbard 1998; Korf, Yandell, and Bedell 2003). The E values showed that the complete new Rep-like genes probably belonged to two groups (table 2). The Rep-like genes from the plasmid p4M, G. duodenalis, E. histolytica, and the Canarypox virus formed one group, the "RecRep1 family," that were most closely related to circovirus Reps (E < 1 × 10^-14) and showed no homology to plasmid Reps. The complete Rep-like genes from pOYW, L. lactis, L. acidophilus, and P. asteris formed a second distinct group, the "RecRep2 family," as they were closely related to each other, were related to some plasmid genes, and were only very distantly related to circovirus Reps (E > 0.01).
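
One way to picture how pairwise E values can separate sequences into groups is single-linkage grouping: two sequences are linked when their E value falls below a cutoff, and each connected set forms a group. The sketch below illustrates that general idea under stated assumptions; it is not the procedure used for table 2. The sequence names, the E values in `TOY_EVALUES`, and the cutoff of 1e-5 are all invented for illustration.

```python
from itertools import chain


def group_by_evalue(pair_evalues, cutoff):
    """Single-linkage grouping: link sequences whose pairwise E value < cutoff."""
    names = set(chain.from_iterable(pair_evalues))  # flatten the (a, b) keys
    parent = {n: n for n in names}

    def find(n):
        # Follow parent pointers to the group root, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for (a, b), evalue in pair_evalues.items():
        if evalue < cutoff:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for n in names:
        groups.setdefault(find(n), set()).add(n)
    return sorted(sorted(g) for g in groups.values())


# Made-up E values between hypothetical sequences (not data from the paper).
TOY_EVALUES = {
    ("seqA", "seqB"): 1e-20,
    ("seqB", "seqC"): 1e-16,
    ("seqC", "seqD"): 0.5,
    ("seqD", "seqE"): 1e-30,
    ("seqA", "seqE"): 0.02,
}

families = group_by_evalue(TOY_EVALUES, cutoff=1e-5)
print(families)
```

Single linkage is a deliberately simple choice here: one strong match is enough to pull a sequence into a group, so the cutoff has to be strict enough to exclude chance similarities.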


Table 2 E Values from BlastP Searches Indicating the Significance of the Similarities Among the Rep-Like Genes and the Rep Genes of Circoviruses and the pLS1 Family Plasmids

 
Evidence of recombination was detected by searching for matching conserved domains using the Conserved Domains database (Marchler-Bauer and Bryant 2004) and by searching the GenBank databases. The searches showed that the C-terminal regions of RecRep1 and RecRep2 proteins have similar affinities, but the N-terminal regions of the proteins are unrelated and have distinct affinities. When searches were done using complete RecRep1 or RecRep2 proteins, significant similarity was detected between a C-terminal region (116–123 aa long) and the conserved domain pfam00910 (table 3). Pfam00910 comprises the RNA helicases (2C proteins) of viruses in the picorna-like virus supergroup, which includes the Comoviridae, Caliciviridae, Picornaviridae, and Sequiviridae. Matches with individual helicase sequences from the supergroup were also detected, although they were generally weak. When using RecRep2 sequences, matches were also detected with individual circovirus Rep C-terminal sequences. The C-terminal sequence provided the only evidence of similarity between the RecRep1 and RecRep2 proteins.


Table 3 The Affinities of the N-Terminal and C-Terminal Regions of the RecRep Proteins As Estimated by E Value

 
N-terminal sequences provided a different picture of distinct affinities. Significant similarities were detected between an N-terminal region (95–110 aa long) of the RecRep1 proteins and the conserved domain pfam02407 (table 3). Pfam02407 comprises Reps from nanoviruses and other related viruses in the Nanoviridae. Significant matches were also identified with individual nanovirus and circovirus Reps across the same N-terminal region. By contrast, when searches were made with N-terminal regions of RecRep2 proteins, no homology to pfam02407, pfam00910, or individual nanovirus, circovirus, or picorna-like virus sequences was detected. Instead, significant matches were detected between an N-terminal region (110–130 aa long) and pfam01719 and with individual Reps from pLS1 family plasmids. Pfam01719 comprises replication proteins of the pLS1 (type 2) family of plasmids from Gram-positive bacteria.
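
The two domain architectures described above can be summarized as a simple rule: an N-terminal pfam02407 domain followed by a C-terminal pfam00910 domain indicates RecRep1, whereas an N-terminal pfam01719 domain followed by pfam00910 indicates RecRep2. The sketch below encodes that rule under stated assumptions; the `DomainHit` type, the `classify` function, and the coordinates in the usage lines are hypothetical, and real conserved-domain output would also need E-value filtering.

```python
from dataclasses import dataclass
from typing import Iterable

HELICASE = "pfam00910"   # 2C helicases of picorna-like viruses
NANO_REP = "pfam02407"   # Reps of nanoviruses and related viruses
PLS1_REP = "pfam01719"   # replication proteins of pLS1 family plasmids


@dataclass(frozen=True)
class DomainHit:
    """A conserved-domain match on a protein (hypothetical record type)."""
    pfam_id: str
    start: int
    end: int


def classify(hits: Iterable[DomainHit]) -> str:
    """Assign a family from the N- to C-terminal order of domain hits."""
    ordered = [h.pfam_id for h in sorted(hits, key=lambda h: h.start)]
    if ordered == [NANO_REP, HELICASE]:
        return "RecRep1"
    if ordered == [PLS1_REP, HELICASE]:
        return "RecRep2"
    return "unassigned"


# Illustrative coordinates only; they are not taken from table 3.
print(classify([DomainHit(HELICASE, 190, 310), DomainHit(NANO_REP, 5, 110)]))
print(classify([DomainHit(PLS1_REP, 10, 135), DomainHit(HELICASE, 250, 370)]))
```

Sorting by start position before comparing means the rule depends on domain order along the protein, which is the feature that distinguishes a recombinant architecture from two unrelated single-domain proteins.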

Alignments of the protein sequences translated from the RecRep1 and RecRep2 genes (see Figures 1 and 2, Supplementary Material online) made using T-Coffee (Notredame, Higgins, and Heringa 2000) supported the evidence of recombination and were consistent with the groupings indicated by the Blast analysis. The alignments showed that all the protein sequences included an NTP-binding site (Walker A motif) within the C-terminal domain (Walker et al. 1982). Both families also appeared to have the "Walker B" motif, although some lacked the residue that starts the motif, that is, arginine or lysine, and the spacing of the elements of the motif was inconsistent. Ilyina and Koonin (1992) reported three other motifs common to plasmids and some DNA viruses that replicate by the rolling circle mechanism, and it was widely believed that the circovirus and nanovirus Reps had the three motifs (Hafner et al. 1997; Mankertz and Hillenbrand 2001). Our alignments of the RecRep1 family proteins, including the circovirus Reps, showed that only the last of the three motifs was present, generally as "EYCSKE." The regions that were previously identified as matching motifs deviated substantially from the motif descriptions. Nanovirus Reps also lack the first two motifs. Oshima et al. (2001) noted that the pOYW Rep-like gene encoded a protein with the motifs described by Ilyina and Koonin (1992). We found that only the second and third of the three motifs were conserved in the RecRep2 proteins. We did identify some other conserved amino acid motifs. They included a motif of the form "(I/L)H(D/N)KD" that lay on the N-terminal side of the two-His motif in the RecRep2 family and two moderately conserved motifs in the N-terminal domain of the RecRep1 family, that is, "(K/R)RWxFT(I/L)NN" and "IxGxEx(4–5)TPHLQG."
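
The motif descriptions above translate directly into regular expressions, which gives a quick way to scan a protein sequence for them. The following is a minimal sketch: the motif labels and the `find_motifs` helper are hypothetical names, the Walker A pattern uses the commonly cited consensus GxxxxGK(S/T) rather than anything stated in this letter, and the test sequences are synthetic strings built to contain a motif.

```python
import re

# "x" in the motif notation means any residue, so it becomes "." in a regex.
MOTIFS = {
    "RecRep2_IHDKD": re.compile(r"[IL]H[DN]KD"),           # (I/L)H(D/N)KD
    "RecRep1_N1": re.compile(r"[KR]RW.FT[IL]NN"),          # (K/R)RWxFT(I/L)NN
    "RecRep1_N2": re.compile(r"I.G.E.{4,5}TPHLQG"),        # IxGxEx(4-5)TPHLQG
    "RecRep1_EYCSKE": re.compile(r"EYCSKE"),               # usual form of motif 3
    "WalkerA_consensus": re.compile(r"G.{4}GK[ST]"),       # assumed consensus
}


def find_motifs(sequence: str) -> dict:
    """Return {motif name: [0-based start positions]} for motifs that occur."""
    sequence = sequence.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        starts = [m.start() for m in pattern.finditer(sequence)]
        if starts:
            hits[name] = starts
    return hits


# Synthetic sequences, not real proteins.
print(find_motifs("MKRWVFTLNNAAA"))
print(find_motifs("AAIHDKDAA"))
print(find_motifs("IAGSEAAAATPHLQG"))
```

Exact pattern matching is strict: motifs described as only moderately conserved will be missed in sequences that deviate at a single position, so a profile or alignment-based search would be more sensitive.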

The distribution of RecRep genes in a variety of different plasmids and viruses and the existence of domain homologues in other extrachromosomal elements suggest that the RecRep gene lineages originated in such elements. The presence of RecRep genes in the genomes of a small number of disparate cellular organisms is probably explained by their spread with extrachromosomal elements, followed by integration. It seems likely that the RecRep proteins were themselves involved in the integration events as they probably have DNA binding, cutting, and ligating activity. It is also possible that Rep proteins had some role in the recombination events that generated the first ancestral RecRep1 and RecRep2 genes. There are, of course, alternatives to all these speculations about origins, and we are unsure about the functions of most of the Rep-like proteins. It is surprising to find representatives of these gene families in the genomes of parasitic protozoans. No viruses or plasmid-like DNAs from the groups mentioned here have been found associated with protozoans, although the presence of the RecRep genes suggests that the protozoans have been exposed to extrachromosomal elements that carry them. A search for such elements should now be made.


    Supplementary Material
 
Supplementary Figures 1 and 2 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).


    Footnotes
 
Laura Katz, Associate Editor


    References
 

    Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389–3402.

    Brenner, S. E., C. Chothia, and T. J. Hubbard. 1998. Assessing sequence comparison methods with reliable structurally identified distant evolutionary relationships. Proc. Natl. Acad. Sci. USA 95:6073–6078.

    Cheung, A. K. 2004. Detection of template strand switching during initiation and termination of DNA replication of porcine circovirus. J. Virol. 78:4268–4277.

    del Solar, G., R. Giraldo, M. J. Ruiz-Echevarria, M. Espinosa, and R. Diaz-Orejas. 1998. Replication and control of circular bacterial plasmids. Microbiol. Mol. Biol. Rev. 62:434–464.

    Gibbs, M. J., and G. F. Weiller. 1999. Evidence that a plant virus switched hosts to infect a vertebrate and then recombined with a vertebrate-infecting virus. Proc. Natl. Acad. Sci. USA 96:8022–8027.

    Gruss, A., and S. D. Ehrlich. 1989. The family of highly interrelated single-stranded deoxyribonucleic acid plasmids. Microbiol. Rev. 53:231–241.

    Hafner, G. J., M. R. Stafford, L. C. Wolter, R. M. Harding, and J. L. Dale. 1997. Nicking and joining activity of banana bunchy top virus replication protein in vitro. J. Gen. Virol. 78:1795–1799.

    Ilyina, T. V., and E. V. Koonin. 1992. Conserved sequence motifs in the initiator proteins for rolling circle DNA replication encoded by diverse replicons from eubacteria, eucaryotes and archaebacteria. Nucleic Acids Res. 20:3279–3285.

    Korf, I., M. Yandell, and J. Bedell. 2003. BLAST: an essential guide to the basic alignment search tool. O'Reilly and Associates, Sebastopol, Calif.

    Mankertz, A., and B. Hillenbrand. 2001. Replication of porcine circovirus type 1 requires two proteins encoded by the viral rep gene. Virology 279:429–438.

    Marchler-Bauer, A., and S. H. Bryant. 2004. CD-Search: protein domain annotations on the fly. Nucleic Acids Res. 32:W327–W331.

    Meehan, B. M., J. L. Creelan, M. S. McNulty, and D. Todd. 1997. Sequence of porcine circovirus DNA: affinities with plant circoviruses. J. Gen. Virol. 78:221–227.

    Notredame, C., D. G. Higgins, and J. Heringa. 2000. T-Coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 302:205–217.

    Oshima, K., S. Kakizawa, H. Nishigawa, T. Kuboyama, S. Miyata, M. Ugaki, and S. Namba. 2001. A plasmid of phytoplasma encodes a unique replication protein having both plasmid- and virus-like domains: clue to viral ancestry or result of virus/plasmid recombination? Virology 285:270–277.

    Walker, J. E., M. Saraste, M. J. Runswick, and N. J. Gay. 1982. Distantly related sequences in the alpha- and beta-subunits of ATP synthase, myosin, kinases and other ATP-requiring enzymes and a common nucleotide binding fold. EMBO J. 1:945–951.

Accepted for publication March 1, 2006.

