MBE Advance Access originally published online on September 8, 2005
Molecular Biology and Evolution 2006 23(1):86-92; doi:10.1093/molbev/msj010
Research Article
Evolutionary History of the Coccolithoviridae


* Plymouth Marine Laboratory, Prospect Place, The Hoe, Plymouth, United Kingdom;
Marine Biological Association, Citadel Hill, Plymouth, United Kingdom; and
The Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom
E-mail: whw{at}pml.ac.uk.
| Abstract |
|---|
We recently determined the genome sequence of the Coccolithoviridae strain Emiliania huxleyi virus 86 (EhV-86), a giant double-stranded DNA (dsDNA) algal virus from the family Phycodnaviridae that infects the marine coccolithophorid E. huxleyi. Here, we determine the phylogenetic relationship between EhV-86 and other large dsDNA viruses. Twenty-five core genes common to nuclear-cytoplasmic large dsDNA virus genomes were identified in the EhV-86 genome; sequences from eight of these genes were used to create a phylogenetic tree in which EhV-86 was placed firmly with the two other members of the Phycodnaviridae. We have also identified a 100-kb region of the EhV-86 genome that appears to have been transferred into this genome from an unknown source. Furthermore, the presence of six RNA polymerase subunits (unique among the Phycodnaviridae) suggests both a unique evolutionary history and a unique lifestyle for this intriguing virus.
Key Words: Phycodnaviridae Coccolithoviridae virus evolution NCLDV Emiliania huxleyi
| Introduction |
|---|
The category of virus is traditionally defined by biological characteristics, rather than by evolutionary roots. Classification was originally based on host range and morphology; however, it is now common to classify viruses primarily according to the type of genome (single- or double-stranded [ds] RNA or DNA) (Bamford, Burnett, and Stuart 2002).
Recently, whole-genome comparisons have identified a group of large dsDNA viruses that are likely to have shared a common ancestor (Iyer, Aravind, and Koonin 2001). The nuclear-cytoplasmic large double-stranded DNA virus (NCLDV) group is composed of at least five families that replicate in the nucleus and/or cytoplasm of eukaryotic cells (Poxviridae, Iridoviridae, Asfarviridae, Phycodnaviridae, and a newly proposed member, Mimiviridae). These diverse families are likely to have shared a common ancestor which encoded complex systems for DNA replication and transcription, a redox protein, and a possible inhibitor of apoptosis. Nine genes are found to be shared by genomes from all family members (Group I), and a further 22 are found in at least three of the four families (Groups II and III) (Iyer, Aravind, and Koonin 2001). It is thought that the ancestral NCLDV was likely to have had both nuclear and cytoplasmic phases of its life cycle (Iyer, Aravind, and Koonin 2001). Lineage-specific gene loss and gain within the NCLDV families is thought to contribute to the highly diverse characteristics of present-day forms.
Poxviruses, asfarviruses, and iridoviruses encode their own transcription and replication machinery and undergo their replication cycle entirely in the cytoplasm (poxviruses) or start in the nucleus and complete in the cytoplasm (asfarviruses and iridoviruses). Less is known about the members of the highly diverse Phycodnaviridae family (which contains the genera Chlorovirus, Coccolithovirus, Prasinovirus, Prymnesiovirus, Phaeovirus, and Raphidovirus [Wilson et al. 2005b]). Preliminary analysis of a limited number of genomes (the phaeovirus, Ectocarpus siliculosus virus 1 [ESV-1], and the chlorovirus, Paramecium bursaria chlorella virus 1 [PBCV-1]) suggested that members of the Phycodnaviridae were characterized by the loss of genes encoding RNA polymerases, leading to the hypothesis that they have predominantly nuclear life phases (Iyer, Aravind, and Koonin 2001). The recent sequencing of the Emiliania huxleyi virus 86 (EhV-86) genome (a coccolithovirus that infects the haptophyte E. huxleyi) has cast doubt on this assertion (Wilson et al. 2005a). EhV-86 encodes its own RNA polymerase; hence, the Phycodnaviridae family must be even more diverse than previously thought, and the EhV-86 lineage diverged from the ancestral Phycodnaviridae earlier than the Phaeovirus and Chlorovirus families. Furthermore, the recent sequencing of the mimivirus (Raoult et al. 2004) and its putative placement on a branch diverging prior to divergence of the Phycodnaviridae (based upon the absence of RNA polymerase genes) add further complexity to the history of NCLDV evolution. Here, we determine the evolutionary relationships between members of the NCLDV group including, for the first time, data from the recently sequenced EhV-86 genome. The phylogenetic analysis is based on six members of the Group I core proteins and the two large subunit RNA polymerases from Group III identified as being in the ancestral NCLDV genome (Iyer, Aravind, and Koonin 2001).
| Materials and Methods |
|---|
Viral Genome and Protein Sequence
Nucleotide sequences of the complete genomes of large dsDNA viruses and the corresponding predicted protein sequences were downloaded from the Virus Genomes division of the Entrez system (National Center for Biotechnology Information, http://www.ncbi.nlm.nih.gov/genomes/VIRUSES/viruses.html). The complete genomes of 19 viruses representing the diverse five families of NCLDV included in this analysis were from the following viruses: Asfarviridae: African swine fever virus (Yanez et al. 1995
Sequence and Phylogenetic Analysis
Protein sequences were compared using the BlastP and PSI-Blast programs (http://www.ncbi.nlm.nih.gov/BLAST). Conserved domains within the six members of the Group I proteins (D5-like ATPase, Pfam PF03288; DNA polymerase, Pfam PF00136; A32-like ATPase, SMART SM00382; A18-like helicase, Pfam PF00270; thiol-oxidoreductase; and D6R-like helicase, Pfam PF00176) and the two large RNA polymerase subunits (rpb1, SMART SM00663; rpb2, Pfam PF00562) from Group III were identified from the 19 viral genomes, and these were concatenated for phylogenetic analysis (www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml). Multiple alignments were performed using ClustalW (http://clustalw.genome.jp). Because only highly conserved domains were used, no further editing was required after alignment. Phylogenetic trees were constructed from the concatenated alignments using the programs in PHYLIP (phylogeny inference package) version 3.6b (Felsenstein 1989), and the robustness of the alignments was tested with the bootstrapping option (SeqBoot). Genetic distances, applicable for distance matrix phylogenetic inference, were calculated using the Protdist program in the PHYLIP package. Phylogenetic inferences based on the distance matrix (Neighbor) and parsimony (Protpars) algorithms were applied to the alignments. In both cases, the best tree or majority-rule consensus tree was selected using the consensus program (Consense). The trees were visualized and drawn using the TREEVIEW software version 2.1 (Page 1996).
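The concatenation and bootstrapping steps described above can be illustrated with a minimal Python sketch. This is not the PHYLIP SeqBoot code; it only shows the general idea of joining per-domain alignments into one sequence per virus and then resampling alignment columns with replacement. The function names (`concatenate_domains`, `bootstrap_replicate`) and the toy sequences are assumptions made up for illustration.

```python
import random

def concatenate_domains(domain_alignments):
    """Join per-domain aligned sequences into one string per taxon.

    domain_alignments: list of dicts mapping taxon name -> aligned sequence.
    Every dict must contain the same taxa, and all sequences within a
    dict must have the same length.
    """
    taxa = sorted(domain_alignments[0])
    for block in domain_alignments:
        if sorted(block) != taxa:
            raise ValueError("every domain block must cover the same taxa")
        if len({len(seq) for seq in block.values()}) != 1:
            raise ValueError("sequences within a block must be aligned")
    return {t: "".join(block[t] for block in domain_alignments) for t in taxa}

def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap data set)."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {t: "".join(seq[c] for c in columns) for t, seq in alignment.items()}

# Toy data invented for illustration only (not real viral sequences).
toy_blocks = [
    {"virusA": "MKV-L", "virusB": "MRVAL", "virusC": "MKIAL"},
    {"virusA": "GDE", "virusB": "GDE", "virusC": "G-E"},
]
concatenated = concatenate_domains(toy_blocks)
replicates = [bootstrap_replicate(concatenated, random.Random(seed))
              for seed in range(3)]
```

Each replicate has the same taxa and length as the concatenated alignment; in a real analysis every replicate would then be passed to the distance or parsimony step, and the resulting trees summarized as a consensus.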
| Results and Discussion |
|---|
We have previously used phylogenetic analysis of the DNA polymerase gene to propose that EhV-86 belongs to the then new genus "Coccolithovirus," within the family of algal viruses, Phycodnaviridae (Schroeder et al. 2002).
Identification of Core NCLDV Genes
BlastP searches of EhV-86 coding sequence (CDS) against NCLDV genomes were performed in order to identify NCLDV core genes. Homologues were identified for 9/9 Group I genes, 7/8 Group II genes, and 9/14 Group III genes in the EhV-86 genome (table 1). The pattern of presence/absence of Group I, II, and III core genes for the three sequenced phycodnavirus genomes (EhV-86, PBCV-1, and ESV-1), the mimivirus, and two examples from the iridoviruses (IIV-6 and LCDV), entomopoxviruses (AMEV and MSEV), and chordopoxviruses (VACV and MOCV) is shown in table 1.
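As a quick consistency check, the homologue counts in this paragraph can be tallied against the totals quoted elsewhere in the text (25 core genes in EhV-86; nine Group I genes plus a further 22 in Groups II and III). The snippet below is only arithmetic on the numbers reported here; the variable names are illustrative.

```python
# Homologue counts reported in the text: (found in EhV-86, total in group).
core_gene_counts = {
    "Group I": (9, 9),
    "Group II": (7, 8),
    "Group III": (9, 14),
}

total_found = sum(found for found, _ in core_gene_counts.values())
total_possible = sum(total for _, total in core_gene_counts.values())
groups_ii_and_iii = sum(
    total for name, (_, total) in core_gene_counts.items() if name != "Group I"
)

print(f"EhV-86 core genes found: {total_found} of {total_possible}")
print(f"Group II + III genes in total: {groups_ii_and_iii}")
```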
Homologues for Group I genes were easily identified by a basic BlastP search, except for ehv141, which showed a single significant match (E = 1 × 10⁻⁴⁴) to a Chilo iridescent virus genome helicase, 030L. A further PSI-Blast search revealed a significant hit to mimivirus R350, previously identified as a VV6R-type helicase (Raoult et al. 2004). The only Group II gene missing in EhV-86 encodes a putative myristoylated protein. A BlastP search did reveal a weak hit (E = 1.5) to a putative myristoylated protein in the mimivirus, but further PSI-Blast searches failed to confirm this. Therefore, in common with ESV-1, a significant myristoylated protein homologue has not been identified in the EhV-86 genome (table 1).
In common with the other NCLDV genomes, EhV-86 has a distinctive pattern for the presence/absence of Group III NCLDV genes. Of the five Group III genes that are missing from both previously sequenced Phycodnaviridae genomes, the EhV-86 genome contains homologues for four. Three of these are core transcription-related genes (RNA polymerase subunits rpb1, rpb2, and rpb10, encoded by ehv064, ehv434, and ehv167, respectively), which have not previously been found in the Phycodnaviridae. The identification of these genes indicates that the transcription of at least some EhV-86 genes can occur in the cytoplasm, an enticing hypothesis given further credence by the confirmed expression of CDSs found with a putative unique promoter element within the EhV-86 genome (Allen, Schroeder, and Wilson 2005; Wilson et al. 2005a). Indeed, not only are these three Group III core transcription-related genes present, but other RNA polymerase subunit homologues, rpb3 (ehv399), rpb5 (a Group IV NCLDV gene, ehv108), and rpb6 (ehv458), are also contained within the EhV-86 genome. Phylogenetic analysis of the RNA polymerase subunit genes shows that they are derived from the ancestral NCLDV RNA polymerase genes and are unlikely to have been acquired by horizontal gene transfer (fig. 1). The identification of RNA polymerase in EhV-86 also has implications for the present classification system of core genes. The genes for RNA polymerase subunits 1, 2, and 10 (and also the baculovirus inhibitor of apoptosis protein repeat [BIR] domain-containing gene) should now be regarded as Group II and not Group III core NCLDV genes because they are now known to be present in the Phycodnaviridae family (table 1).
The distinctive pattern of core gene loss/retention in NCLDV genomes suggests a complicated history of independent gene loss events and makes the reconstruction of the Phycodnaviridae lineage in particular (on the basis of gene loss events) difficult (table 1). In order to shed light on the history of the Phycodnaviridae, we performed phylogenetic characterization using the conserved domains from conserved core genes. This approach has been successfully used in many studies to determine phylogenetic relationships among the NCLDV genomes (Iyer, Aravind, and Koonin 2001
It appears as if the ancestral Phycodnaviridae lineage diverged with one branch giving rise to EhV-86 and the second branch giving rise to the PBCV-1 and ESV-1 lineages (fig. 2). We suggest that the trigger for this divergence was the loss of RNA polymerase function (through the loss of one or many RNA polymerase subunits). The change in lifestyle represented by this loss (i.e., nuclear-independent to nuclear-dependent transcription) could account for the high diversity among present-day Phycodnaviridae genera. Due to the presence of the RNA polymerase subunits, we believe, of all the phycodnaviruses sequenced to date, EhV-86 represents the virus with the lifestyle most similar to the ancestral phycodnavirus. The sequencing of more Phycodnaviridae genomes, in particular from the Prasinovirus, Raphidovirus, and Prymnesiovirus genera, will shed further light on this topic.
Distribution of NCLDV Homologues
We identified 25 Group I, II, and III genes in the EhV-86 genome. Intriguingly, these genes are physically located between 0-156 kbp and 330-407 kbp, with none found in the region 156-330 kb (fig. 3). This core gene-sparse region was previously identified as containing noncoding repeat elements, likely to be promoters, directly upstream of the start site of 87 predicted CDSs (Allen, Schroeder, and Wilson 2005; Wilson et al. 2005a) (fig. 3). Because strong expression has been shown for CDSs in this 100-kb region, these CDSs must play a crucial role(s) during infection by EhV-86 (Wilson et al. 2005a). Annotation of genes in this region reveals that the vast majority of CDSs are of unknown function, with little or no homology to anything in the GenBank database. Only one CDS, ehv230, has significant sequence similarity to anything in the other NCLDV genomes, with a single hit (E = 2 × 10⁻¹⁵) to PBCV-1 A50L. This region, however, has similar G + C content and codon usage to the rest of the genome. We postulate that an ancestral EhV-86 genome acquired this region from an as yet unknown source at some point after the Coccolithoviridae lineage diverged from the ancestral phycodnavirus.
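The comparison of G + C content between a candidate region and the rest of a genome can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function names are hypothetical, coordinates are assumed to be 0-based and end-exclusive, and the toy sequence is invented so that the two values differ visibly (unlike the EhV-86 region, which is reported to match the rest of the genome).

```python
def gc_fraction(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence has no unambiguous bases")
    return (seq.count("G") + seq.count("C")) / counted

def region_vs_rest_gc(genome, start, end):
    """Return (GC of genome[start:end], GC of everything outside it)."""
    if not 0 <= start < end <= len(genome):
        raise ValueError("region must lie inside the genome")
    region = genome[start:end]
    rest = genome[:start] + genome[end:]
    return gc_fraction(region), gc_fraction(rest)

# Toy sequence invented for illustration: a GC-poor block flanked by
# GC-rich blocks.
toy_genome = "ATGCGC" * 4 + "ATATGC" * 4 + "ATGCGC" * 4
region_gc, rest_gc = region_vs_rest_gc(toy_genome, 24, 48)
print(f"region G+C = {region_gc:.3f}, rest G+C = {rest_gc:.3f}")
```

A region recently acquired from a compositionally different source would be expected to stand out in such a comparison; similar values, as reported here, leave the origin of the region open.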
Although we cannot as yet discount the possibility that transfer occurred from the host to the virus, it is unlikely because no signal for these CDSs was detected in uninfected E. huxleyi cells during an EhV-86 microarray analysis (Wilson et al. 2005a).
| Conclusions |
|---|
The presence of six RNA polymerase subunits in the EhV-86 genome clearly shows a unique lifestyle for this Coccolithovirus. Whereas the previously sequenced Phycodnaviridae viruses appear to have (on the basis of their genomic content) predominantly nuclear lifestyles, it appears that EhV-86 has the capacity, at least, to transcribe parts of its genome in the cytoplasm. This clearly shows the presence of distinct subfamilies within the Phycodnaviridae family. We predict the Coccolithoviridae will eventually be renamed as the Coccolithovirinae to clearly identify them as a subfamily within the Phycodnaviridae. Furthermore, we have identified a 100-kbp region of the EhV-86 genome in which the CDSs have little or no homology to anything in the databases. No conserved core genes are found in this region. It is therefore likely that this region was acquired by an ancestral Coccolithovirus genome at some point after the divergence of the Coccolithovirus genus from the other Phycodnaviridae genera. Clearly, the evolution of the Coccolithovirus is complex, and the relevance of this large transfer of genomic information must be determined. The sequencing of more Coccolithoviridae genomes will provide further insights into this unique virus genus. As further genomic characterization of the Phycodnaviridae family is performed, we hypothesize that the current genera within the family will be recognized as distinct subfamilies in their own right on the basis of their high diversity.
| Acknowledgements |
|---|
The research was supported by the Environmental Genomics community program, funded by the Natural Environment Research Council of the United Kingdom (NERC), through award number NE/A509332/1 to W.H.W. D.C.S. is a Marine Biological Association of the United Kingdom Research Fellow funded by grant in aid from the NERC. W.H.W. is supported through the NERC-funded core strategic research program of the Plymouth Marine Laboratory.
| Footnotes |
|---|
Charles Delwiche, Associate Editor
| References |
|---|
Afonso, C. L., E. R. Tulman, Z. Lu, E. Oma, G. F. Kutish, and D. L. Rock. 1999. The genome of Melanoplus sanguinipes entomopoxvirus. J. Virol. 73:533-552.
Afonso, C. L., E. R. Tulman, Z. Lu, L. Zsak, G. F. Kutish, and D. L. Rock. 2000. The genome of fowlpox virus. J. Virol. 74:3815-3831.
Afonso, C. L., E. R. Tulman, Z. Lu, L. Zsak, F. A. Osorio, C. Balinsky, G. F. Kutish, and D. L. Rock. 2002. The genome of swinepox virus. J. Virol. 76:783-790.
Allen, M. J., D. C. Schroeder, and W. H. Wilson. 2005. Identification and preliminary characterisation of three distinct repeat families within the genome of Emiliania huxleyi Virus 86. Arch. Virol. (in press).
Bamford, D. H., R. M. Burnett, and D. I. Stuart. 2002. Evolution of viral structure. Theor. Popul. Biol. 61:461-470.
Bawden, A. L., K. J. Glassberg, J. Diggans, R. Shaw, W. Farmerie, and R. W. Moyer. 2000. Complete genomic sequence of the Amsacta moorei entomopoxvirus: analysis and comparison with other poxviruses. Virology 274:120-139.
Brunetti, C. R., H. Amano, Y. Ueda, J. Qin, T. Miyamura, T. Suzuki, X. Li, J. W. Barrett, and G. McFadden. 2003. Complete genomic sequence and comparative analysis of the tumorigenic poxvirus Yaba monkey tumor virus. J. Virol. 77:13335-13347.
Cameron, C., S. Hota-Mitchell, L. Chen, J. Barrett, J. X. Cao, C. Macaulay, D. Willer, D. Evans, and G. McFadden. 1999. The complete DNA sequence of myxoma virus. Virology 264:298-318.
Delaroque, N., D. G. Muller, G. Bothe, T. Pohl, R. Knippers, and W. Boland. 2001. The complete DNA sequence of the Ectocarpus siliculosus virus EsV-1 genome. Virology 287:112-132.
Delhon, G., E. R. Tulman, C. L. Afonso, Z. Lu, A. de la Concha-Bermejillo, H. D. Lehmkuhl, M. E. Piccone, G. F. Kutish, and D. L. Rock. 2004. Genomes of the parapoxviruses ORF virus and bovine papular stomatitis virus. J. Virol. 78:168-177.
Felsenstein, J. 1989. PHYLIP - phylogeny inference package (version 3.2). Cladistics 5:164-166.
Filee, J., P. Forterre, and J. Laurent. 2003. The role played by viruses in the evolution of their hosts: a view based on informational protein phylogenies. Res. Microbiol. 154:237-243.
Goebel, S. J., G. P. Johnson, M. E. Perkus, S. W. Davis, J. P. Winslow, and E. Paoletti. 1990. The complete DNA sequence of vaccinia virus. Virology 179:247-266, 517-563.
Iyer, L. M., L. Aravind, and E. V. Koonin. 2001. Common origin of four diverse families of large eukaryotic DNA viruses. J. Virol. 75:11720-11734.
Jakob, N. J., K. Muller, U. Bahr, and G. Darai. 2001. Analysis of the first complete DNA sequence of an invertebrate Iridovirus: coding strategy of the genome of Chilo iridescent virus. Virology 286:182-196.
Jancovich, J. K., J. Mao, V. G. Chinchar et al. (11 co-authors). 2003. Genomic sequence of a ranavirus (family Iridoviridae) associated with salamander mortalities in North America. Virology 316:90-103.
Li, Y., Z. Lu, L. Sun, S. Ropp, G. F. Kutish, D. L. Rock, and J. L. Van Etten. 1997. Analysis of 74 kb of DNA located at the right end of the 330-kb chlorella virus PBCV-1 genome. Virology 237:360-377.
Page, R. D. 1996. TreeView: an application to display phylogenetic trees on personal computers. Comput. Appl. Biosci. 12:357-358.
Raoult, D., S. Audic, C. Robert, C. Abergel, P. Renesto, H. Ogata, B. La Scola, M. Suzan, and J. M. Claverie. 2004. The 1.2-megabase genome sequence of Mimivirus. Science 306:1344-1350.
Schroeder, D. C., J. Oke, G. Malin, and W. H. Wilson. 2002. Coccolithovirus (Phycodnaviridae): characterisation of a new large dsDNA algal virus that infects Emiliania huxleyi. Arch. Virol. 147:1685-1698.
Senkevich, T. G., J. J. Bugert, J. R. Sisler, E. V. Koonin, G. Darai, and B. Moss. 1996. Genome sequence of a human tumorigenic poxvirus: prediction of specific host response-evasion genes. Science 273:813-816.
Shackelton, L. A., and E. C. Holmes. 2004. The evolution of large DNA viruses: combining genomic information of viruses and their hosts. Trends Microbiol. 12:458-465.
Tan, W. G., T. J. Barkman, V. Gregory Chinchar, and K. Essani. 2004. Comparative genomic analyses of frog virus 3, type species of the genus Ranavirus (family Iridoviridae). Virology 323:70-84.
Tidona, C. A., and G. Darai. 1997. The complete DNA sequence of lymphocystis disease virus. Virology 230:207-216.
Tulman, E. R., C. L. Afonso, Z. Lu, L. Zsak, J. H. Sur, N. T. Sandybaev, U. Z. Kerembekova, V. L. Zaitsev, G. F. Kutish, and D. L. Rock. 2002. The genomes of sheeppox and goatpox viruses. J. Virol. 76:6054-6061.
Wahlund, T. M., A. R. Hadaegh, R. Clark, B. Nguyen, M. Fanelli, and B. A. Read. 2004. Analysis of expressed sequence tags from calcifying cells of marine coccolithophorid (Emiliania huxleyi). Mar. Biotechnol. (NY). 6:278-290.
Wilson, W. H., D. C. Schroeder, M. J. Allen et al. (17 co-authors). 2005a. Complete genome sequence and lytic phase transcription profile of a Coccolithovirus. Science 309:1090-1092.
Wilson, W. H., J. L. Van Etten, D. S. Schroeder, K. Nagasaki, C. Brussaard, N. Delaroque, G. Bratbak, and C. Suttle. 2005b. Family: Phycodnaviridae. Pp. 163-175 in C. M. Fauquet, M. A. Mayo, J. Maniloff, U. Desselberger, and L. A. Ball, eds. Virus taxonomy, VIIIth ICTV report. Elsevier/Academic Press, London.
Yanez, R. J., J. M. Rodriguez, M. L. Nogal, L. Yuste, C. Enriquez, J. F. Rodriguez, and E. Vinuela. 1995. Analysis of the complete nucleotide sequence of African swine fever virus. Virology 208:249-278.