

MBE Advance Access originally published online on April 7, 2008
Molecular Biology and Evolution 2008 25(7):1321-1332; doi:10.1093/molbev/msn080

© The Author 2008. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org

Research Articles

The Capsid of the T4 Phage Superfamily: The Evolution, Diversity, and Structure of Some of the Most Prevalent Proteins in the Biosphere

André M. Comeau and Henry M. Krisch

Laboratoire de Microbiologie et Génétique Moléculaires, Centre National de la Recherche Scientifique—Université Paul Sabatier-Toulouse III, Toulouse, France

E-mail: krisch{at}ibcg.biotoul.fr.


    Abstract
 
The Escherichia coli bacteriophage T4 has served as a classic system in phage biology for more than 60 years. Only recently have phylogenetic analyses and genomic comparisons demonstrated the existence of a large, diverse, and widespread superfamily of T4-like phages in the environment. We report here on the T4-like major capsid protein (MCP) sequences that were obtained by targeted polymerase chain reaction (PCR) of marine environmental samples. This analysis was then expanded to include 1,000s of new sequences of T4-like capsid genes from the metagenomic data obtained during the Sorcerer II Global Ocean Sampling (GOS) expedition. This data compilation reveals that the diversity of the major and minor capsid proteins from the GOS metagenome follows the same general patterns as the sequences from cultured phage genomes. Interestingly, the new MCP sequences obtained by PCR targeted to MCP sequences in environmental samples are more divergent (deeper branching) than the vast majority of the MCP sequences coming from the other sources. The marine T4-like phage population appears to be largely dominated by the T4-like cyanophages. Using ~1,400 T4-like MCP sequences from various sources, we mapped the degree of sequence conservation on a structural model of the T4-like MCP. The results indicate that within the T4 superfamily there are some clear phylogenetic groups with regard to the more conserved and more variable domains of the MCP. Such differences can be correlated with variations in capsid morphology, the arrangement of the MCP lattice, and the presence of different capsid accessory proteins between the subgroups of the T4 superfamily.

Key Words: bacteriophage T4 • major capsid protein • evolution • structure • diversity • metagenomics


    Introduction
 
Bacteriophage T4 is a remarkably complex nanomachine that parasitizes and proficiently kills Escherichia coli. The T4 virion is assembled from a series of modular components including a large elongated head (~111 x 78 nm), a tail structure (~113 x 16 nm) that can be triggered to contract, and an intricate baseplate that incorporates the tail triggering mechanism and also the 6 long tail fibers that specifically bind to receptors on the surface of the host bacteria (reviewed in Leiman et al. 2003). Over 150 different phages infecting over 30 gram-negative bacterial species have a related morphology (Ackermann and Krisch 1997). The T4 genome is completely sequenced and the functions of the majority of its genes are known (Miller, Kutter, et al. 2003). The structure and function of many of T4's essential components have been dissected in genetic, biochemical, enzymological (Karam and Konigsberg 2000; Fokine et al. 2004, 2006; Letarov et al. 2005), crystallographic (Shamoo et al. 1995; Karam and Konigsberg 2000; Thomassen et al. 2003; Fokine et al. 2005), and electron micrographic studies (Leiman et al. 2003; Fokine et al. 2004, 2006; Rossmann et al. 2004).

Apart from T4 itself, a number of other distant T4-like phages have been studied in recent years, including Aeromonas spp. phages (Chow and Rouf 1983; Petrov et al. 2006; Comeau et al. 2007; Gibb and Edgell 2007), vibriophages (Matsuzaki et al. 1998, 1999, 2000; Miller, Heidelberg, et al. 2003), and cyanophages (Hambly et al. 2001; Mann et al. 2005; Sullivan et al. 2005; Weigele et al. 2007). These divergent T4-like phages infect hosts that are evolutionarily distant from the Enterobacteriaceae, the classical hosts of T4 and its closest relatives. They also vary somewhat from the classical T4 morphology, as the length of the head varies from smaller isometric forms (~85 x 85 nm) in the cyanophages (Hambly et al. 2001), to the more elongated forms (~140 x 80 nm) in some vibriophages and Aeromonas phages (Miller, Heidelberg, et al. 2003; Comeau et al. 2007) which contain larger genomes (>230 kb). There is variation in tail length as well, with the cyanophages having tails of up to ~180 nm in length (Hambly et al. 2001; Weigele et al. 2007). One of the criteria used to choose T4-like candidate phages for full genome sequencing was to assure representation of such morphological variants in the T4-like phage database and to include them in a full comparative analysis of the T4 superfamily (Nolan et al. 2006; Petrov et al. 2006; Comeau et al. 2007). Extensive phylogenetic analyses of these genomic sequences (Tétart et al. 2001; Desplats and Krisch 2003; Filée et al. 2006) have confirmed the existence of a large and extremely divergent superfamily of T4-like phages, a situation very similar to that now emerging for the unrelated T7 podovirus supergroup (Rohwer et al. 2000; Chen and Lu 2002; Hardies et al. 2003; Scholl et al. 2004).
Until quite recently, the T4 superfamily had been believed to contain only 4 subgroups: the "true" T-evens which are coliphages very closely related to T4; the morphologically similar Pseudo T-evens which are, nonetheless, phylogenetically diverged from T4 and infect a broader range of hosts; the Schizo T-evens which are yet more divergent, morphologically distinguishable from T4, and infect Aeromonas and Vibrio spp.; and finally, the Exo T-evens which are extremely distant from T4, morphologically distinct, and infect cyanobacteria and thermophilic eubacteria (Desplats and Krisch 2003). However, this "simple" scenario had to be expanded as yet more T4-like phages were found in the marine environment (Filée et al. 2005). It became clear that the different subgroups of T4 phages that were previously thought to be narrowly restricted in their environmental range actually have a much more widespread distribution. In addition, the examination of these marine sequences related to the T4 gp23 major capsid protein (MCP) has resulted in the identification of a much larger set of thus far uncharacterized subgroups of the T4 superfamily.

The environmental MCP sequencing project reported here was targeted on the most divergent of T4-like phages. Simultaneously with this effort, a massive amount of untargeted, metagenomic data became available from the Sorcerer II Global Ocean Sampling (GOS) expedition (Rusch et al. 2007). These metagenomic data gave us a unique opportunity to look at the diversity and evolution of an enormous and unselected compilation of MCP sequences. In addition, we could compare these metagenomic data with sequences obtained from our large-scale cultured phage genome sequencing project (Nolan et al. 2006; Petrov et al. 2006; Comeau et al. 2007) and also with the >100 sequences collected by our earlier targeted analysis of MCP sequences in diverse ocean samples (Filée et al. 2005). The resulting comparative analysis of more than a thousand homologs of the T4 MCP has allowed us to obtain a more complete representation of the vast diversity within the T4 superfamily.


    Materials and Methods
 
Aquatic Samples
Thirteen samples of the concentrated viral size fraction (Suttle et al. 1991) were prepared from seawater originating from the Gulf of Mexico, the western Arctic Ocean, the Chukchi Sea, and the fjords and bays of coastal British Columbia. Ten of these samples were identical to those used in our previous study (Filée et al. 2005); the additional 3 new samples came from the coast of British Columbia (#364, Salmon Arm; #420/481, Jericho Pier). Five freshwater samples were also examined, 2 from lakes Cultus and Chilliwack in British Columbia and one from the Canal du Midi near Toulouse in the southwest of France (43°33'39''N 1°28'29''E). Additional information on sample sites and sampling techniques is available elsewhere (Short and Suttle 2005; Comeau et al. 2006).

Environmental Polymerase Chain Reaction and Sequencing
Degenerate polymerase chain reaction (PCR) primers ScExoT-F (5'-CWC GTC AAY TGA AAG CTC AA-3'; positions 788–807 in T4 g23, GenBank NC_000866) and ScExoT-R (5'-AWT TKM AYA CCG TAR CGA GT-3'; positions 1423–1442) were designed on the basis of alignments of cultured T4 superfamily phages to preferentially target Schizo T-even and noncyanophage Exo T-even phages. Control PCR reactions with the resulting primers amplified Schizo T-even phages but neither T-even/Pseudo T-even phages nor cyanophages. PCR amplification, DNA purification, cloning, and sequencing of the amplified environmental sequences were carried out as described previously (Filée et al. 2005), except for the PCR cycling conditions, which were modified as follows: initial denaturation at 94 °C for 1 min; followed by 35 cycles of denaturation at 94 °C for 30 s, annealing at 45 °C for 1 min, and extension at 72 °C for 1 min; followed by final extension at 72 °C for 9 min. GenBank accession numbers for the novel sequences presented in this paper are EU236767–EU236786.
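As an illustration of what the degenerate positions in the two primers imply, the following minimal sketch expands the standard IUPAC ambiguity codes and counts how many distinct oligonucleotides each primer represents. It also computes the amplicon length expected on the T4 g23 reference from the stated primer coordinates. The function names are hypothetical and are not part of the original protocol.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as printed in the text (spaces are only visual grouping).
SCEXOT_F = "CWC GTC AAY TGA AAG CTC AA"
SCEXOT_R = "AWT TKM AYA CCG TAR CGA GT"


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.replace(" ", ""):
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.replace(" ", "")]
    return ["".join(combo) for combo in product(*pools)]


def amplicon_length(forward_start, reverse_end):
    """Inclusive length between the outer primer coordinates."""
    return reverse_end - forward_start + 1


if __name__ == "__main__":
    print(degeneracy(SCEXOT_F), degeneracy(SCEXOT_R))
    print(amplicon_length(788, 1442))
```

Under these assumptions, the primer coordinates on T4 g23 (788 to 1442) span 655 bp including the primers themselves, which falls within the 643 to 694 bp range reported for the environmental clones.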

Comparative Genomics and Phylogenetic Analyses
Environmental sequences were compared with complete (http://www.ncbi.nlm.nih.gov/genomes/static/phg.html) and draft (http://phage.bioc.tulane.edu) T4 superfamily genomes. The GOS data set (Rusch et al. 2007) was queried using the built-in Blast function of the CAMERA database (Seshadri et al. 2007). Sequence manipulations were carried out in BioEdit (http://www.mbio.ncsu.edu/BioEdit/bioedit.html); and alignments of small data sets were conducted using ClustalW (Thompson et al. 1994). Alignments of large data sets could not be handled by traditional alignment programs and had to be performed using the MUSCLE program (Edgar 2004) using a gap opening penalty of -3 and a gap extension penalty of -0.275. The degree of conservation in alignments (amino acid identity) was calculated using ProtSkin (http://www.mcgnmr.ca/ProtSkin). Protein secondary structure was predicted using NNPREDICT (Kneller et al. 1990) through the STRAP interface (Gille and Frommel 2001). The MCP phylogenetic tree was constrained, given the limitations imposed by the size of the data set, to the protein distance and Neighbor-Joining methods, all carried out using the PHYLIP v3.66 package (http://evolution.genetics.washington.edu/phylip.html). The Jones-Taylor-Thornton (JTT) model was used for protein distance calculations and a nearly full-position alignment was used for the MCP tree construction given the availability of complete, or near full-length, sequences (for both cultured phages and the metagenome) and given the sheer depth of coverage of overlapping fragments for the GOS data. However, 8 common gaps at final alignment positions 1–25 (N-terminus), 156–166, 185–212, 369–380, 390–394, 476–480, 511–520, and 624–651 (C-terminus) were removed before construction of the MCP tree (527 residue final length).
During construction of the tree, sequences from 4 cultured phages (MV9a/b, MV12, and MV13) as well as 2 environmental sequences (CS43 and GOS-1131) had to be removed from the analysis because they created unusually long branches, indicating possible PCR chimeras, which were verified using Bellerophon (Huber et al. 2004). The MCP tree was rooted using the RM378 sequence, not only for presentational reasons but also because this sequence was the most divergent characterized MCP sequence currently available.
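The column masking described above can be checked with a short sketch. It assumes that the full alignment is 651 columns long (that is, that the last masked block ends at the final column) and that positions are 1-based and inclusive; the helper names are hypothetical.

```python
# Masked alignment blocks as listed in the text (1-based, inclusive).
GAP_BLOCKS = [
    (1, 25), (156, 166), (185, 212), (369, 380),
    (390, 394), (476, 480), (511, 520), (624, 651),
]

# Assumption: the alignment ends at column 651, the end of the last block.
ALIGNMENT_LENGTH = 651


def kept_columns(length, blocks):
    """Return the 1-based alignment columns that survive masking."""
    masked = set()
    for start, end in blocks:
        masked.update(range(start, end + 1))
    return [col for col in range(1, length + 1) if col not in masked]


def strip_columns(aligned_sequence, blocks):
    """Remove the masked columns from one aligned sequence."""
    keep = kept_columns(len(aligned_sequence), blocks)
    return "".join(aligned_sequence[col - 1] for col in keep)


if __name__ == "__main__":
    print(len(kept_columns(ALIGNMENT_LENGTH, GAP_BLOCKS)))
```

With these inputs the eight blocks cover 124 columns, leaving 527, which matches the final alignment length stated in the text.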

MCP Homology Modeling
The homology model (Dunbrack 2006; Ginalski 2006) of the core of RB49 gp23 was constructed using the SWISS-MODEL server (Arnold et al. 2006) and the DeepView protein modeling/assessment program (Schwede et al. 2003) using the solved T4 gp24 structure (Protein Data Bank: 1YUE) (Fokine et al. 2005) as a template. Portions of the protein not present in the template were modeled using HMMSTR (Bystroff and Shao 2002) (extra N-terminal portion, A65-T84) and ModLoop (Fiser and Sali 2003) (novel loops L135-A151, E160-F171, N200-S213, G235-S250, T332-K336, R374-G381, and K418-A422). A portion of the RB49 sequence (loop L156-I180) was removed from the model building due to its very poor conservation among the ~1,400 gp23 sequences. Model quality was assessed within DeepView and using Verify3D (Eisenberg et al. 1997) and VADAR (Willard et al. 2003). The final model had 93.1% of residues within favorable Ramachandran areas (ignoring glycines), compared with 96.9% for the solved 1YUE structure, and had a final root-mean-square (RMS) distance to 1YUE of 2.00 Å across 353 corresponding residues. Final RMS between our model and a previous model (Protein Data Bank: 1Z1U) proposed by Fokine et al. (2005) was 3.55 Å across 304 corresponding residues (the "Insertion" domain was excluded due to slightly different predicted positions). Protein alignment conservation was mapped onto the model structure using ProtSkin (http://www.mcgnmr.ca/ProtSkin) and visualized in DeepView. Conservation groups were delimited from the phylogenetic tree and were as follows: the "Near T4" group (46 sequences) was composed of all the cultured phages, except the Exo T-evens, plus 13 environmental clones from Filée et al. (2005); the "Far T4" group (277 sequences) was composed of RM378, plus 19/20 of the new clones from this study, clone 37510 from Filée et al. (2005), and the 256 RM378-like GOS hits obtained using either RM378 or our clone ScExo373-21 as queries; finally, the "Cyano T4" group (1,077 sequences) was composed of all cultured cyanophages, plus one of our clones (ScExo420-3), 70 Filée et al. (2005) clones, and the top 1,000 GOS hits against RB49 gp23.
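For readers unfamiliar with the RMS distance quoted above, the sketch below shows the underlying calculation on paired atomic coordinates. It is only an illustration: it assumes the two structures are already superimposed and that the corresponding atoms (for example, one atom per residue) have been paired in the same order. The procedure actually applied within DeepView is not described in the text.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square distance between two equally long lists of
    (x, y, z) coordinates, paired by index.

    Assumes the structures are already superimposed; no fitting is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Toy example: every atom displaced by 2 Å along x.
    model = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
    template = [(2.0, 0.0, 0.0), (3.0, 1.0, 1.0)]
    print(rmsd(model, template))
```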


    Results and Discussion
 
Capsid Genes in Cultured T4-Like Phage Genomes
The mature T4 capsid shell (or lattice) is composed essentially of only 4 proteins (fig. 1A): gp23* ("*" indicates the processed, mature form of the gene 23 protein), the MCP; gp24*, the vertex protein; highly immunogenic outer capsid (Hoc), an accessory protein which protrudes from the surface of the capsid; and small outer capsid (Soc), a small protein (80 aa) which binds to the junctions between gp23* hexamers creating a protein "grid" on the surface of the capsid (reviewed in Leiman et al. 2003). A fifth component, one dodecamer of gp20, the portal protein, is located at the unique vertex that attaches the capsid to the contractile phage tail structure and does not participate in forming the remainder of the capsid lattice.


Figure 1
 
FIG. 1. Organization and comparative genomics of the T4 phage capsid shell lattice proteins. (A) Schematic diagram of the phage capsid lattice in T4 wild type and in the various single and multiple mutants in the capsid genes (Iwasaki et al. 2000; Fokine et al. 2004, 2006). The simplified structure of the triple mutant (g24 hoc soc) has not yet been experimentally confirmed, but given the presence of only gp23 in this lattice, this seems to be the most plausible configuration. The components of the T4-like phage lattice are the hexamer of gp23*, processed MCP; the pentamer of gp24*, processed vertex protein; Hoc, highly immunogenic outer capsid protein; and Soc, small outer capsid protein. One dodecamer of gp20, the portal protein, replaces gp24* at the unique vertex that attaches the capsid to the contractile phage tail structure. Note that 5 hexamers of gp23* surround each vertex, but 6 hexamers surround each individual gp23* hexamer on the capsid faces (i.e., between vertices). Adapted from Fokine et al. (2006). (B) Comparative genomics of the capsid genes in T4 superfamily genomes, showing the size, orientation, and context of the four principal proteins (in black, dark gray, or double-lined arrows). Additional T4 genes are in light gray with the genes bordering the capsid genes labeled to indicate their identity. The presence of known/putative mobile elements (e.g., mobile endonucleases) is indicated by hatching and genes not present in the T4 genome (frequently orphans) are in white. The presence of a possible extra gene not included in the original KVP40 analysis (Miller, Heidelberg, et al. 2003) is indicated by the dashed arrow with a "?". Diagonal lines indicate inversions with respect to the genes/synteny of the preceding phage. Distances between genes/clusters are indicated in kb, with the distance separating the soc cluster from the rest in the first four phages ranging from 84 to 91 kb (between the arrows).

 
Only the MCP, which represents 55% of the mass of the capsid (Leiman et al. 2003; Fokine et al. 2004), and the portal protein appear to be required for capsid shell formation. Experiments with T4 mutants in Soc and Hoc (fig. 1A) indicate that these functions are facultative (Miller, Kutter, et al. 2003). Gene 23 can substitute for gene 24 function if it has certain single-amino acid mutations that allow it to occupy the vertex positions normally filled by gp24* (Fokine et al. 2005, 2006). Such a functional substitution is not surprising because the genes encoding gp23 and gp24 result from an ancient duplication event and they retain nearly 31% protein sequence similarity (17% identity) in T4. The function of the accessory Hoc protein is unclear, although recently Dabrowska et al. (2006) have suggested a role for Hoc in T4's evasion of the mammalian immune system. Various large recombinant proteins have been fused to Hoc and Soc to create a chimeric production/display system on the T4 capsid without apparent detriment to phage capsid function or phage viability (Ren et al. 1996; Jiang et al. 1997; Ren and Black 1998; Sathaliyawala et al. 2006; Shivachandra et al. 2006). The other accessory capsid protein, Soc, is nonessential, but confers added stability to the phage capsid structure, protecting it from osmotic shock, alkaline pH, and temperature denaturation (Ishii et al. 1978; Steven et al. 1992; Iwasaki et al. 2000). Thus, in particularly harsh conditions, the presence of Soc protein, or its functional analogs, could be advantageous.

A survey (fig. 1B) of the T4 superfamily genomes that are completely sequenced reveals that only 6 of them (T4, T6, RB14, RB32, RB69, and JS98) have homologs of all 5 of the T4 capsid shell proteins. All these coliphages are very closely related to T4. The Soc protein appears to be the shell protein that is the least present in T4-like phages. However, the small size of the Soc protein (~80 aa) makes it difficult to identify distant homologs. There are 10 Soc homologs in the databases, and among these, only the trio of RB14, RB69, and JS98 sequences have modestly diverged from T4 (at both termini). Like Soc, Hoc is a facultative protein that is frequently missing in the T4-like phage genomes (fig. 1B), but it is retained in slightly more phages (12) than the former. With the significant exception of the Exo T-even subgroup, the gp24 capsid vertex protein is found in all T4 superfamily phages for which we have sufficiently complete genomic sequence data to make an evaluation, including the recently completed JS98 (Zuber et al. 2007) which shows a duplication of its g24. The Exo T-even subgroup includes 4 cyanophage genomes (P-SSM2, P-SSM4, S-PM2, and Syn9) (Mann et al. 2005; Sullivan et al. 2005; Weigele et al. 2007) and the genome of phage RM378 which infects Rhodothermus marinus, a thermophilic eubacterium (Hjorleifsdottir et al. 2002; Blondal et al. 2003). These cyanophages and RM378 have isometric icosahedral capsid structures, unlike the prolate icosahedral form of the other T4-like phages. This important morphological difference may reflect the lattice composition of these phages that apparently have only gp23* and gp20 homologs. Possibly because of a slight structural alteration in their gp23* subunits, they could have obviated the requirement for a distinct vertex protein (see above).
Alternatively, these distant isometric T4-like phages may use a vertex protein analog instead of a gp24 homolog, as has been suggested by the cryoEM reconstruction and genomic analysis of Syn9 (Weigele et al. 2007).

All the T4 superfamily phages have both the MCP and the gp20 portal proteins. The latter, which is involved in DNA packaging, also serves as the initiator complex for prohead formation and connects the capsid to the tail structure (Leiman et al. 2003). Sequences in both of these highly conserved genes have been used for the PCR amplification of T4-like sequences in the environment (Zhong et al. 2002; Dorigo et al. 2004; Filée et al. 2005; Jia et al. 2007). The size of the MCPs from the cultured phages for which we have full-length sequences (27; supplementary fig. S1, Supplementary Material online) varies somewhat (from 412 to 560 aa; a 36% difference), but overall the protein is quite well conserved, with ~74% of the protein alignment showing at least 50% similarity. The average identity among the 27 cultured phages over the full length of the protein is ~53% and there is only one major variable segment, located approximately between residues 210–290 (supplementary fig. S1, Supplementary Material online). This variable region is flanked by the primers we normally used to amplify environmental g23 sequences, leading to the observation of variable-size PCR amplicons (Filée et al. 2005).
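The summary statistics quoted in this paragraph can be illustrated with a small sketch. The exact definitions used in the original analysis are not given, so two assumptions are made here: identity is computed over alignment columns where neither sequence has a gap, and the size spread is expressed relative to the smallest protein (a reading that reproduces the 36% figure from the 412 and 560 aa endpoints). The toy sequences are invented for illustration only.

```python
from itertools import combinations


def pairwise_identity(seq_a, seq_b, gap="-"):
    """Fraction of identical residues over columns where neither
    aligned sequence has a gap (an assumed definition)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    matches = 0
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            matches += 1
    return matches / compared if compared else 0.0


def mean_identity(aligned_sequences):
    """Average pairwise identity over all sequence pairs."""
    pairs = list(combinations(aligned_sequences, 2))
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


def size_spread(min_length, max_length):
    """Length difference relative to the smallest protein."""
    return (max_length - min_length) / min_length


if __name__ == "__main__":
    toy_alignment = ["MKTA", "MKSA", "MR-A"]  # hypothetical sequences
    print(round(mean_identity(toy_alignment), 3))
    print(round(size_spread(412, 560) * 100))
```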

Marine Diversity of T4-Like Capsid Genes
In an attempt to "fill in" the T4 superfamily phylogenetic tree, which had an unexplained paucity of culture-independent sequence representatives of the Schizo T-even and noncyanophage Exo T-even subgroups (Filée et al. 2005), which include phages such as RM378 of the thermophilic eubacterium R. marinus (Hjorleifsdottir et al. 2002), we designed a new set of degenerate MCP primers (ScExoT-F and ScExoT-R) with enhanced specificity for these subgroups. This new primer set was used in PCR reactions using as template the 10 marine samples of concentrated viral size fraction from our previous study (Filée et al. 2005) as well as 3 new samples from other marine habitats. Given that some cultured Schizo T-even phage hosts have been isolated from freshwater (Aeromonas spp.) (Ackermann and Krisch 1997), we also attempted PCR amplification on 5 samples from 2 British Columbia lakes (Cultus and Chilliwack) and a canal in the southwest of France (Canal du Midi). Of these 18 samples, only 3 were positive for PCR amplification (all marine; data not shown), with one sample showing a single band of the incorrect size which was later determined to be spurious. The PCR amplified fragments from the remaining 2 positive samples (#373, Salmon Arm; #420, Jericho Pier) were cloned and analyzed. These gave 20 unique g23-like sequences ranging in size from 643 to 694 bp. Surprisingly, almost all of these sequences (19/20) were most closely related to the phage RM378 (51–65% protein similarity), the exceptional sequence being related to the cyanophage P-SSM2 (80% protein similarity). The 19 unique RM378-like nucleotide sequences were 91% identical among themselves, indicating a surprisingly low level of diversity in the 2 samples.
Although we cannot exclude the possibility of amplification or sampling artifacts, the small number of positive samples, and the absence of Schizo T-even sequences in these, raises the possibility that Schizo T-even phages may be much rarer in nature than the other T4 superfamily subgroups. Alternatively, the few cultured Schizo T-even sequences used to design the primers may simply not be representative of the subgroup as a whole. This may be the case given that the host bacteria for these phages (Aeromonas and Vibrio spp.) are often extremely abundant in the culturable fraction of the bacterioplankton, yet represent only a minor fraction of the total, culture-independent bacterial biomass (Thompson et al. 2004).

Having completed this focused environmental sequencing, we combined this additional data with the recently released Sorcerer II GOS expedition metagenomic data (Rusch et al. 2007). This massive amount of culture-independent sequence gave us a unique opportunity to go beyond a relatively modest targeted effort to fill in the denuded branches of our MCP phylogenetic tree by permitting us to analyze the sequence diversity of literally 1,000s of gp23 homologs isolated from the global marine sampling. An additional incentive to make this compilation came from the astonishing fact that 5 of the 6 most overrepresented protein families in the GOS metagenome were either T4 structural proteins or enzymes (with gp23 actually being ranked first) (Yooseph et al. 2007). Performing simple Blast searches of the GOS metagenome using various T4 capsid proteins as the query sequences (table 1) confirmed the omnipresence of the marine members of the T4 superfamily. Gp20 and gp23 homologs are by far the most abundant sequences in the database, with over 3,000 hits each (E < 10^-4), and are fairly evenly distributed among ~45 sampling sites. The gp24 vertex protein is the next most abundant capsid protein, with ~900 homologs but with generally lower E values than those obtained for the first 2 proteins. This presumably reflects a lower level of gp24 sequence conservation and the possible replacement of the vertex protein function in many phages (e.g., Exo T-evens) by analogs. The Hoc protein sequence had only ~110 database hits and the Soc protein had no hits in the GOS metagenome, even using a very permissive cutoff (E < 10^-2). As mentioned previously, the small size of this latter protein may make it difficult to identify homologs or, alternatively, this function may be supplied in distant phages by analogs.
In summary, the abundance of the capsid shell proteins in the GOS metagenome follows the same pattern of gene preservation as the cultured phage genomes, with Soc being the sequence least present in the metagenome, followed by Hoc, then gp24, and finally by gp23 and gp20 that are the most frequent sequences having nearly 4 times more hits in the metagenome than all the others. A comparable situation occurs for the proteins that are located internal to the T4 phage capsid, that is, proteins which form part of the scaffold during prohead assembly or serve functions in host takeover/DNA defense once they are injected with the phage DNA during infection (Leiman et al. 2003; Comeau and Krisch 2005; Depping et al. 2005). The gp21 prohead protease, responsible for maturing many of the T4 capsid proteins as outlined above, and the essential gp22 scaffold protein (Black et al. 1994) both have strikingly different metagenomic profiles than the other internal capsid proteins. These 2 proteins are well conserved within the T4 superfamily, which correlates with their abundance (~1,400–1,800 hits) in the GOS metagenome, but many of the other internal proteins of the capsid have no hits in the GOS metagenome (table 1) which is probably related to their moderate to low conservation even within the known members of the T4 superfamily.
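The hit counts discussed above amount to counting, for each query protein, the database sequences that pass an E-value cutoff. A minimal sketch of that tally is given below. It assumes the search results have been exported in the common 12-column tab-separated BLAST layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score); the format actually produced by the CAMERA interface is not described in the text, and the function name is hypothetical.

```python
import csv
import io
from collections import Counter


def count_hits(tabular_text, max_evalue=1e-4):
    """Count distinct subject sequences per query with E-value below
    the cutoff, given 12-column tab-separated BLAST output (assumed)."""
    counts = Counter()
    seen = set()
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment headers
        query, subject, evalue = row[0], row[1], float(row[10])
        # Count each query/subject pair once, even with several alignments.
        if evalue < max_evalue and (query, subject) not in seen:
            seen.add((query, subject))
            counts[query] += 1
    return counts


if __name__ == "__main__":
    # Invented rows for illustration; not real GOS results.
    example = "\n".join([
        "gp23\tread1\t45.0\t200\t100\t2\t1\t200\t5\t204\t1e-30\t120",
        "gp23\tread2\t30.0\t80\t50\t1\t1\t80\t1\t80\t0.5\t20",
        "hoc\tread3\t35.0\t90\t55\t1\t1\t90\t1\t90\t1e-5\t40",
    ])
    print(count_hits(example))
```

Passing a looser `max_evalue` (for example `1e-2`) corresponds to the more permissive cutoff mentioned for Soc.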



 
Table 1 Protein Composition of the Mature T4 Capsid and Associated Diversity in the GOS Data Set

 
Diversity of the MCP (gp23)
The MCP is the foundation of both the T4-like phage's capsid structure and the operant phylogeny of the T4 superfamily (Desplats and Krisch 2003; Filée et al. 2005). By combining the large amount of very diverse sequence data now available, we hoped to obtain an overview of how the structure of this important protein has evolved in the different branches of the superfamily. Thus, a comprehensive alignment was made using all available cultured phage gp23 sequences (44), all previous marine environmental sequences (85), the new sequences reported in this study (20), along with the top 1,000 GOS hits against RB49 gp23 (not T4, for reasons explained below), the top 100 GOS hits against RM378 gp23 (to capture very divergent sequences), and the top 156 GOS hits against one of our new sequences (ScExo373-21) to include additional extremely distant homologs. This ensemble of data represented a total of 1,405 gp23 sequences, but of these, only 1,399 sequences were retained for the final analysis after 6 "problematic" sequences were discarded (see Materials and Methods). The resulting alignment (fig. 2) shows fairly good conservation along most, but not all, of the protein sequence, mirroring the conclusions previously obtained by analysis of the MCPs from cultured phages. The existence of the significant variable region (positions ~150–300 in the current alignment) was fully confirmed and, in addition, there is some sequence plasticity at both extremities of the protein. It is also evident from visual inspection that there are clear-cut sequence subgroups. First, there is separation of the cultured phage sequences (C) from nearly all the others, with the exception of a few of the Filée et al. (2005) environmental sequences (F). This is mostly the result of the sequence in the variable region (centered around position ~225) being somewhat longer in the cultured phages than in the T4-type phages isolated in the wild.
There is also a striking cohesion among the new environmental sequences reported in this study (ScExo) with the 256 GOS hits (GOS lines immediately above ScExo) resulting from using either phage RM378 or the sequence ScExo373-21 as queries. Finally, there is also strong cohesion among the top 1,000 GOS sequences resulting from using RB49 gp23 as the query (GOS).
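The composition of the gp23 data set can be reconciled with a few lines of arithmetic. The counts below are taken from the text; the split of the 6 discarded sequences into 4 cultured-phage and 2 environmental sequences follows Materials and Methods. The dictionary keys are illustrative labels only.

```python
# Sequence counts per source, as stated in the text.
SOURCES = {
    "cultured phages": 44,
    "earlier environmental clones": 85,
    "new clones (this study)": 20,
    "GOS hits vs RB49 gp23": 1000,
    "GOS hits vs RM378 gp23": 100,
    "GOS hits vs ScExo373-21": 156,
}

# Sequences discarded as problematic (from Materials and Methods).
DISCARDED = {
    "cultured phages": 4,  # MV9a, MV9b, MV12, MV13
    "environmental": 2,    # CS43, GOS-1131
}


def total(counts):
    """Sum the values of a count dictionary."""
    return sum(counts.values())


def gos_total(sources):
    """Sum only the entries derived from the GOS metagenome."""
    return sum(n for name, n in sources.items() if name.startswith("GOS"))


if __name__ == "__main__":
    print(total(SOURCES))                     # sequences compiled
    print(total(SOURCES) - total(DISCARDED))  # sequences retained
    print(gos_total(SOURCES))                 # GOS-derived sequences
```

The totals come to 1,405 compiled and 1,399 retained, in agreement with the text, and the 1,256 GOS-derived sequences are consistent with the "~1,260 hits" cited in the legend of figure 2.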


Figure 2
 
FIG. 2. Alignment, conservation, and secondary structure of ~1,400 T4-like MCPs from cultured phages and environmental samples. (A) Graphical presentation of the alignment of MCP sequences from cultured phages and environmental samples, including ~1,260 hits from the GOS metagenome, with the amino acids colored by type according to the BLOSUM62 matrix (Henikoff S and Henikoff JG 1992) in order to reveal their patterns of conservation. The cultured phage sequences are grouped together and indicated by a "C." The earlier environmental sequences from Filée et al. (2005) are indicated by an "F," and the new environmental sequences from this study are indicated by "ScExo." Finally, the predominant GOS sequences are indicated by "GOS." (B) Plot of the amino acid conservation of the alignment presented in (A). The degree of conservation is based on amino acid identity and was calculated using ProtSkin (www.mcgnmr.ca/ProtSkin). Common gaps have been masked using dark gray bars. (C) Secondary structure prediction for the alignment presented in (A). The α-helix (red) and β-strand (yellow) composition of the proteins was predicted using NNPREDICT (Kneller et al. 1990) through the STRAP interface (Gille and Frommel 2001).

 
These clear general trends were reinforced in quantitative detail by a Neighbor-Joining phylogeny of all ~1,400 gp23 sequences (fig. 3). At the "top" of the tree, all the cultured phages other than the Exo T-evens (cyanophages, RM378, and ΦSMB14) are well grouped together, along with about half of the Filée et al. (2005) environmental sequences (46 total). These sequences form a phylogenetic group representing the Near T4 phages. Close to the root of the tree are the RM378-like sequences, the majority of the new sequences obtained in the present study, and 256 GOS sequences that form several clusters that are very divergent from the rest of the known sequences. These sequences form a Far T4 phylogenetic group and, except for RM378, there are no cultured representatives in this part of the T4-like phylogenetic space. Clearly, much work remains to be done on these most divergent members of the T4 superfamily, which include the Exo T-even subgroup. The only Exo T-even phages for which we do have substantial ecological (Mann 2003) and genomic information (Mann et al. 2005; Sullivan et al. 2005; Weigele et al. 2007) are the cyanophages, and it is clear that these dominate the T4-superfamily tree and, therefore, the GOS metagenome. The Cyano T4 group includes all the cultured cyanophages, the other half of the Filée et al. (2005) environmental sequences, the clone ScExo420-3 from this study, and all of the top 1,000 GOS hits generated using RB49 (a coliphage, not a cyanophage) as the query sequence. The "microdiversity" of cyanophages appears to be extremely high, with dozens of closely related sequence subgroups with fairly short branch lengths. The predominance of T4-like cyanophages in the GOS metagenome is illustrated by the fact that 3,658 homologs of gp23 (E < 10^-4) are found when using the T4 superfamily cyanophage S-PM2 as the query sequence, whereas 3,243 sequences were obtained with coliphage T4 and only 2,070 sequences with RM378. It should be noted that in this analysis the vast majority of the T4 hits overlap with the cyanophage hits and are probably mostly cyanophages, as is suggested by the phylogeny of the top 1,000 GOS hits to RB49 (fig. 3).
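
For readers unfamiliar with the method, the Neighbor-Joining procedure can be sketched in a few lines of Python. This is a minimal, generic illustration and not the software or settings used for figure 3; the function name, the node labels, and the five-taxon distance matrix are hypothetical, and the distance measure applied to the gp23 alignment is not specified in this section.

```python
from itertools import combinations

def neighbor_joining(names, dist):
    """Build an unrooted tree by Neighbor-Joining.

    names: list of at least two taxon labels.
    dist:  dict mapping frozenset({a, b}) -> distance between a and b.
    Returns (joins, final_edge). Each join is
    (child_a, length_a, child_b, length_b, new_node); final_edge is
    (node_x, node_y, length) connecting the last two remaining nodes.
    """
    if len(names) < 2:
        raise ValueError("need at least two taxa")
    d = dict(dist)
    nodes = list(names)
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence: sum of distances from each node to all others.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}

        def q(pair):
            i, j = pair
            return (n - 2) * d[frozenset(pair)] - r[i] - r[j]

        # Join the pair with the smallest Q value (first pair wins ties).
        i, j = min(combinations(nodes, 2), key=q)
        dij = d[frozenset((i, j))]
        len_i = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))
        len_j = dij - len_i
        new = f"node{len(joins)}"  # hypothetical label for the internal node
        # Distances from the new internal node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((new, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                )
        nodes = [k for k in nodes if k not in (i, j)] + [new]
        joins.append((i, len_i, j, len_j, new))
    x, y = nodes
    return joins, (x, y, d[frozenset((x, y))])

# Toy distance matrix for five hypothetical taxa (not data from this study).
taxa = ["a", "b", "c", "d", "e"]
raw = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
       ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
       ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
toy = {frozenset(pair): value for pair, value in raw.items()}
joins, final_edge = neighbor_joining(taxa, toy)
for join in joins:
    print(join)
print(final_edge)
```

In the toy matrix, taxa "a" and "b" are joined first with branch lengths 2 and 3. Short terminal branches of this kind are what the text describes as cyanophage "microdiversity" when many closely related sequences are present.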


 
FIG. 3. Phylogeny of ~1,400 T4-like MCPs from cultured phages and environmental samples. Neighbor-Joining tree of the alignment presented in figure 2, rooted at the most divergent MCP sequence (RM378; asterisk in Far T4 group). Phylogenetic groups can be identified and are indicated by 3 labels: those sequences clustered around T4 (Near T4); those deep-branching sequences that are the most divergent from T4 (Far T4); and the majority of the sequences, which are related to cyanophages (Cyano T4). Taxa are shaded along the circle periphery according to origin: cultured phages (dark bars; mostly in Near T4); "pre-GOS" environmental sequences from Filée et al. (2005) (gray bars); ScExo environmental sequences from this study (dark bars; clustered in Far T4 group); GOS sequences (remaining unlabeled). The positions of T4 (asterisk in Near T4 group) and the cyanophage S-PM2 (asterisk in Cyano T4 group) are indicated.

 
Structure and Evolution of the MCP (gp23)
Our further analysis of the MCPs in the T4 superfamily involved an investigation of the structure of the gp23 protein. A structural homology model (Dunbrack 2006; Ginalski 2006) was constructed using the MCP of phage RB49 because, among the available cultured phages, this gp23 sequence has the closest sequence match (~20/33% identity/similarity vs. ~17/31% for T4 gp23) to T4 gp24, which served as the template structure because it is currently the only member of the gp23/24 family whose structure is known (Fokine et al. 2005). The resulting structural model (fig. 4; supplementary data set S1, Supplementary Material online) is in excellent agreement with a previous model for the T4 MCP (Fokine et al. 2005). The new prediction for gp23 has a mixed secondary structure with slightly more β-strand content than the earlier sequence-based prediction (fig. 4A blue strands compared with fig. 2C yellow columns). The MCP structure has a compact central core and an adjacent, less ordered Insertion domain (probably mostly random coil) that is likely to be responsible for the interaction between monomers in the hexamer, as it is in gp24 (Fokine et al. 2005). The T4 gp24 structure and the RB49 gp23 model share their structural fold with the MCP of the unrelated phage HK97, even though both amino acid sequences have diverged considerably from that of HK97 (only ~11% and ~13% identity, respectively); structural homology thus appears to be conserved much longer than primary sequence. In fact, Fokine et al. (2005) argue that the HK97 fold may be "universal" in phage capsid proteins.
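
To make the identity/similarity percentages quoted above concrete, the following Python sketch computes both values for a pair of already aligned sequences. It is an illustration under stated assumptions: the similarity groups, the choice to skip gapped columns, and the use of compared columns as the denominator are all hypothetical, since this section does not specify how the percentages were derived. The two short sequences are invented.

```python
# Hypothetical conservative-substitution groups (an assumption; the
# similarity definition used in the study is not given in this section).
SIMILAR_GROUPS = ["ILVM", "FWY", "KRH", "DE", "NQ", "ST", "AG"]

def identity_similarity(seq_a, seq_b):
    """Return (% identity, % similarity) for two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped, and the
    percentages are taken over the remaining columns (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    group_of = {aa: idx for idx, group in enumerate(SIMILAR_GROUPS) for aa in group}
    compared = identical = similar = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif a in group_of and group_of.get(b) == group_of[a]:
            similar += 1
    if compared == 0:
        return 0.0, 0.0
    return 100 * identical / compared, 100 * similar / compared

# Invented toy alignment, not real gp23 or gp24 sequence.
pct_identity, pct_similarity = identity_similarity("MKV-LDEAT", "MRIQLDDGP")
print(pct_identity, pct_similarity)  # 37.5 87.5
```

A different denominator (for example, the full alignment length) or a matrix-based similarity definition would shift the percentages, which is why such values are best compared only when computed the same way.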


 
FIG. 4. Homology model of the gp23 MCP monomer and assembled hexamer. (A) Protein homology model of the RB49 MCP monomer, using the T4 gp24 vertex protein monomer structure (Fokine et al. 2005) as the template. The α-helix (red) and β-strand (blue) composition of the protein is indicated, along with the N- and C-termini of the molecule. A series of residues are indicated (white font) to show the progression of the polypeptide chain. Note that the model begins at residue 65 because the N-terminus of gp23 is cleaved during phage head maturation. Also indicated are certain key α-helices and β-strands (yellow font) as well as the Insertion domain, which is probably responsible for coordinating the interaction between monomers. (B) As in (A), except here the coloring of the molecule is according to the level of amino acid sequence conservation. The degree of conservation, scale at right, is based on amino acid identity from the global alignment in figure 2 (~1,400 sequences) and was calculated using ProtSkin (www.mcgnmr.ca/ProtSkin). (C) Protein homology model of the RB49 MCP hexamer, based upon the T4 gp24 vertex protein pentamer structure and a previously proposed gp23 hexamer model (Fokine et al. 2005). The molecules are colored according to amino acid sequence conservation, as in (B), but subdivided into the different T4 superfamily phylogenetic groups delimited in figure 3. The phylogenetic groups were 1) the complete data set alignment in figure 2 (~1,400 sequences; Global); 2) an alignment of all cultured phage sequences nearest to T4 (T-evens, Pseudo T-evens, and Schizo T-evens, excluding the Exo T-evens), plus 13 of the Filée et al. (2005) environmental sequences closest to T4 (46 total; Near T4 group); 3) an alignment of 1,077 sequences including all the cultured cyanophages (Exo T-evens) and the environmental cyanophage-like sequences (70 from Filée et al. [2005], clone ScExo420-3 from this study, 1,000 GOS hits using RB49 as the query; Cyano T4 group); and 4) an alignment of 277 sequences containing the cultured phage RM378 (an Exo T-even) and the environmental RM378-like sequences (clone 37510 from Filée et al. [2005], 19 sequences from this study, and 256 GOS hits using RM378/ScExo373-21 as the queries; Far T4 group).

 
Our MCP model could then be used to map the location of sequence divergence in the 3-dimensional structure. Such "sequence divergence mapping" permits the easy visualization of the conserved, and hence potentially most important, motifs within a protein structure. Similarly, such a map highlights the most variable segments within the structure, presumably those that are responsible for variations in capsid morphology and other more subtle kinds of subgroup divergence. Although the N-terminal domain is not well conserved (fig. 4B), the central core portion of the gp23 protein is. The Insertion domain, which corresponds to the major variable region previously alluded to in the MCP alignment (fig. 2 and supplementary fig. S1, Supplementary Material online), is the least conserved domain. When the gp23 structural model is assembled into a hexamer (the functional unit of the capsid lattice) (Leiman et al. 2003) and color coded to reflect the degree of conservation within the different groups of the T4 phylogenetic tree (see above and fig. 3), the sequence conservation–structure relationships become self-evident (fig. 4C). Such a color-coded hexamer structure is shown again for reference in the upper left of the figure. Note that the precise position of the protruding, weakly conserved N-terminus (α2-3) is somewhat uncertain. The Near T4 group (primarily cultured phages) has good overall conservation of sequence, and strikingly so in the "outer ring" (formed by multiple β-strands); the only exception is the poorly conserved β14 strand at the C-terminus. The Cyano T4 group has an outer ring less conserved than that of the Near T4s, yet has a well-conserved central core (primarily helices α7–8). Finally, the Far T4 group shows the least conserved outer ring and is also the least conserved overall (more blue throughout). Mutant analyses of the T4 MCP indicate that the outer ring residues are responsible, in part, for the morphology of the head, whereas the central core contains residues that allow gp23 to also function as the vertex protein (Black et al. 1994; Fokine et al. 2005). The sequence groups show clear differences with respect to these 2 regions, and it is probable that these differences lead to morphological changes in the capsid, to different capsid lattice arrangements, and to novel interactions with capsid-decorating proteins such as Hoc and Soc.
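
The group-by-group comparison described above can be illustrated with a simple per-column identity score. The Python sketch below is a stand-in and not ProtSkin's actual calculation, which is not detailed in this section: it scores each alignment column by the fraction of non-gap residues matching the most common residue, then averages over a chosen region. The function names, the gap handling, and the toy alignments are all hypothetical.

```python
from collections import Counter

def column_conservation(alignment):
    """Per-column identity conservation for a list of aligned sequences.

    Each score is the fraction of non-gap residues equal to the most
    common residue in that column; all-gap columns score None.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have equal length")
    scores = []
    for column in zip(*alignment):
        residues = [aa for aa in column if aa != "-"]
        if not residues:
            scores.append(None)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(residues))
    return scores

def mean_conservation(scores, start, end):
    """Mean score over columns start..end-1, ignoring all-gap columns."""
    values = [s for s in scores[start:end] if s is not None]
    return sum(values) / len(values) if values else None

# Invented toy alignments standing in for two phylogenetic groups.
near_group = ["MKVLDE", "MKVLDE", "MKVLDE", "MKILDE"]
far_group = ["MKVLDE", "MRALNE", "MHSL-Q", "MKTLGE"]

near_scores = column_conservation(near_group)
far_scores = column_conservation(far_group)
print(near_scores)
print(far_scores)
print(mean_conservation(near_scores, 0, 6), mean_conservation(far_scores, 0, 6))
```

Scores of this kind, computed separately for each group and for column ranges such as the outer ring or the central core, are the type of value that would be painted onto the hexamer model as a color scale.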


    Conclusion and Perspectives
 
Our analysis shows that the diversity of T4 superfamily capsid proteins from the GOS environmental metagenome follows the same general patterns as that of their homologs from cultured phage genomes. Interestingly, the novel environmental gp23 MCP sequences obtained in our targeted sequencing are quite divergent (deeper branching) from most other known sequences, including the overwhelming majority of the GOS metagenome that is dominated by cyanophage-like T4 phages. These results suggest that the T4 superfamily's diversity has not yet been fully delimited and that there are certainly additional, more distant T4-like phages yet to be discovered in the environment. Directly related to this issue, we are using this comprehensive gp23 sequence analysis to iteratively design a new set of T4 superfamily PCR primers that should be yet more inclusive in its capacity to amplify distant T4-like phage sequences from the environment.

A homology model of the MCP, combined with the mapping of protein diversity from ~1,400 environmental and cultured sequences, shows that T4 superfamily phylogenetic groups have clear differences with respect to conserved and variable structural areas of their capsid proteins. These differences can be correlated with the diversity of capsid morphology, capsid lattice arrangements, and accessory protein content and interactions in this phage family. The conclusions of this structure/diversity analysis would be made much stronger by the availability of MCP structures from several different branches of the T4 superfamily. Given the central role that the MCP plays in coordinating interactions between the different essential and accessory components in the arrangement of the capsid lattice, having a panel of several representative structures could lead to a major advance in our understanding of how virion nanomachines are constructed and have evolved. From this point of view, having the X-ray crystallographic structures of the MCPs of various T4 phage morphotypes, such as the isometric forms of cyanophage S-PM2 and the deep-branching Exo T-even RM378, could be both fascinating and highly informative. Such data would put us in a better position to understand the structural and evolutionary features of the T4-like MCP that have allowed it to become one of the most abundant protein species in the biosphere.


    Supplementary Material
 
Supplementary figure S1 and data set S1, as well as color versions of figures 1 and 3, are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).


    Acknowledgements
 
This work was supported by the CNRS. We thank our colleagues Curtis Suttle (University of British Columbia, Vancouver) for the marine samples and Christine Arbiol (CNRS IFR109) for DNA sequencing/figure consultation. We also thank Viknesh Sivanathan for his advice on the STRAP interface and Shibu Yooseph for his help with the CAMERA database and GOS data set. In agreement with the Convention on Biological Diversity (http://www.cbd.int), the genetic information from the CAMERA database analyzed in this publication may be considered the genetic patrimony of the countries from which the samples were procured. A.M.C. gratefully acknowledges the support of the Les Treilles Foundation and H.M.K. that of the Kribu Foundation.


    Footnotes
 
Hervé Philippe, Associate Editor


    References
 

    Ackermann HW, Krisch HM. A catalogue of T4-type bacteriophages. Arch Virol (1997) 142:2329–2345.

    Arnold K, Bordoli L, Kopp J, Schwede T. The SWISS-MODEL workspace: a web-based environment for protein structure homology modelling. Bioinformatics (2006) 22:195–201.

    Black LW, Showe MK, Steven AC. Morphogenesis of the T4 head. In: Molecular biology of bacteriophage T4—Karam JD, ed. (1994) Washington (DC): American Society for Microbiology Press. 182–185.

    Blondal T, Hjorleifsdottir SH, Fridjonsson OF, Ævarsson A, Skirnisdottir S, Hermannsdottir AG, Hreggvidsson GO, Smith AV, Kristjansson JK. Discovery and characterization of a thermostable bacteriophage RNA ligase homologous to T4 RNA ligase 1. Nucleic Acids Res (2003) 31:7247–7254.

    Bystroff C, Shao Y. Fully automated ab initio protein structure prediction using I-SITES, HMMSTR and ROSETTA. Bioinformatics (2002) 18:S54–S61.

    Chen F, Lu JR. Genomic sequence and evolution of marine cyanophage P60: a new insight on lytic and lysogenic phages. Appl Environ Microbiol (2002) 68:2589–2594.

    Chow MS, Rouf MA. Isolation and partial characterization of 2 Aeromonas hydrophila bacteriophages. Appl Environ Microbiol (1983) 45:1670–1676.

    Comeau AM, Bertrand C, Letarov A, Tétart F, Krisch HM. Modular architecture of the T4 phage superfamily: a conserved core genome and a plastic periphery. Virology (2007) 362:384–396.

    Comeau AM, Chan AM, Suttle CA. Genetic richness of vibriophages isolated in a coastal environment. Environ Microbiol (2006) 8:1164–1176.

    Comeau AM, Krisch HM. War is peace: dispatches from the bacterial and phage killing fields. Curr Opin Microbiol (2005) 8:488–494.

    Dabrowska K, Switala-Jelen K, Opolski A, Gorski A. Possible association between phages, Hoc protein, and the immune system. Arch Virol (2006) 151:209–215.

    Depping R, Lohaus C, Meyer HE, Ruger W. The mono-ADP-ribosyltransferases Alt and ModB of bacteriophage T4: target proteins identified. Biochem Biophys Res Commun (2005) 335:1217–1223.

    Desplats C, Krisch HM. The diversity and evolution of the T4-type bacteriophages. Res Microbiol (2003) 154:259–267.

    Dorigo U, Jacquet S, Humbert JF. Cyanophage diversity, inferred from g20 gene analyses, in the largest natural lake in France, Lake Bourget. Appl Environ Microbiol (2004) 70:1017–1022.

    Dunbrack RL. Sequence comparison and protein structure prediction. Curr Opin Struct Biol (2006) 16:374–384.

    Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res (2004) 32:1792–1797.

    Eisenberg D, Luthy R, Bowie JU. VERIFY3D: assessment of protein models with three-dimensional profiles. In: Methods in enzymology. Vol. 277—Carter CW Jr, Sweet RM, eds. (1997) New York: Academic Press. 396–404.

    Filée J, Bapteste E, Susko E, Krisch HM. A selective barrier to horizontal gene transfer in the T4-type bacteriophages that has preserved a core genome with the viral replication and structural genes. Mol Biol Evol (2006) 23:1688–1696.

    Filée J, Tétart F, Suttle CA, Krisch HM. Marine T4-type bacteriophages, a ubiquitous component of the dark matter of the biosphere. Proc Natl Acad Sci USA (2005) 102:12471–12476.

    Fiser A, Sali A. ModLoop: automated modeling of loops in protein structures. Bioinformatics (2003) 19:2500–2501.

    Fokine A, Battisti AJ, Kostyuchenko VA, Black LW, Rossmann MG. Cryo-EM structure of a bacteriophage T4 gp24 bypass mutant: the evolution of pentameric vertex proteins in icosahedral viruses. J Struct Biol (2006) 154:255–259.

    Fokine A, Chipman PR, Leiman PG, Mesyanzhinov VV, Rao VB, Rossmann MG. Molecular architecture of the prolate head of bacteriophage T4. Proc Natl Acad Sci USA (2004) 101:6003–6008.

    Fokine A, Leiman PG, Shneider MM, Ahvazi B, Boeshans KM, Steven AC, Black LW, Mesyanzhinov VV, Rossmann MG. Structural and functional similarities between the capsid proteins of bacteriophages T4 and HK97 point to a common ancestry. Proc Natl Acad Sci USA (2005) 102:7163–7168.

    Gibb EA, Edgell DR. Multiple controls regulate the expression of mobE, an HNH homing endonuclease gene embedded within a ribonucleotide reductase gene of phage Aeh1. J Bacteriol (2007) 189:4648–4661.

    Gille C, Frommel C. STRAP: editor for structural alignments of proteins. Bioinformatics (2001) 17:377–378.

    Ginalski K. Comparative modeling for protein structure prediction. Curr Opin Struct Biol (2006) 16:172–177.

    Hambly E, Tétart F, Desplats C, Wilson WH, Krisch HM, Mann NH. A conserved genetic module that encodes the major virion components in both the coliphage T4 and the marine cyanophage S-PM2. Proc Natl Acad Sci USA (2001) 98:11411–11416.

    Hardies SC, Comeau AM, Serwer P, Suttle CA. The complete sequence of marine bacteriophage VpV262 infecting Vibrio parahaemolyticus indicates that an ancestral component of a T7 viral supergroup is widespread in the marine environment. Virology (2003) 310:359–371.

    Henikoff S, Henikoff JG. Amino-acid substitution matrices from protein blocks. Proc Natl Acad Sci USA (1992) 89:10915–10919.

    Hjorleifsdottir S, Hreggvidsson GO, Fridjonsson OH, Ævarsson A, Kristjansson JK. Bacteriophage RM 378 of a thermophilic host organism (2002) US Patent 6,492,161.

    Huber T, Faulkner G, Hugenholtz P. Bellerophon: a program to detect chimeric sequences in multiple sequence alignments. Bioinformatics (2004) 20:2317–2319.

    Ishii T, Yamaguchi Y, Yanagida M. Binding of structural protein Soc to head shell of bacteriophage-T4. J Mol Biol (1978) 120:533–544.

    Iwasaki K, Trus BL, Wingfield PT, Cheng NQ, Campusano G, Rao VB, Steven AC. Molecular architecture of bacteriophage T4 capsid: vertex structure and bimodal binding of the stabilizing accessory protein, Soc. Virology (2000) 271:321–333.

    Jia ZJ, Ishihara R, Nakajima Y, Asakawa S, Kimura M. Molecular characterization of T4-type bacteriophages in a rice field. Environ Microbiol (2007) 9:1091–1096.

    Jiang J, Abushilbayeh L, Rao VB. Display of a PorA peptide from Neisseria meningitidis on the bacteriophage T4 capsid surface. Infect Immun (1997) 65:4770–4777.

    Karam JD, Konigsberg WH. DNA polymerase of the T4-related bacteriophages. Prog Nucleic Acid Res Mol Biol (2000) 64:65–96.

    Kneller DG, Cohen FE, Langridge R. Improvements in protein secondary structure prediction by an enhanced neural network. J Mol Biol (1990) 214:171–182.

    Leiman PG, Kanamaru S, Mesyanzhinov VV, Arisaka F, Rossmann MG. Structure and morphogenesis of bacteriophage T4. Cell Mol Life Sci (2003) 60:2356–2370.

    Letarov A, Manival X, Desplats C, Krisch HM. gpwac of the T4-type bacteriophages: structure, function, and evolution of a segmented coiled-coil protein that controls viral infectivity. J Bacteriol (2005) 187:1055–1066.

    Mann NH. Phages of the marine cyanobacterial picophytoplankton. FEMS Microbiol Rev (2003) 27:17–34.

    Mann NH, Clokie MRJ, Millard A, Cook A, Wilson WH, Wheatley PJ, Letarov A, Krisch HM. The genome of S-PM2, a "photosynthetic" T4-type bacteriophage that infects marine Synechococcus strains. J Bacteriol (2005) 187:3188–3200.

    Matsuzaki S, Inoue T, Kuroda M, Kimura S, Tanaka S. Cloning and sequencing of major capsid protein (MCP) gene of a vibriophage, KVP20, possibly related to T-even coliphages. Gene (1998) 222:25–30.

    Matsuzaki S, Inoue T, Tanaka S, Koga T, Kuroda M, Kimura S, Imai S. Characterization of a novel Vibrio parahaemolyticus phage, KVP241, and its relatives frequently isolated from seawater. Microbiol Immunol (2000) 44:953–956.

    Matsuzaki S, Kuroda M, Kimura S, Tanaka S. Major capsid proteins of certain Vibrio and Aeromonas phages are homologous to the equivalent protein, gp23*, of coliphage T4. Arch Virol (1999) 144:1647–1651.

    Miller ES, Heidelberg JF, Eisen JA, et al. (13 co-authors). Complete genome sequence of the broad-host-range vibriophage KVP40: comparative genomics of a T4-related bacteriophage. J Bacteriol (2003) 185:5220–5233.

    Miller ES, Kutter E, Mosig G, Arisaka F, Kunisawa T, Ruger W. Bacteriophage T4 genome. Microbiol Mol Biol Rev (2003) 67:86–156.

    Nolan JM, Petrov V, Bertrand C, Krisch HM, Karam JD. Genetic diversity among five T4-like bacteriophages. Virol J (2006) 3:30.

    Petrov VM, Nolan JM, Bertrand C, Levy D, Desplats C, Krisch HM, Karam JD. Plasticity of the gene functions for DNA replication in the T4-like phages. J Mol Biol (2006) 361:46–68.

    Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE. UCSF Chimera: a visualization system for exploratory research and analysis. J Comput Chem (2004) 25:1605–1612.

    Ren ZJ, Black LW. Phage T4 Soc and Hoc display of biologically active, full-length proteins on the viral capsid. Gene (1998) 215:439–444.

    Ren ZJ, Lewis GK, Wingfield PT, Locke EG, Steven AC, Black LW. Phage display of intact domains at high copy number: a system based on SOC, the small outer capsid protein of bacteriophage T4. Protein Sci (1996) 5:1833–1843.

    Rohwer F, Segall A, Steward G, Seguritan V, Breitbart M, Wolven F, Azam F. The complete genomic sequence of the marine phage roseophage SIO1 shares homology with nonmarine phages. Limnol Oceanogr (2000) 45:408–418.

    Rossmann MG, Mesyanzhinov VV, Arisaka F, Leiman PG. The bacteriophage T4 DNA injection machine. Curr Opin Struct Biol (2004) 14:171–180.

    Rusch DB, Halpern AL, Sutton G, et al. (40 co-authors). The Sorcerer II Global Ocean Sampling expedition: northwest Atlantic through eastern tropical Pacific. PLoS Biol (2007) 5:398–431.

    Sathaliyawala T, Rao M, Maclean DM, Birx DL, Alving CR, Rao VB. Assembly of human immunodeficiency virus (HIV) antigens on bacteriophage T4: a novel in vitro approach to construct multicomponent HIV vaccines. J Virol (2006) 80:7688–7698.

    Scholl D, Kieleczawa J, Kemp P, Rush J, Richardson CC, Merril C, Adhya S, Molineux IJ. Genomic analysis of bacteriophages SP6 and K1-5, an estranged subgroup of the T7 supergroup. J Mol Biol (2004) 335:1151–1171.

    Schwede T, Kopp J, Guex N, Peitsch MC. SWISS-MODEL: an automated protein homology-modeling server. Nucleic Acids Res (2003) 31:3381–3385.

    Seshadri R, Kravitz SA, Smarr L, Gilna P, Frazier M. CAMERA: a community resource for metagenomics. PLoS Biol (2007) 5:394–397.

    Shamoo Y, Friedman AM, Parsons MR, Konigsberg WH, Steitz TA. Crystal-structure of a replication fork single-stranded-DNA binding-protein (T4 gp32) complexed to DNA. Nature (1995) 376:362–366.

    Shivachandra SB, Rao M, Janosi L, Sathaliyawala T, Matyas GR, Alving CR, Leppla SH, Rao VB. In vitro binding of anthrax protective antigen on bacteriophage T4 capsid surface through Hoc-capsid interactions: a strategy for efficient display of large full-length proteins. Virology (2006) 345:190–198.

    Short CM, Suttle CA. Nearly identical bacteriophage structural gene sequences are widely distributed in both marine and freshwater environments. Appl Environ Microbiol (2005) 71:480–486.

    Steven AC, Greenstone HL, Booy FP, Black LW, Ross PD. Conformational-changes of a viral capsid protein: thermodynamic rationale for proteolytic regulation of bacteriophage-T4 capsid expansion, cooperativity, and super-stabilization by Soc binding. J Mol Biol (1992) 228:870–884.

    Sullivan MB, Coleman ML, Weigele P, Rohwer F, Chisholm SW. Three Prochlorococcus cyanophage genomes: signature features and ecological interpretations. PLoS Biol (2005) 3:790–806.

    Suttle CA, Chan AM, Cottrell MT. Use of ultrafiltration to isolate viruses from seawater which are pathogens of marine phytoplankton. Appl Environ Microbiol (1991) 57:721–726.

    Thomassen E, Gielen G, Schutz M, Schoehn G, Abrahams JP, Miller S, Van Raaij MJ. The structure of the receptor-binding domain of the bacteriophage T4 short tail fibre reveals a knitted trimeric metal-binding fold. J Mol Biol (2003) 331:361–373.

    Thompson FL, Iida T, Swings J. Biodiversity of vibrios. Microbiol Mol Biol Rev (2004) 68:403–431.

    Thompson JD, Higgins DG, Gibson TJ. CLUSTAL-W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res (1994) 22:4673–4680.

    Tétart F, Desplats C, Kutateladze M, Monod C, Ackermann HW, Krisch HM. Phylogeny of the major head and tail genes of the wide-ranging T4-type bacteriophages. J Bacteriol (2001) 183:358–366.

    Weigele PR, Pope WH, Pedulla ML, Houtz JM, Smith AL, Conway JF, King J, Hatfull GF, Lawrence JG, Hendrix RW. Genomic and structural analysis of Syn9, a cyanophage infecting marine Prochlorococcus and Synechococcus. Environ Microbiol (2007) 9:1675–1695.

    Willard L, Ranjan A, Zhang HY, Monzavi H, Boyko RF, Sykes BD, Wishart DS. VADAR: a web server for quantitative evaluation of protein structure quality. Nucleic Acids Res (2003) 31:3316–3319.

    Yooseph S, Sutton G, Rusch DB, et al. (33 co-authors). The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families. PLoS Biol (2007) 5:432–466.

    Zhong Y, Chen F, Wilhelm SW, Poorvin L, Hodson RE. Phylogenetic diversity of marine cyanophage isolates and natural virus communities as revealed by sequences of viral capsid assembly protein gene g20. Appl Environ Microbiol (2002) 68:1576–1584.

    Zuber C, Ngom-Bru C, Barretto C, Bruttin A, Brüssow H, Denou E. Genome analysis of phage JS98 defines a fourth major subgroup of T4-like phages in Escherichia coli. J Bacteriol (2007) 189:8206–8214.

Accepted for publication March 21, 2008.

