MBE Advance Access originally published online on November 28, 2007
Molecular Biology and Evolution 2008 25(2):277-286; doi:10.1093/molbev/msm246
Research Articles
Multiple Recombining Loci Encode MaSp1, the Primary Constituent of Dragline Silk, in Widow Spiders (Latrodectus: Theridiidae)
Department of Biology, University of California, Riverside
E-mail: nadiaa{at}ucr.edu.
Abstract
Spiders spin a functionally diverse array of silk fibers, each composed of one or more unique proteins. Most of these proteins, in turn, are encoded by members of a single gene family thought to have arisen through duplication and divergence of an ancestral silk gene. Because of its remarkable mechanical properties, orb weaver dragline silk, a composite of 2 proteins (MaSp1 and MaSp2), is the best studied. Here, we demonstrate that multiple loci encode MaSp1 in widow spiders (Latrodectus). Because these copies may be the result of more recent duplication events than those leading to the currently recognized silk gene paralogs, they offer insight into the early evolutionary fate of silk gene duplicates. In addition to 3 presumed functional MaSp1 loci in Latrodectus hesperus (Western black widow) and Latrodectus geometricus (brown widow) genomes, we find a MaSp1 pseudogene in L. hesperus, demonstrating the potential for unrecognized extinction of silk gene paralogs. We also document recombination events among L. hesperus MaSp1 loci and between Latrodectus MaSp1 loci and MaSp2. This result supports the hypothesis that concerted evolution occurs not only within an individual silk gene but also among silk gene paralogs. One of the L. geometricus MaSp1 copies encodes a protein that has diverged in amino acid composition and potentially converged on the secondary structure of MaSp2. Based on the presence of multiple MaSp1 loci and the phylogenetic distribution of MaSp1 versus MaSp2, we propose that MaSp2 is derived from an ancestral MaSp1 duplicate. Finally, divergence has occurred in the upstream flanking sequences of the L. hesperus MaSp1 loci, the region most likely to contain regulatory motifs, providing ample opportunity for differential expression. However, the benefits associated with increased protein production may be the primary mechanism maintaining multiple functional MaSp1 copies in widow genomes.
Key Words: gene duplication; Latrodectus geometricus; Latrodectus hesperus; major ampullate spidroin; concerted evolution; spider silk gene family
Introduction
Spider silks are renowned for their spectacular mechanical properties. Most impressively, the dragline silk of some spiders is tougher than most natural or synthetic materials (Gosline et al. 1999
The Spider Silk Gene Family
All described spidroins are extremely large, with transcript sizes typically 10,000 bp long (Hayashi et al. 1999, 2004; Sponner et al. 2005; Zhao et al. 2006; Ayoub et al. 2007). They are also highly modular. Each polypeptide is composed of an uninterrupted block of repetitive sequence that varies in complexity according to protein type, with some spidroins composed of short repeat units characterized by a small set of amino acid (aa) motifs, whereas others are composed of long intricate repeats (Gatesy et al. 2001; Hayashi et al. 2004). The repetitive region of each spider fibroin is flanked by nonrepetitive amino (N)- and carboxy (C)-termini that are respectively 150 and 100 aa long (Hayashi and Lewis 2000; Motriuk-Smith et al. 2005; Rising et al. 2006; Zhao et al. 2006; Ayoub et al. 2007).
Despite a growing knowledge of silk gene diversity and protein structure within and across phylogenetically disparate species, many aspects of the evolutionary history and dynamics of the spider silk gene family remain poorly understood (Gatesy et al. 2001; Garb and Hayashi 2005; Garb et al. 2006). Silk is a defining feature of spiders, which originated approximately 400 MYA (Selden et al. 1991). Thus, the spidroin gene family is ancient and it is likely that some gene copies have been lost, causing difficulties in reconstructing the order of duplication events that led to novel proteins. Additionally, the repetitive regions of these proteins evolve rapidly, even within a spidroin type, rendering positional homology determination of repetitive sequences across fibroin types virtually impossible (Hayashi and Lewis 2000; Gatesy et al. 2001; Garb et al. 2006). Phylogenetic reconstruction of spidroin family members has instead relied on the short, nonrepetitive C-terminus (Beckwitt and Arcidiacono 1994; Hayashi and Lewis 1998; Hayashi et al. 2004; Garb and Hayashi 2005; Tian and Lewis 2005; Garb et al. 2006) and, to a lesser extent, the N-terminus (Rising et al. 2006; Ayoub et al. 2007).
Major Ampullate Silk Genes
One route toward understanding the evolution of the spidroin gene family is to focus on proteins that likely share a more recent common ancestor than the family as a whole, such as the constituents of dragline silk, MaSp1 and MaSp2 (Xu and Lewis 1990; Hinman and Lewis 1992). These proteins are also the best-understood spidroins in terms of their phylogenetic distribution across spider taxa and their molecular and mechanical properties. Immunological evidence indicates that the genes encoding MaSp1 and MaSp2 are coexpressed in the major ampullate glands (Sponner et al. 2005). The 2 spidroins also share similar aa motifs and repeat unit complexity. Specifically, both are characterized by short repeat units, termed ensemble repeats, which are composed of a glycine (G)-rich region punctuated by a contiguous stretch of alanines (poly-A). The aa motif GGX (where X represents a subset of aa) is present in both spidroins, whereas MaSp2 additionally harbors a large proportion of GPG motifs (Xu and Lewis 1990; Hinman and Lewis 1992; Guerette et al. 1996; Gatesy et al. 2001). Further evidence for a close relationship between MaSp1 and MaSp2 is their consistent grouping to the exclusion of other spidroin types in phylogenetic analyses of C-termini (Gatesy et al. 2001; Garb and Hayashi 2005; Garb et al. 2006; Ayoub et al. 2007).
MaSp1 orthologues and MaSp2 orthologues have been defined by the similarity of repetitive sequences in the respective proteins (Xu and Lewis 1990; Hinman and Lewis 1992; Gatesy et al. 2001). Thus, MaSp1 is expected to form a separate clade from MaSp2. Phylogenetic analyses of the N- and C-termini, however, do not recover a monophyletic MaSp1 clade separate from a MaSp2 clade. Instead the termini of MaSp1 and MaSp2 within a species or group of closely related species are often united (Gatesy et al. 2001; Garb and Hayashi 2005; Garb et al. 2006; Ayoub et al. 2007). It is possible that this pattern reflects the true evolutionary history of MaSp1 and MaSp2 and that numerous duplication events have been followed by multiple convergences of the repetitive regions. Such a scenario may be unlikely because it would require the convergence of thousands of nucleotides. A simpler explanation is that the short, nonrepetitive terminal regions have been homogenized through concerted evolution by unequal crossing over or gene conversion between paralogs (Brown et al. 1972; Zimmer et al. 1980; Dover et al. 1993). The substantial proportion of codons for G and alanine found in both MaSp1 and MaSp2 could facilitate intergenic pairing during meiosis. However, comparison of full-length MaSp1 and MaSp2 genes from the black widow spider, Latrodectus hesperus, shows that these genes maintain distinct ensemble repeats throughout. Thus, if recombination occurs between MaSp1 and MaSp2, it is limited to the nonrepetitive terminal regions (Ayoub et al. 2007). To date, no direct evidence for recombination between silk gene paralogs has been documented.
Another important aspect of silk gene family evolution that has been largely ignored is the potential for multiple loci to encode a particular protein type (but see Rising et al. 2007). Within species, several variants (presumed allelic) have been described for MaSp1 (Beckwitt et al. 1998), MaSp2 (Guerette et al. 1996; Gatesy et al. 2001; Garb et al. 2006), and the gene that encodes the capture spiral spidroin, Flag (Higgins et al. 2007). Furthermore, Colgin and Lewis (1998) described 2 complementary DNA (cDNA) sequences that respectively hybridized to transcripts of 2 different sizes, using RNA derived from minor ampullate silk glands. They concluded that 2 genes encode the minor ampullate silk protein, but it is possible that allelic and/or transcriptional variation explains the 2 sequences.
Here we demonstrate that minimally 3 copies of MaSp1 exist in widow spider (Latrodectus: Theridiidae) genomes. We show that concerted evolution has homogenized parts of these loci within L. hesperus (Western black widow) and find evidence for past recombination between Latrodectus MaSp1 loci and MaSp2. Divergent and potential convergent evolution have additionally occurred in 1 Latrodectus geometricus (brown widow) MaSp1 copy. Finally, we discuss the implications of these findings for the roles of gene duplication and loss, recombination, and selection in the evolution of the spider silk gene family.
Methods
Loci Identification
A L. hesperus genomic library was screened using the polymerase chain reaction (PCR) for clones containing dragline silk genes (see Ayoub et al. 2007).
Each MaSp1-positive fosmid was directly sequenced with a variety of primers targeting the N- and C-terminal coding regions and upstream flanking sequences. These primers were designed from 5' and 3' partial cDNA clones (5'-EF595247 [GenBank]; 3'-AY953074 [GenBank], DQ409057 [GenBank]), as well as initial sequences of the genomic clones. The genomic clones fell into 4 categories based on levels of sequence divergence (see Results and Discussion). Because the genomic library was constructed from 8 individuals, additional PCR experiments were done to determine whether these MaSp1 categories correspond to allelic variation or multiple loci within the L. hesperus genome. PCR amplifications were done on single spider genomic DNA extractions with primers specifically designed for each MaSp1 category.
Latrodectus geometricus was also examined for multiple MaSp1 loci. Two different PCR reactions were performed on DNA extracted from a single individual. The products were gel excised and cloned with the TOPO TA Cloning kit (Invitrogen, Carlsbad, CA). Forward primers for these reactions were designed from conserved regions of the N-termini of putative L. hesperus MaSp1 loci and the published L. geometricus MaSp1-like genomic clone (5' partial length, DQ059133S1; Motriuk-Smith et al. 2005). The reverse primer was designed from the repetitive region of the full-length L. hesperus MaSp1 sequence. Seventy clones were amplified using universal primers (M13 forward and M13 reverse), and inserts of the expected size (500–700 or 700–900 bp) were sequenced. DNA sequences were aligned in SEQUENCHER v.4.5 (Gene Codes, Ann Arbor, MI), and a Neighbor-Joining tree (constructed in PAUP* v.4.0B10; Swofford 2002) revealed 3 clusters of highly similar sequences. A consensus sequence for each cluster was calculated in SEQUENCHER.
The alignment of the TOPO TA clones with direct sequences of the original PCR products was inspected to verify that all base calls could be accounted for by the original PCR. Any polymorphic positions in the alignment of the clones that could not be accounted for by the original PCR were considered cloning error. Individual clones differed from the consensus sequence at 0–5 sites and typically, only 1 clone differed from the consensus sequence at any one position. However, at one position in the alignment of the third cluster, 16 clones displayed a T, 10 displayed a C, and the direct PCR sequences were polymorphic (i.e., multiple peaks at that position in the chromatographs). This polymorphism was thus considered a true allelic difference. A third PCR reaction was performed on the same individual as above with primers designed to specifically amplify the L. geometricus MaSp1-like sequence, and this PCR product was directly sequenced.
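The clone-screening logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SEQUENCHER procedure used in the study: the function name `consensus_and_polymorphisms`, the `min_minor_clones` threshold, and the toy sequences are hypothetical, and the check against direct PCR chromatograms is not modeled. A minority base seen in one clone is treated as likely cloning error, whereas a minority base shared by several clones is flagged as a candidate allelic difference.

```python
from collections import Counter


def consensus_and_polymorphisms(clones, min_minor_clones=2):
    """Return a majority-rule consensus and the columns where a minority
    base is shared by at least `min_minor_clones` clones (hypothetical cutoff)."""
    if len({len(c) for c in clones}) != 1:
        raise ValueError("clones must be aligned to the same length")
    consensus = []
    candidates = {}
    for pos, column in enumerate(zip(*clones)):
        counts = Counter(base for base in column if base != "-")
        if not counts:  # column made only of gaps
            consensus.append("-")
            continue
        ranked = counts.most_common()
        consensus.append(ranked[0][0])
        # One deviating clone looks like cloning error; a recurring
        # minority base is a candidate true polymorphism.
        if len(ranked) > 1 and ranked[1][1] >= min_minor_clones:
            candidates[pos] = dict(counts)
    return "".join(consensus), candidates


# Toy data (invented): 16 clones with T and 10 with C at column 2,
# plus a single clone carrying an isolated difference at column 0.
toy_clones = ["AATGA"] * 15 + ["GATGA"] + ["AACGA"] * 10
toy_consensus, toy_candidates = consensus_and_polymorphisms(toy_clones)
```

In the toy data, only column 2 is flagged, mirroring the 16 T versus 10 C position reported for the third cluster; the isolated difference at column 0 is ignored.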
The L. hesperus genomic library was also screened for MaSp2 (Ayoub et al. 2007). Three positive clones were found. One was sequenced in full (EF595245 [GenBank]; Ayoub et al. 2007), and another was directly sequenced with N- and C-terminal primers and a downstream flanking primer. MaSp2 was additionally amplified from L. geometricus individuals using N-terminal forward primers and a repetitive reverse primer and from L. hesperus using repetitive forward and C-terminal reverse primers. PCR products were directly sequenced.
For all PCRs (except one) performed in this study, 1 µl genomic DNA formed the template for a 50 µl reaction containing 0.5 U Taq polymerase (Invitrogen), 1 µM each primer, 0.5 mM each deoxynucleoside triphosphate (Fisher Scientific, Waltham, MA), 67 mM Tris, 3 mM MgCl2, and 16.6 mM (NH4)2SO4. PCR was done with 45 cycles of 30 s at 94 °C, 30–45 s at 55–62 °C, and 60–90 s at 72 °C. A slightly different PCR reaction was performed to obtain one of the products used to clone L. geometricus MaSp1. For this reaction, 1 µl genomic DNA formed the template for a 50 µl reaction containing 1 U AccuPrime Taq DNA Polymerase High Fidelity (Invitrogen), 1 x AccuPrime Buffer II (Invitrogen), and 0.2 µM each primer. PCR parameters were 30 cycles of 30 s at 94 °C, 30 s at 60 °C, and 90 s at 68 °C.
Sequence Analysis
Translated N-terminal aa sequences described in this paper were aligned with all published N-termini using default parameters in ClustalW (Thompson et al. 1994) as implemented in MacVector 7.2 (Accelrys, San Diego, CA). When available, the corresponding C-termini were also aligned. Other than MaSp1 and MaSp2, included fibroin types are Flag, which forms capture spiral silk (Hayashi and Lewis 1998, 2000), and the egg case protein TuSp (for simplicity, TuSp orthologues that were named CySp are referred to as TuSp in this paper; Garb and Hayashi 2005; Tian and Lewis 2005; Zhao et al. 2006). Additional published sequences from Agelenopsis aperta (MaSp), Argiope trifasciata (MaSp1), Nephila inaurata madagascariensis (MaSp1), and L. geometricus (MaSp1) were included in the C-terminal alignment. The aa alignments were used to guide nucleotide alignments, which formed the basis for phylogenetic analyses. For each nucleotide data set, maximum parsimony (MP), maximum likelihood (ML), and Bayesian analyses were performed. MP analyses were additionally executed on aa alignments. Gaps were treated as missing data in all analyses. Heuristic searches were conducted in PAUP* with 1,000 (MP) or 100 (ML) replicates of random addition sequence and tree-bisection-reconnection branch swapping. Bootstrap support was determined with 1,000 (MP) or 100 (ML) pseudoreplications with 100 (MP) or 1 (ML) addition sequences per pseudoreplicate. The best-fit model of evolution for ML analyses was determined with the likelihood ratio test carried out in Modeltest v.3.7 (Posada and Crandall 1998).
Bayesian analyses were performed with MrBayes v.3.1.2 (Ronquist and Huelsenbeck 2003) using default priors and heating parameters. Two independent simultaneous runs were carried out, each with 4 simultaneous Markov Chain Monte Carlo chains, for 1 million generations, sampling every 100 generations. The analyses were considered to have converged if the standard deviation of the split frequencies of the 2 independent runs (discarding the first 25% of generations as burn-in) was below 0.01. Majority rule consensus tree topologies and branch lengths were calculated discarding the first 25% of generations as burn-in. Separate Bayesian analyses were carried out on unpartitioned data sets and data sets partitioned according to codon position. Modeltest was used to determine the best-fit model of evolution for each partition, but optimal parameter values were calculated with MrBayes (supplementary table S2, Supplementary Material online), unlinking parameters in partitioned analyses. Tree topologies with the better likelihood score were favored.
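As a rough illustration of the sampling scheme described above, the following Python sketch works out how many samples per run remain after burn-in and applies the stated convergence cutoff. The function names are hypothetical, and counting generation 0 as a sample is an assumption made here rather than something stated in the text.

```python
def retained_samples(generations, sample_every, burnin_fraction):
    """Return (total, discarded, kept) sample counts for one run.

    Assumes the starting state (generation 0) is also recorded.
    """
    total = generations // sample_every + 1
    discarded = int(total * burnin_fraction)
    return total, discarded, total - discarded


def has_converged(split_frequency_sd, threshold=0.01):
    """Apply the convergence criterion quoted in the text (SD below 0.01)."""
    return split_frequency_sd < threshold


# Settings quoted in the text: 1 million generations, sampled every 100,
# with the first 25% discarded as burn-in.
total, discarded, kept = retained_samples(1_000_000, 100, 0.25)
```

Under these assumptions each run records 10,001 samples, of which 2,500 are discarded and 7,501 contribute to the consensus.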
MULTIPIPMAKER (Schwartz et al. 2000) was used to identify the maximum amount of upstream and downstream flanking sequence that could reliably be aligned among L. hesperus MaSp1 loci, the full-length L. hesperus MaSp2, and L. geometricus MaSp1-like. To determine whether sequences could be aligned, MULTIPIPMAKER utilized a threshold score implemented in the BlastZ algorithm (Schwartz et al. 2003). Approximately 500 bp of upstream and 635 bp of downstream sequence could be aligned among L. hesperus MaSp1 loci, and 300 bp upstream and 180 bp downstream sequence could be aligned among all sequences. The identified N- and C-terminal coding and adjacent noncoding regions were then aligned using default parameters in ClustalW. Pairwise numbers of synonymous substitutions per synonymous site (Ks) for coding and noncoding sequences were calculated using DNASP v.4.0 (Rozas et al. 2003). The number of nonsynonymous substitutions per nonsynonymous site (Ka) was additionally calculated for all pairwise comparisons of Latrodectus MaSp1 and MaSp2 N- and C-terminal coding regions.
Recombination analyses were performed using GENECONV v1.81a (Sawyer 1989) to search for segments of sequences that were more similar than expected based on the overall divergence of compared sequences. In a simulation analysis, GENECONV had a low false positive rate for detecting recombination events (Chan et al. 2006). The 5' and 3' sequences of the following genomic clones were analyzed: 3 L. hesperus MaSp1 loci, L. hesperus MaSp2, and L. geometricus MaSp1-like N- and C-terminal coding and adjacent noncoding alignments. Recombination analyses were also performed on only the N- and C-terminal coding alignments of the above sequences plus L. geometricus MaSp2. The N- and C-termini were combined into single alignments for these analyses. Repetitive sequence was not included because of the difficulties in determining positional homology of these sequences across species and protein type (Gatesy et al. 2001). Default parameters were used in GENECONV, with a mismatch penalty of 1*Npoly/Ndiff, where Npoly equals the number of polymorphic sites in an alignment and Ndiff equals the number of differences between a particular pair of sequences. Significance values were adjusted for multiple pairwise comparisons using a Bonferroni correction (see Sawyer 1989).
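The mismatch penalty and the multiple-comparison adjustment described above can be illustrated with a short Python sketch. It is not GENECONV itself and does not reproduce its fragment search or permutation tests: the function names, the `scale` argument (standing in for the factor 1 in 1*Npoly/Ndiff), the gap handling, and the toy alignment are all assumptions made for illustration.

```python
from itertools import combinations


def polymorphic_sites(alignment):
    """Count columns with more than one base (gaps ignored): Npoly."""
    return sum(
        1
        for column in zip(*alignment.values())
        if len({base for base in column if base != "-"}) > 1
    )


def pair_differences(seq_a, seq_b):
    """Count ungapped positions at which two sequences differ: Ndiff."""
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b and "-" not in (a, b))


def mismatch_penalties(alignment, scale=1):
    """Per-pair penalty scale * Npoly / Ndiff (None if a pair is identical)."""
    npoly = polymorphic_sites(alignment)
    penalties = {}
    for (name_a, seq_a), (name_b, seq_b) in combinations(alignment.items(), 2):
        ndiff = pair_differences(seq_a, seq_b)
        penalties[(name_a, name_b)] = scale * npoly / ndiff if ndiff else None
    return penalties


def bonferroni(p_values):
    """Multiply each P value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Toy alignment (invented sequences, hypothetical names).
toy_alignment = {"locA": "ACGTACGT", "locB": "ACGTACGA", "locC": "TCGAACGA"}
toy_penalties = mismatch_penalties(toy_alignment)
toy_adjusted = bonferroni([0.01, 0.02, 0.5])
```

Because the penalty is inversely proportional to the number of differences between a pair, closely related sequences receive a larger penalty per mismatch than divergent ones, so a shared fragment must be nearly mismatch-free to score highly when the pair is already similar overall.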
Results and Discussion
Detection of Multiple MaSp1 Loci in Latrodectus Genomes
Partial sequencing of L. hesperus MaSp1-positive genomic clones revealed that they fit into 4 categories, each presumed to represent a different locus. Three of these loci appear functional (no premature stop codons were detected) and will be referred to as LhMaSp1_L1–3 for the remainder of the paper. The fourth locus, LhMaSp1_pseudo, is a pseudogene represented by a single fosmid clone (EU177647 [GenBank]). Conceptual translation of this sequence reveals a stop codon after 153 aa that correspond to the N-terminus and 11 aa of repetitive spidroin sequence. After the stop codon, there are 18 consecutive codons for repetitive sequence before the conceptual translation fails to recover recognizable spidroin sequence in any frame. Sequencing reactions targeting the C-terminus failed. Additionally, the pairwise Ka/Ks value between the N-terminus of LhMaSp1_pseudo and the locus most similar to it, LhMaSp1_L3, is 0.96, suggesting a loss of functional constraints on this locus. In contrast, pairwise Ka/Ks values for all Latrodectus MaSp1 loci and MaSp2 comparisons, excluding LhMaSp1_pseudo, are typically below 0.2, indicating strong functional constraints on the remaining loci. Four fosmid clones (EU177649 [GenBank], EU177654 [GenBank], EU177655 [GenBank], and EF595246 [GenBank]), including the fully sequenced clone, belong to LhMaSp1_L1. Two clones (EU177651 [GenBank] and EU177653 [GenBank]) belong to LhMaSp1_L2 and 2 (EU177648 [GenBank] and EU177650 [GenBank]) to LhMaSp1_L3. Pairwise differences between clones belonging to a single locus range from 0% to 1.2% (including all available sequence: noncoding, N- and C-termini, and repetitive sequence). In contrast, pairwise differences between loci range from 10.8% to 36.3% (excluding repetitive sequence). The N- and C-terminal coding sequences of clones within a locus were either identical or only differed at one position.
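The conceptual translation used to spot a premature stop codon can be sketched as follows. This is a minimal illustration assuming the standard genetic code; the function name `translate_to_first_stop` and the toy sequences are hypothetical and are not taken from the LhMaSp1_pseudo clone.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_to_first_stop(dna, frame=0):
    """Translate one reading frame and stop at the first stop codon.

    Returns (protein, n), where n is the number of residues read before
    the stop, or None if no stop codon occurs in this frame.
    """
    dna = dna.upper()
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # X marks ambiguous codons
        if aa == "*":
            return "".join(protein), len(protein)
        protein.append(aa)
    return "".join(protein), None


# Toy sequences (invented): one with an in-frame TAA, one without.
interrupted = translate_to_first_stop("ATGGGTGCTTAAGGT")
open_frame = translate_to_first_stop("ATGGGTGCT")
```

Running the same function on each of the three frames is one way to check whether recognizable sequence continues in another frame after a stop, as described for the pseudogene.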
Previously reported L. hesperus MaSp1 sequences could be unambiguously assigned to our locus categories. The N-terminal coding region (450 bp) of a 5' partial MaSp1 cDNA sequence (EF595247 [GenBank]) is 99.8% identical to LhMaSp1_L1 and is assumed to represent a transcript of this locus. The published 3'-partial L. hesperus MaSp1 cDNA sequences (AY953074 [GenBank] and DQ409057 [GenBank]) are identical to LhMaSp1_L2 in the C-terminal coding region (300 bp) and 3' untranslated region (90 bp). The only difference between the genomic clone and these cDNA sequences is the presence of a gap in the repetitive region, and thus, these cDNA sequences are assumed to represent allelic variants of LhMaSp1_L2.
The amplification of MaSp1 from an individual spider's genomic DNA with locus-specific PCR primers shows that differences among the loci cannot be explained by allelic variation. Direct sequencing of the locus-specific PCR products results in a few (<1.0%) polymorphic base calls, which are visualized as positions with multiple peaks on chromatographs (EU177658 [GenBank], EU177659 [GenBank], EU177661 [GenBank], EU177662 [GenBank], EU177663 [GenBank], EU177664 [GenBank], and EU177665 [GenBank]). These polymorphic positions are interpreted as allelic variation, and their low frequency and specific locations cannot account for the variation seen among MaSp1 loci.
At least 3 loci are also present in L. geometricus. The cloned PCR products can be assigned to 4 categories of MaSp1 sequences. Because PCR reactions were carried out on genomic DNA from a single individual and spiders are diploid, at least 2 loci must exist to account for these 4 alleles. Two alleles are identical in the N-terminal coding region and differ at only a few nucleotide positions in the repetitive region (1.8% of 284 nt). These alleles are considered to belong to a single locus, LgMaSp1_L1 (EU177666 [GenBank] and EU177667 [GenBank]). The other 2 alleles differ at only one position (876 nt) and are referred to as LgMaSp1_L2 (EU177668 [GenBank] and EU177669 [GenBank]). A third locus (henceforth referred to as LgMaSp1_L3) is represented by the published genomic sequence, L. geometricus MaSp1-like (Motriuk-Smith et al. 2005), which is considerably divergent (19.2%) from the other 2 loci. The existence of all 3 loci in the genome of a single individual is confirmed by sequencing PCR products generated with LgMaSp1_L3 locus-specific primers (EU177660 [GenBank]). The published L. geometricus 3'-partial cDNA sequence (AF350273 [GenBank]) may represent a fourth locus or may belong to LgMaSp1_L1 or L2. It is distinct from LgMaSp1_L3 (16.7% different in C-terminal coding region), but we do not currently have C-termini known to correspond to LgMaSp1_L1 or LgMaSp1_L2.
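The percentage differences quoted for alleles and loci are simple proportions of differing sites over aligned sites. A minimal Python sketch is given below; the function name is hypothetical, and skipping positions where either sequence has a gap is an assumption, since the text does not say how gaps were counted.

```python
def percent_difference(seq_a, seq_b):
    """Percentage of ungapped aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped positions are not scored
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * differing / compared


# Toy pair (invented): 5 scored positions, 1 difference.
toy_percent = percent_difference("ACGT-A", "ACCTAA")
```

For scale, 5 differing sites across 284 nt would round to 1.8%, which is the order of magnitude reported for the two LgMaSp1_L1 alleles; the actual number of differing sites is not given in the text.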
The presence of at least 3 copies of MaSp1 in both L. geometricus and L. hesperus suggests that multiple loci encode MaSp1 in all widow spiders. These 2 species represent the extent of divergence in Latrodectus, which is split into 2 primary clades with L. hesperus belonging to one and L. geometricus belonging to the other (Garb et al. 2004). Furthermore, Rising et al. (2007) described several MaSp1 loci in Euprosthenops australis (Pisauridae). This discovery, combined with our Latrodectus findings, suggests that the presence of multiple MaSp1 loci within a genome is characteristic of species broadly distributed across entelegyne spider phylogeny.
In contrast to MaSp1, the L. hesperus MaSp2 clones (EF595245 [GenBank] and EU177652 [GenBank]) are very similar (99% identity over 3030 bp). In addition, sequences of L. hesperus and L. geometricus MaSp2 PCR products (EU177656 [GenBank] and EU177657 [GenBank], respectively) reveal 3 and 0 polymorphic base calls, respectively, indicating low allelic diversity. Thus, we currently have no evidence for multiple loci encoding MaSp2 in Latrodectus.
Variation in the Repetitive Sequences of MaSp1 Loci
The conceptual translations of the repetitive portions of each L. hesperus and L. geometricus locus contain aa motifs typical of MaSp1, such as GGX (X = A, Q, Y, S, L, I, or F), GX (X = Q, A, R, E, or L), and poly-A (fig. 1). In L. hesperus, these aa motifs are combined to form 4 different ensemble repeat types possessed by each putatively functional locus. However, the 3 loci (LhMaSp1_L1–3) differ in the arrangement of the ensemble types. LhMaSp1_L1 displays a consistent aggregate repetition of ensemble types "a," "b," "c," and "d" in that order (fig. 1A; see also Ayoub et al. 2007). In contrast, LhMaSp1_L2 and L3 do not display a consistent aggregate repeat of ensemble types, at least over the sequenced portions (fig. 1A).
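Motif content and aa composition of a translated repetitive region can be tallied with a short Python sketch such as the one below. It is illustrative only: the function names, the minimum poly-A run length of 4, and the toy peptide are assumptions, and the GGX residue set is the one listed in the text. GPG is included because it is the motif the paper describes as characteristic of MaSp2.

```python
import re
from collections import Counter


def aa_composition(protein):
    """Proportion of each amino acid in a protein sequence."""
    counts = Counter(protein)
    total = len(protein)
    return {aa: n / total for aa, n in counts.items()}


def count_motifs(protein, min_poly_a=4):
    """Count GGX, GPG, and poly-A occurrences.

    GGX uses X in {A, Q, Y, S, L, I, F}; GPG is counted with overlaps
    allowed; poly-A is any run of at least `min_poly_a` alanines
    (the run length is a hypothetical cutoff).
    """
    return {
        "GGX": len(re.findall(r"GG[AQYSLIF]", protein)),
        "GPG": len(re.findall(r"(?=GPG)", protein)),
        "polyA": len(re.findall(r"A{%d,}" % min_poly_a, protein)),
    }


# Toy peptide (invented, not a real spidroin repeat).
toy_peptide = "GGAGQGGYAAAAAAGPGGPG"
toy_motifs = count_motifs(toy_peptide)
toy_composition = aa_composition(toy_peptide)
```

Applied to each locus separately, tallies like these give the kind of per-locus comparison summarized in the composition figure, for example the proportion of G or P in the ensemble repeats.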
Latrodectus geometricus does not display consistent ensemble types across loci, although LgMaSp1_L1, L2, and the cDNA sequence have very similar ensemble repeats (fig. 1B). However, these sequences are directly adjacent to the N-termini (L1 and L2) or C-terminus (cDNA) and more regular ensemble types may be found in the central portions of the genes. LgMaSp1_L3 shares aa motifs with the other loci, but the ensemble repeats are distinct (fig. 1B). A striking difference between LgMaSp1_L3 and the other loci is that its repetitive sequence has a more diverse aa composition (fig. 2; supplementary table S3 [Supplementary Material online]). Especially notable is the lower proportion of G and the higher proportions of S, P, L, V, and F in the ensemble repeats of LgMaSp1_L3 compared with the other L. geometricus and L. hesperus loci (figs. 1B and 2
Concerted Evolution and Gene Turnover
Relationships among the different types of silk proteins (MaSp1, MaSp2, TuSp, and Flag) are largely congruent according to N- and C-termini (fig. 3). Partitioned Bayesian analyses produced topologies with higher likelihood scores and posterior probabilities than unpartitioned analyses, and these are considered to represent the more optimal topologies. MP and partitioned Bayesian topologies (shown in fig. 3) are identical except for the placement of Nephila clavata TuSp1 among the egg case proteins in the N-terminal tree. The ML trees display slight differences from the MP and partitioned Bayesian analyses, but these differences are poorly supported (<55% ML bootstrap support). The egg case silk proteins (TuSp1 and TuSp2) and the major ampullate proteins (MaSp1 and MaSp2) are recovered as monophyletic with strong support in both terminal regions and all analyses (fig. 3). A striking pattern found in both the N- and C-terminal analyses is the grouping of all Latrodectus MaSp1 and MaSp2 sequences to the exclusion of MaSp1 and MaSp2 from other species. The additional MaSp1 sequences included in the C-terminal tree compared with the N-terminal tree confirm that the pattern is not an artifact of undersampling MaSp1.
Within the Latrodectus clades, incongruence between the N- and C-terminal relationships becomes apparent. Both termini recover a grouping of L. hesperus and L. geometricus MaSp2. In the C-terminal tree, the Latrodectus MaSp1 sequences are also monophyletic, but in the N-terminal tree, MaSp2 is nested within MaSp1 (fig. 3). The other major difference between the termini is that the LhMaSp1_L3 N-terminus is sister to the remaining Latrodectus MaSp1 and MaSp2 N-termini, but the LhMaSp1_L3 C-terminus is sister to LhMaSp1_L1.
Incongruence between N- and C-terminal tree topologies is likely the result of recombination and concerted evolution among MaSp1 loci as well as between MaSp1 and MaSp2. Because recombination events should affect limited tracts of DNA, concerted evolution can often be detected by identifying portions of genes that are significantly more similar than expected based on the overall divergence level (Sawyer 1989; Dover et al. 1993; Thornton and DeSalle 2000). Comparison of Ks values across N- and C-terminal coding and adjacent noncoding sequences of L. hesperus MaSp1 loci shows that the level of sequence divergence varies remarkably according to region (fig. 4). Evidence for recombination among L. hesperus MaSp1 loci is also well supported by mismatch analyses in GENECONV. The entire C-terminus and downstream flanking sequence of LhMaSp1_L1 have experienced a recombination event with LhMaSp1_L3 (P < 0.00001). The N- and C-terminal coding regions of LhMaSp1_L1 and L2 have also experienced recombination (P < 0.00001). In addition to recombination within the L. hesperus genome, there is evidence for past recombination between 150 bp of the N-termini of LgMaSp1_L3 and LhMaSp1_L1 and L2 (P = 0.00008 and 0.00016, respectively), although this significance is lost when variable aa codons are excluded.
Recombination has not only affected MaSp1 loci but also potentially occurred between MaSp1 loci and MaSp2. Results from GENECONV indicate that LhMaSp1_L3 and L. hesperus MaSp2 experienced a 90-bp recombination event in the N-terminus, although this pattern is only detected when the analysis is limited to the silent sites of coding sequence (P = 0.06). Additionally, a 73- to 108-bp recombination event was detected between the N-termini of L. hesperus MaSp2 and LgMaSp1_L3 (P = 0.00003, coding and noncoding sequences included; P = 0.006, only coding sequences included). However, this result may be a by-product of selection as no recombination was detected when limiting the analyses to silent sites.
The recombination results, in conjunction with the tree topologies, suggest recurrent concerted evolution among MaSp1 terminal regions within Latrodectus genomes. They also suggest that past recombination events between MaSp1 and MaSp2 occurred within L. hesperus and/or in the common ancestor of L. hesperus and L. geometricus. Occasional recombination (i.e., less often than speciation events) between MaSp1 and MaSp2 C-terminal coding regions likely explains their consistent within species or genus grouping found in other studies (Gatesy et al. 2001; Garb and Hayashi 2005; Garb et al. 2006).
Selection for similarity of MaSp1 and MaSp2 terminal regions may also play a role in their grouping within closely related species, which could explain why the recombination signal between L. hesperus MaSp2 and LgMaSp1_L3 was lost when only considering silent sites. Such similarity might facilitate their simultaneous expression in the major ampullate gland or the proper orientation of these 2 proteins into a single fiber. Furthermore, there is some evidence for convergence of the LgMaSp1_L3 N-terminus with the Latrodectus MaSp2 N-termini (see below).
In addition to concerted evolution among loci, intermittent duplication of MaSp1 may occur. LhMaSp1_L1–2 and LgMaSp1_L1–2 may have resulted from a duplication event prior to speciation of L. hesperus and L. geometricus. These loci group together with strong support in the N-terminal tree (fig. 3) and their ensemble repeats are very similar (fig. 1). However, it is possible that these loci experienced concerted evolution in their common ancestor. We currently cannot test for recombination involving these L. geometricus loci because there is no available flanking noncoding sequence. Results from L. hesperus (above) suggest that concerted evolution among loci typically involves the entire N- and/or C-terminal coding regions. Recombination could also occur between repetitive regions of different loci, but our analyses did not include repetitive sequences. However, LgMaSp1_L1 and LgMaSp1_L2 group together, suggesting concerted evolution occurs between these L. geometricus MaSp1 loci.
The presence of a MaSp1 pseudogene in L. hesperus suggests that there could be continual duplication and loss of MaSp1 copies. Such a pattern fits the birth-and-death model of gene family evolution (Nei and Hughes 1992; Nei and Rooney 2005). The birth-and-death model has often been framed as an alternative explanation to concerted evolution for the pattern of within-species gene copy similarity (e.g., Nei et al. 2000; Rooney et al. 2002; Rooney and Ward 2005). Our results indicate that if duplication and loss of gene copies play a role in the evolution of MaSp1 loci, then this process acts jointly with concerted evolution. A similar pattern of concerted evolution coupled with gene duplication and loss is found in other gene families, such as major histocompatibility complex genes (Joly and Rouillon 2006) and the Hsp70 gene superfamily (Nikolaidis and Nei 2004), reinforcing the assertion that these processes are not mutually exclusive.
Evolution of the Silk Gene Family
The presence of multiple copies of MaSp1 in Latrodectus genomes confirms that duplication of at least one silk gene has occurred. Duplication is thought to precede divergence in the evolution of new function (Stephens 1951; Nei 1969; Zhang 2003) and likely led to the evolution of the spectacularly diverse array of spider fibroins (Guerette et al. 1996; Gatesy et al. 2001). Not only do we find evidence for duplication of MaSp1 loci but we also find evidence for divergence when comparing LgMaSp1_L3 with the other loci. This locus has a much more variable aa composition in the repetitive region than the other L. geometricus and L. hesperus loci (fig. 2). Strikingly, LgMaSp1_L3 has proline (P) at the beginning of almost every other ensemble repeat (fig. 1B). The other loci either do not have any P in the repetitive region or it only occurs once at the beginning or end of the repetitive region. The aa motif GPG, which is hypothesized to form type II β-turns in silk proteins (Hayashi and Lewis 1998), is characteristic of MaSp2 (Hinman and Lewis 1992; Gatesy et al. 2001). This prediction is based on examination of thousands of empirically determined type II β-turns that consistently exhibit a P in the second position of the turn and a G in the third position (Hutchinson and Thornton 1994). After every other poly-A motif, LgMaSp1_L3 displays PG motifs (fig. 1B) that likely form type II β-turns. In MaSp2, the occurrence of multiple adjacent GPG motifs is expected to form β-spirals that contribute to the elasticity of dragline silk (Hayashi and Lewis 1998; Hayashi et al. 1999). The PG motifs in LgMaSp1_L3 display a longer periodicity than do the GPG motifs of MaSp2. Structural studies will be necessary to determine whether the consistent presence of PG in LgMaSp1_L3 results in a loose spiral that converges on the secondary structure of MaSp2.
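The difference in motif periodicity described above can be illustrated with a short sketch. The sequences below are invented toy strings, not actual MaSp1 or MaSp2 sequences, and the function names `motif_positions` and `spacings` are hypothetical. The sketch only shows how one might locate GPG or PG motifs in a protein sequence and measure the distance between consecutive occurrences.

```python
import re


def motif_positions(seq, motif):
    """Return 0-based start positions of every occurrence of `motif`.

    A zero-width lookahead is used so that overlapping occurrences
    are also reported.
    """
    pattern = "(?=" + re.escape(motif) + ")"
    return [m.start() for m in re.finditer(pattern, seq)]


def spacings(positions):
    """Return the distances between consecutive motif start positions."""
    return [b - a for a, b in zip(positions, positions[1:])]


# Toy sequences (assumptions for illustration only).
# Closely spaced GPG motifs, loosely imitating a MaSp2-like repeat.
toy_gpg_rich = "GPGGYGPGQQGPGAAAAAAA"
# PG after every other poly-A block, loosely imitating the LgMaSp1_L3 pattern.
toy_pg_sparse = "PGAGQGG" + "AAAAAA" + "GGAGQGG" + "AAAAAA" + "PGAGQGG"

gpg_spacing = spacings(motif_positions(toy_gpg_rich, "GPG"))
pg_spacing = spacings(motif_positions(toy_pg_sparse, "PG"))
print("GPG spacing:", gpg_spacing)
print("PG spacing:", pg_spacing)
```

In the toy data the PG motifs are separated by many more residues than the GPG motifs, which is the sense in which a longer periodicity is meant; whether such spacing permits a spiral structure is a question for structural studies, not for sequence scanning.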
Additional evidence that LgMaSp1_L3 may be converging on MaSp2 comes from the nonrepetitive N-terminus. Phylogenetic analysis of the encoding nucleotides weakly supports a relationship joining LgMaSp1_L3 with Latrodectus MaSp2 (<58% MP bootstrap support and posterior probability; not recovered in ML tree). Furthermore, analyses of only the third codon positions (presumed neutral) group LgMaSp1_L3 with other MaSp1 loci (61% MP bootstrap support, recovered in ML tree but with <50% bootstrap support). In contrast, parsimony analysis of the aa strongly supports LgMaSp1_L3 grouping with MaSp2 (90% bootstrap).
These findings show the potential for a MaSp1 duplicate to give rise to a MaSp2-like gene. MaSp1 has been described from phylogenetically diverse spider taxa (e.g., Gatesy et al. 2001; Rising et al. 2006). In contrast, MaSp2 has only been described from the Orbiculariae (orb-weaving spiders and their relatives), implying that the MaSp2 paralog arose in the common ancestor of Orbiculariae (Garb et al. 2006). Our results further suggest that this novel gene derived from a MaSp1 duplicate.
Expression of MaSp1 Loci
The MaSp1 loci described here appear to be under functional constraints (i.e., low Ka/Ks values, excluding the pseudogene), indicating that each of these loci encodes a functional protein. The similarity of cDNA sequences to 2 of the L. hesperus loci (L1 and L2) further indicates that these genes are expressed. However, the timing and extent of expression are currently unknown. Functional gene copies can be retained in a genome because of the benefits associated with higher gene dosage producing additional protein product (Ohno 1970; Zhang 2003; Kondrashov and Kondrashov 2006). Spiders produce large amounts of dragline silk throughout their lifetimes (Foelix 1996), and thus, the genes encoding the component proteins must be highly expressed. Additionally, it has been shown in Nephila clavipes that MaSp1 is more abundant in dragline silk fibers than MaSp2; the relative proportion of aa in dragline fibers compared with MaSp1 and MaSp2 protein sequences combined with immunological evidence suggests a 3:2 ratio of MaSp1:MaSp2 (Hinman and Lewis 1992; Sponner et al. 2005). A similar situation likely applies to black widows. We compared the aa composition of the full-length MaSp1 and MaSp2 sequences with the previously documented aa composition of L. hesperus dragline fibers (Casem et al. 1999). We found that a 1:1 ratio of MaSp1:MaSp2 cannot account for the proportions of serine and P found in black widow dragline fibers. However, if MaSp1 is approximately 2.5 times more abundant than MaSp2, then the average aa composition of the sequences fits the aa composition of the fiber. Thus, selection for increased MaSp1 protein product could favor the maintenance of multiple functional copies in Latrodectus genomes.
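The reasoning behind the ratio estimate can be sketched as a weighted average of the two proteins' aa compositions. The example below is a minimal illustration under stated assumptions, not the procedure used in this study: the composition values are invented placeholders, the function names are hypothetical, and the ratio is treated as a per-residue weighting (which is equivalent to a molar ratio only if the two proteins are of similar length).

```python
def mixture_composition(comp1, comp2, ratio):
    """Expected aa fractions for `ratio` parts of protein 1 to 1 part of protein 2."""
    w1 = ratio / (ratio + 1.0)
    w2 = 1.0 - w1
    residues = set(comp1) | set(comp2)
    return {aa: w1 * comp1.get(aa, 0.0) + w2 * comp2.get(aa, 0.0) for aa in residues}


def squared_error(predicted, observed):
    """Sum of squared differences over the residues measured in the fiber."""
    return sum((predicted.get(aa, 0.0) - observed[aa]) ** 2 for aa in observed)


def best_ratio(comp1, comp2, observed, candidates):
    """Return the candidate ratio whose predicted mixture best fits `observed`."""
    return min(
        candidates,
        key=lambda r: squared_error(mixture_composition(comp1, comp2, r), observed),
    )


# Invented placeholder fractions for serine (S) and proline (P); NOT measured values.
toy_protein1 = {"S": 0.04, "P": 0.00}
toy_protein2 = {"S": 0.08, "P": 0.14}
# A synthetic "fiber" built from a known 3:1 mixture, so the fit can be checked.
toy_fiber = mixture_composition(toy_protein1, toy_protein2, 3.0)

candidate_ratios = [r / 2 for r in range(1, 11)]  # 0.5, 1.0, ..., 5.0
recovered = best_ratio(toy_protein1, toy_protein2, toy_fiber, candidate_ratios)
print("best-fitting ratio:", recovered)
```

Because the toy fiber is constructed from a known mixture, the grid search recovers that ratio; with real data the fit would be approximate and would depend on which residues are compared.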
Alternatively, gene duplicates can diverge in expression, that is, be activated in different tissues or at different developmental stages. Divergence in expression patterns is thought to be an important first step in the evolution of functional divergence between redundant genes (Ohno 1970; Li et al. 2005). Even small changes in the regulatory regions of duplicate genes can lead to expression divergence (Gu et al. 2002; Castillo-Davis et al. 2004; Zhang et al. 2004). The average pairwise difference of the 500 bp of upstream flanking sequences of the functional L. hesperus MaSp1 loci is 38%, providing raw material for divergence in cis-regulatory motifs. However, the immediate 150 upstream bases are more conserved, with the pairwise Ks for this region ranging from 0.20 to 0.24, even though the N-terminal Ks ranges from 0.07 to 1.15 (fig. 4). This pattern of flanking sequence conservation suggests similar functional constraints on the regulatory regions of each of the 3 L. hesperus loci. However, it is still possible that widow spider MaSp1 loci could be expressed at different developmental stages. In N. clavipes, some evidence suggests a developmental change in investment in various components of the spider's web, specifically between juveniles and adults in the construction of prey capture versus protective barrier webs (Higgins 1992). The possession of multiple MaSp1 loci could be associated with variation in the dragline silk used to build these different structures. Differential expression of MaSp1 loci could also explain the reported changes in aa composition of some spiders' dragline silk in response to diet (Craig et al. 2000; Tso et al. 2005). However, this latter hypothesis seems unlikely for L. hesperus, given the substantial similarity in predicted aa composition of the 3 functional loci (fig. 2).
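The average pairwise difference cited above for the upstream flanking sequences is, in general terms, the mean proportion of differing sites across all pairs of aligned sequences. The sketch below shows one way such a value could be computed; it is an illustration under assumptions rather than the software or settings used in this study. The aligned sequences are invented toy strings, the function names are hypothetical, and alignment columns containing a gap are simply skipped.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Columns in which either sequence has a gap are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def mean_pairwise_difference(sequences):
    """Average p-distance over all unordered pairs of aligned sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy alignment of three short sequences (invented for illustration).
toy_alignment = [
    "ACGTACGTAC",
    "ACGTTCGTAC",
    "ACG-ACGAAC",
]
print("mean pairwise difference:", mean_pairwise_difference(toy_alignment))
```

Skipping gapped columns is only one possible convention; counting gaps as differences or applying a substitution model would give different values.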
| Conclusions |
|---|
We provide compelling evidence that multiple genes encode one of the spider fibroins. Our data suggest that at least 3 functional copies of MaSp1 exist in widow spider (Latrodectus) genomes. Concerted evolution has maintained homogeneity of the coding sequences of these loci. Continual gene duplication and loss may also occur. We additionally find evidence for recombination between the nonrepetitive terminal coding regions of Latrodectus MaSp1 and MaSp2. Based on the phylogenetic distribution of MaSp2 versus MaSp1 and potential convergence of one MaSp1 locus on MaSp2, we propose that MaSp2 derived from a MaSp1 duplicate in the common ancestor of orb-weaving spiders and their relatives (Orbiculariae). Expression divergence could have occurred among MaSp1 loci, and future work on their expression patterns should help elucidate the early evolution and functional diversification of the spider silk gene family.
| Supplementary Material |
|---|
Supplementary tables S1–S3 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
| Acknowledgements |
|---|
We thank Dayan Colon-Sanchez for help with PCR. We also thank Laura Baldo, Jessica Garb, John Gatesy, and anonymous reviewers for comments on the manuscript. The research was funded by the Army Research Office (DAAD19-02-1-0358 and W911NF-06-1-0455) and the National Science Foundation (DEB-0236020).
| Footnotes |
|---|
David Irwin, Associate Editor
| References |
|---|
Ayoub NA, Garb JE, Tinghitella RM, Collin MA, Hayashi CY. Blueprint for a high-performance biomaterial: full-length spider dragline silk genes. PLoS One (2007) 2:e514.
Beckwitt R, Arcidiacono S. Sequence conservation in the C-terminal region of spider silk proteins (spidroin) from Nephila clavipes (Tetragnathidae) and Araneus bicentenarius (Araneidae). J Biol Chem (1994) 269:6661–6663.
Beckwitt R, Arcidiacono S, Stote R. Evolution of repetitive proteins: spider silks from Nephila clavipes (Tetragnathidae) and Araneus bicentenarius (Araneidae). Insect Biochem Mol Biol (1998) 28:121–130.
Blackledge TA, Hayashi CY. Silken toolkits: biomechanics of silk fibers spun by the orb web spider Argiope argentata (Fabricius 1775). J Exp Biol (2006) 209:2452–2461.
Brown DD, Wensink PC, Jordan E. Xenopus laevis and Xenopus mulleri: the evolution of tandem genes. J Mol Biol (1972) 63:57–73.
Casem ML, Turner D, Houchin K. Protein and amino acid composition of silks from the cob weaver, Latrodectus hesperus (black widow). Int J Biol Macromol (1999) 24:103–108.
Castillo-Davis CI, Hartl DL, Achaz G. Cis-regulatory and protein evolution in orthologous and duplicate genes. Genome Res (2004) 14:1530–1536.
Chan CX, Beiko RG, Ragan MA. Detecting recombination in evolving nucleotide sequences. BMC Bioinformatics (2006) 7:412.
Colgin MA, Lewis RV. Spider minor ampullate silk proteins contain new repetitive sequences and highly conserved non-silk-like "spacer regions". Protein Sci (1998) 7:667–672.
Craig CL, Riekel C, Herberstein ME, Weber RS, Kaplan D, Pierce NE. Evidence for diet effects on the composition of silk proteins produced by spiders. Mol Biol Evol (2000) 17:1904–1913.
Dover GA, Linares AR, Bowen T, Hancock JM. Detection and quantification of concerted evolution and molecular drive. Meth Enzymol (1993) 224:525–541.
Foelix R. Biology of Spiders (1996) New York: Oxford University Press.
Garb JE, DiMauro T, Vo V, Hayashi CY. Silk genes support the single origin of orb webs. Science (2006) 312:1762.
Garb JE, González A, Gillespie RG. The black widow spider genus Latrodectus (Araneae: Theridiidae): phylogeny, biogeography, and invasion history. Mol Phylogenet Evol (2004) 31:1127–1142.
Garb JE, Hayashi CY. Modular evolution of egg case silk genes across orb-weaving spider superfamilies. Proc Natl Acad Sci USA (2005) 102:11379–11384.
Gatesy J, Hayashi C, Motriuk D, Woods J, Lewis R. Extreme diversity, conservation, and convergence of spider silk fibroin sequences. Science (2001) 291:2603–2605.
Gosline JM, Guerette PA, Ortlepp CS, Savage KN. The mechanical design of spider silks: from fibroin sequence to mechanical function. J Exp Biol (1999) 202:3295–3303.
Gu Z, Nicolae D, Lu HH-S, Li W-H. Rapid divergence in expression between duplicate genes inferred from microarray data. Trends Genet (2002) 18:609–613.
Guerette PA, Ginzinger DG, Weber BHF, Gosline JM. Silk properties determined by gland-specific expression of a spider fibroin gene family. Science (1996) 272:112–115.
Hayashi CY, Blackledge TA, Lewis RV. Molecular and mechanical characterization of aciniform silk: uniformity of iterated sequence modules in a novel member of the spider silk fibroin gene family. Mol Biol Evol (2004) 21:1950–1959.
Hayashi CY, Lewis RV. Evidence from flagelliform silk cDNA for the structural basis of elasticity and modular nature of spider silks. J Mol Biol (1998) 275:773–784.
Hayashi CY, Lewis RV. Molecular architecture and evolution of a modular spider silk protein gene. Science (2000) 287:1477–1479.
Hayashi CY, Shipley NH, Lewis RV. Hypotheses that correlate the sequence, structure, and mechanical properties of spider silk proteins. Int J Biol Macromol (1999) 24:271–275.
Higgins LE. Developmental changes in barrier web structure under different levels of predation risk in Nephila clavipes (Araneae: Tetragnathidae). J Insect Behav (1992) 5:635–655.
Higgins LE, White S, Nuñez-Farfán J, Vargas J. Patterns of variation among distinct alleles of the Flag silk gene from Nephila clavipes. Int J Biol Macromol (2007) 40:201–216.
Hinman MB, Lewis RV. Isolation of a clone encoding a second dragline silk fibroin. Nephila clavipes dragline silk is a two-protein fiber. J Biol Chem (1992) 267:19320–19324.
Hutchinson EG, Thornton JM. A revised set of potentials for β-turn formation in proteins. Protein Sci (1994) 3:2207–2216.
Joly E, Rouillon V. The orthology of HLA-E and H2-QaI is hidden by their concerted evolution with other MHC class I molecules. Biol Direct (2006) 1:2.
Kondrashov FA, Kondrashov AS. Role of selection and fixation of gene duplications. J Theor Biol (2006) 239:141–151.
Li W-H, Yang J, Gu X. Expression divergence between duplicate genes. Trends Genet (2005) 21:602–607.
Motriuk-Smith D, Smith A, Hayashi CY, Lewis RV. Analysis of the conserved N-terminal domains in major ampullate spider silk proteins. Biomacromolecules (2005) 6:3152–3159.
Nei M. Gene duplication and nucleotide substitution in evolution. Nature (1969) 221:40–42.
Nei M, Hughes AL. Balanced polymorphism and evolution by the birth-and-death process in the MHC loci. In: Histocompatibility Workshop and Conference. Tsuji K, Aizawa M, Sasazuki T, eds. (1992) Oxford: Oxford University Press. 27–38.
Nei M, Rogozin IB, Piontkivska H. Purifying selection and birth-and-death evolution in the ubiquitin gene family. Proc Natl Acad Sci USA (2000) 97:10866–10871.
Nei M, Rooney AP. Concerted and birth-and-death evolution of multigene families. Annu Rev Genet (2005) 39:121–152.
Nikolaidis N, Nei M. Concerted and nonconcerted evolution of the Hsp70 gene superfamily in two sibling species of nematodes. Mol Biol Evol (2004) 21:498–505.
Ohno S. Evolution by gene duplication (1970) New York: Springer.
Posada D, Crandall KA. MODELTEST: testing the model of DNA substitution. Bioinformatics (1998) 14:817–818.
Rising A, Hjälm G, Engström W, Johansson J. N-terminal nonrepetitive domain common to dragline, flagelliform, and cylindriform spider silk proteins. Biomacromolecules (2006) 7:3120–3124.
Rising A, Johansson J, Larson G, Bongcam-Rudloff E, Engström W, Hjälm G. Major ampullate spidroins from Euprosthenops australis: multiplicity at protein, mRNA and gene levels. Insect Mol Biol (2007) 16:551–561.
Ronquist F, Huelsenbeck JP. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics (2003) 19:1572–1574.
Rooney AP, Piontkivska H, Nei M. Molecular evolution of the nontandemly repeated genes of the histone H3 multigene family. Mol Biol Evol (2002) 19:68–75.
Rooney AP, Ward TJ. Evolution of a large ribosomal RNA multigene family in filamentous fungi: birth and death of a concerted evolution paradigm. Proc Natl Acad Sci USA (2005) 102:5084–5089.
Rozas J, Sánchez-DelBarrio JC, Messeguer X, Rozas R. DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics (2003) 19:2496–2497.
Sawyer S. Statistical tests for detecting gene conversion. Mol Biol Evol (1989) 6:526–538.
Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W. Human-mouse alignments with BLASTZ. Genome Res (2003) 13:103–107.
Schwartz S, Zhang Z, Frazer KA, Smit A, Riemer C, Bouck J, Gibbs R, Hardison R, Miller W. PipMaker—a web server for aligning two genomic DNA sequences. Genome Res (2000) 10:577–586.
Selden PA, Shear WA, Bonamo PM. A spider and other arachnids from the Devonian of New York, and reinterpretation of Devonian Araneae. Palaeontology (1991) 34:241–281.
Sponner A, Schlott B, Vollrath F, Unger E, Grosse F, Weisshart K. Characterization of the protein components of Nephila clavipes dragline silk. Biochemistry (2005) 44:4727–4736.
Stephens SG. Possible significance of duplication in evolution. Adv Genet (1951) 4:247–265.
Swofford DL. PAUP*. Phylogenetic analysis using parsimony (*and other methods). Version 4 (2002) Sunderland (MA): Sinauer Associates.
Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res (1994) 22:4673–4680.
Thornton JW, DeSalle R. Gene family evolution and homology: genomics meets phylogenetics. Annu Rev Genomics Hum Genet (2000) 1:41–73.
Tian M, Lewis RV. Molecular characterization and evolutionary study of spider tubuliform (egg case) silk protein. Biochemistry (2005) 44:8006–8012.
Tso I-M, Wu H-C, Hwang I-R. Giant wood spider Nephila pilipes alters silk protein in response to prey variation. J Exp Biol (2005) 208:1053–1061.
Xu M, Lewis RV. Structure of a protein superfiber: spider dragline silk. Proc Natl Acad Sci USA (1990) 87:7120–7124.
Zhang J. Evolution by gene duplication: an update. Trends Ecol Evol (2003) 18:292–299.
Zhang Z, Gu J, Gu X. How much expression divergence after yeast gene duplication could be explained by regulatory motif evolution? Trends Genet (2004) 20:403–407.
Zhao A, Zhao T, Nakagaki K, et al, (11 co-authors). Novel molecular and mechanical properties of egg case silk from wasp spider, Argiope bruennichi. Biochemistry (2006) 45:3348–3356.
Zimmer EA, Martin SL, Beverley SM, Kan YW, Wilson AC. Rapid duplication and loss of genes coding for the alpha chains of hemoglobin. Proc Natl Acad Sci USA (1980) 77:2158–2162.