

MBE Advance Access originally published online on December 18, 2006
Molecular Biology and Evolution 2007 24(3):679-686; doi:10.1093/molbev/msl199

© The Author 2006. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org

Research Articles

Not Born Equal: Increased Rate Asymmetry in Relocated and Retrotransposed Rodent Gene Duplicates

Brian P. Cusack and Kenneth H. Wolfe

Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland

E-mail: khwolfe{at}tcd.ie.


    Abstract
Duplicated genes frequently evolve at different rates. This asymmetry is evidence of natural selection's ability to discriminate between the 2 copies, subjecting them to different levels of purifying selection or even permitting adaptive evolution of one or both copies. However, if gene duplication creates pairs of protein-coding sequences that are initially identical, this raises the question of how selection tells the 2 copies apart. Here, we investigated asymmetric sequence divergence of recently duplicated genes in rodents and related this to 2 possible sources of such asymmetry: gene relocation as a consequence of duplication and retrotransposition as a mechanism of gene duplication. We found that most young rodent duplicates that have been relocated were created by retrotransposition. The degree of rate asymmetry in gene pairs where one copy has been relocated (either by retrotransposition or DNA-based duplication) is greater than in pairs formed by local DNA-based duplication events. Furthermore, by considering the direction of transposition for distant duplicates, we found a consistent tendency for retrogenes to undergo accelerated protein evolution relative to their static paralogs, whereas DNA-based transpositions showed no such tendency. Finally, we demonstrate that the faster sequence evolution of retrogenes correlates with the profound alteration of their expression pattern that is precipitated by retrotransposition.

Key Words: retrotransposition • gene duplication • asymmetry


    Introduction
Several genome-wide studies of gene duplication have sought to investigate the processes by which the fates of the paralogs generated by a duplication event become uncoupled (Kondrashov et al. 2002; Zhang et al. 2003; Kellis et al. 2004). This question is relevant to determining the relative importance of different processes governing the fixation of duplicates (Lynch and Katju 2004). The period immediately following gene duplication is commonly described as one of genetic redundancy between functionally equivalent copies. This situation was originally thought to be resolved by 1 of 2 alternative mechanisms: nonfunctionalization resulting from loss of a superfluous copy or neofunctionalization in which a gain of function in 1 copy leads to the retention of both copies (Ohno 1970). More recently, the subfunctionalization model for preservation of duplicate genes has received much attention. This model proposes that pairs are retained if they partition the subfunctions of the ancestral gene between them (Force et al. 1999).

The nonfunctionalization and neofunctionalization processes predict accelerated evolution of the deconstrained copy compared with the paralog that remains constrained by purifying selection on the ancestral function. Unequal rates of duplicate gene divergence are therefore expected under both scenarios. In contrast, if duplicates are retained by equally strong purifying selection on both copies through a gene dosage effect, duplicate divergence should be roughly symmetrical (Lynch and Katju 2004), with any observed rate asymmetry resulting purely from stochastic effects. Finally, the subfunctionalization model does not make any prediction about the asymmetry (or otherwise) of sequence evolution because the ancestral subfunctions could be divided equally or unequally between duplicates. In this context, recent work has begun to shed light on the extent to which asymmetric sequence divergence is explained by asymmetric functional divergence (Kim and Yi 2006).

Although previous studies have reached conflicting conclusions as to whether asymmetry in sequence divergence following gene duplication is common (Conant and Wagner 2003; Zhang et al. 2003) or relatively rare (Kondrashov et al. 2002), rate differences between duplicates are often interpreted as evidence that natural selection can somehow differentiate between gene copies that were initially identical and thus functionally interchangeable. However, these studies have largely ignored the mechanism by which genes are duplicated, an issue that is relevant to the assumption that all duplicates are equal at birth (Lynch and Katju 2004; Katju and Lynch 2006).

Single-gene duplications in metazoan genomes can be formed by 2 mechanisms, DNA based and RNA based, that probably differ in their likelihood of generating identical paralogous copies of an ancestral gene. DNA-based duplication (copying of segments of chromosome) is expected to create 2 copies that are indistinguishable, with conservation of both the exon–intron structure and the regulatory sequences (provided that the entire gene and its promoter are contained within the duplication span [Katju and Lynch 2003]). Furthermore, if a DNA-based duplication occurs by the tandem duplication of a single gene or by segmental duplication of a group of linked genes, the duplication will not cause any extensive disruption of synteny. In contrast, gene duplication by retrotransposition (Soares et al. 1985; Boer et al. 1987) creates a new duplicate that differs from its parent in a number of respects. The retrocopy is created by reverse transcription of a spliced messenger RNA, typically creating a single-exon copy of a multiexon parental gene. In addition, because only the transcribed sequence is duplicated, the retrocopy becomes detached from the ancestral promoter that controlled expression of the parental gene. Only if a new promoter is acquired by the retrocopy is it likely to survive as a functional retrogene. Furthermore, new genes formed by retrotransposition are usually not physically linked to their parents, so synteny is disrupted. The newly created retrogene is deposited in a novel chromosomal environment with a different set of gene neighbors.

In this study, we investigated the impact of 2 potential causes of rate asymmetry in duplicated mammalian genes: the genomic relocation that may occur as a consequence of gene duplication and the mechanism of duplication (via DNA or RNA). We hypothesized that if syntenic context is an important aspect of gene function (e.g., due to the chromosomal clustering of coregulated genes [Lercher et al. 2002; Singer et al. 2005]), then gene duplications that result in gene relocation may create duplicates that are not functionally equivalent. This might be expected to increase rate asymmetry among relocated duplicates. Similarly, duplicates created by retrotransposition might be expected to show asymmetrical rates of evolution due to the almost inevitable regulatory changes associated with retrotransposition, even if the protein sequence is unaltered by the duplication event itself. In order to focus on recently duplicated genes, we consider genes that have become duplicated since the divergence of mouse and rat. Although it is widely assumed that retrogenes will show fast rates of evolution compared with their progenitors, to our knowledge this has only been demonstrated in one, very recent, study (Gayral et al. 2006).


    Methods
Recent Rodent Duplicates
We retrieved gene duplicates from the Homolens (version 1) database of automatically inferred phylogenies constructed using Ensembl gene predictions (Penel S, Duret L, personal communication; http://pbil.univ-lyon1.fr/databases/homolens.html) and queried using FamFetch (Dufayard et al. 2005). We searched for cases of recent lineage-specific gene duplication in rodents, where a single gene in rat or mouse has exactly 2 coorthologs in the second species. Homolens internal identifiers were mapped to Ensembl identifiers, which were then used to retrieve map locations. Because Ensembl contains some annotated "introns" that are frameshift corrections, for analyses of intron content, we only considered annotated introns that are ≥50 nt and flanked by coding exons.

Gene Duplication Categories
We categorized recent rodent duplicates on the basis of 2 criteria: relative location in the genome and mechanism of duplication. We designated all physically linked duplicate pairs with <5 intervening genes as "local" duplications. All other duplicates either on the same chromosome or on different chromosomes were classified as "distant."
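The local/distant rule can be sketched as follows. This is only an illustration: the `GeneLocus` type and the use of a gene's rank along its chromosome (`gene_index`) are assumptions made here, not details given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneLocus:
    chromosome: str
    gene_index: int  # hypothetical: rank of the gene along its chromosome


def classify_location(a: GeneLocus, b: GeneLocus, max_intervening: int = 4) -> str:
    """Return 'local' for linked pairs with <5 intervening genes, else 'distant'."""
    if a.chromosome != b.chromosome:
        return "distant"
    # Number of genes lying strictly between the two duplicates.
    intervening = abs(a.gene_index - b.gene_index) - 1
    return "local" if intervening <= max_intervening else "distant"
```

Under this rule, a pair on the same chromosome with 5 or more intervening genes is treated exactly like a pair on different chromosomes.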

We classified duplicated genes on the basis of duplication mechanism by distinguishing between RNA-mediated retrotranspositions (which typically create a single-exon retrogene from a multiexon paralog) and DNA-based transpositions (which typically conserve exon–intron structure), using a rigorous set of criteria based on counts of coding exons. For pairs consisting of a single-exon gene with a multiexon paralog, we counted the introns of the latter gene that lie within the protein alignment of the 2 duplicates. If ≥2 introns were present, we inferred that duplication had occurred by retrotransposition resulting in the loss of these introns in the retrocopy. Because all detected retrogenes have a single coding exon, this set excludes retrogenes that have been incorporated into chimeric coding regions following gene-fusion events, but potentially includes cases that have acquired noncoding exons de novo following retrotransposition (Vinckenbosch et al. 2006). If both members of a duplicate pair contained ≥2 exons, we counted introns within the alignment, and where there was evidence of not more than 1 intron loss, we inferred that duplication had occurred by DNA-based transposition. Although these strict criteria allow confident inference of the mechanism of duplication for many pairs, they leave some pairs unclassified. For example, when both members of a duplicate pair are single-exon genes, we were not able to infer the mechanism. Such unclassified pairs were not used for the analysis of the impact of duplication mechanism on rate asymmetry, but were included in the analysis of the effect of duplicate relocation.
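The exon-count criteria above can be written as a small decision function. The sketch below is a simplified reading of those criteria: the function name and arguments are hypothetical, and "intron loss" is approximated here as the difference between the two genes' intron counts within the alignment, which is an assumption rather than the procedure actually used.

```python
from typing import Optional


def classify_mechanism(exons_a: int, exons_b: int,
                       introns_in_alignment_a: int,
                       introns_in_alignment_b: int) -> Optional[str]:
    """Return 'retrotransposition', 'dna_based', or None (unclassified)."""
    # Case 1: single-exon gene paired with a multiexon paralog.
    if min(exons_a, exons_b) == 1 and max(exons_a, exons_b) > 1:
        multi_introns = (introns_in_alignment_a if exons_a > 1
                         else introns_in_alignment_b)
        # At least 2 introns of the multiexon gene must lie within the alignment.
        return "retrotransposition" if multi_introns >= 2 else None
    # Case 2: both genes have at least 2 exons.
    if exons_a >= 2 and exons_b >= 2:
        # Assumption: intron loss is approximated by the difference in counts.
        lost = abs(introns_in_alignment_a - introns_in_alignment_b)
        return "dna_based" if lost <= 1 else None
    # Case 3: e.g. two single-exon genes; the mechanism cannot be inferred.
    return None
```

Returning `None` for ambiguous pairs mirrors the choice to leave such pairs unclassified rather than force them into either category.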

Direction of (retro)Transposition of Distant Duplicates
For distant duplicates, we established the direction of (retro)transposition to discriminate between the relocated paralog and the static paralog that remains at the ancestral locus. This was done using a framework of positional anchors consisting of unduplicated single-copy genes for which there is a 1:1:1 orthologous relationship between human, mouse, and rat. These singletons were retrieved from Homolens using FamFetch with the query topology ((mouse, rat), human) constrained so that no gene duplication has occurred since the primate–rodent split.

To establish the direction of (retro)transposition of distantly separated mouse duplicates that are coorthologs of a single rat gene, for example, we located the closest singleton anchors that bracket the rat gene (fig. 1A). We then determined the locations of the single mouse orthologs of the rat bracketing genes. When the mouse orthologs of a pair of rat bracketing genes are linked in mouse, and themselves bracket 1 of the 2 mouse duplicates, we designated the bracketed mouse duplicate as the static copy and the other mouse duplicate as the relocated duplicate. Assignment of the direction of (retro)transposition by this method was possible for 118 of the 147 distant duplicates in mouse and for 106 of the 137 distant duplicates in rat.
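A minimal sketch of the bracketing test follows. It assumes each gene is reduced to a hypothetical `(chromosome, coordinate)` tuple in the species carrying the duplicates, and it returns no call when both or neither duplicate is bracketed. That conservative choice is an assumption of this sketch.

```python
from typing import Optional, Tuple

Position = Tuple[str, int]  # hypothetical (chromosome, coordinate) pair


def is_bracketed(dup: Position, left: Position, right: Position) -> bool:
    """True if both anchors are linked to the duplicate and flank it."""
    same_chrom = dup[0] == left[0] == right[0]
    lo, hi = sorted((left[1], right[1]))
    return same_chrom and lo < dup[1] < hi


def assign_static_copy(dup1: Position, dup2: Position,
                       anchor_x: Position, anchor_z: Position) -> Optional[int]:
    """Return 1 or 2 for the duplicate inferred to be static, else None."""
    b1 = is_bracketed(dup1, anchor_x, anchor_z)
    b2 = is_bracketed(dup2, anchor_x, anchor_z)
    if b1 and not b2:
        return 1
    if b2 and not b1:
        return 2
    return None  # direction cannot be assigned
```

Here `anchor_x` and `anchor_z` stand for the orthologs of the singleton genes that bracket the unduplicated gene in the other rodent species.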


FIG. 1.— (A) Determining the direction of transposition for distantly separated duplicates: Duplication of gene Y in the mouse lineage following mouse–rat speciation created 2 mouse duplicates (paralogs YM1 and YM2), which are coorthologs of a single rat gene (YR). To polarize the direction of transposition in mouse and thus discriminate between the static and transposed duplicates, we considered genes XR and ZR that flank gene YR in rat and have single orthologs in mouse (XM and ZM). Because both XM and ZM are found to flank YM1, this duplicate can be designated the static copy implying that its paralog (YM2) has been relocated by retrotransposition. (B) Classification by duplication mechanism of duplicate pairs having branch-specific dS > 0.001 and dN > 0.001. A resampling strategy was applied to these 98 duplicates to separately determine the effect of duplicate relocation and retrotransposition on rate asymmetry measured by RN.

 
Measures of Sequence Evolution
For each sequence triplet consisting of a single-copy gene in 1 rodent species and its 2 coorthologs in the second rodent species, we aligned the Ensembl protein sequences using ClustalW (Thompson et al. 1994) and back-translated the protein alignment to create a codon-based alignment. These alignments were used as input to the program like-tri-test (Conant and Wagner 2003) to estimate branch-specific rates of synonymous (dS) and nonsynonymous divergence (dN) as well as branch-specific estimates of dN/dS.

For the branches leading to each duplicate, we quantified the magnitude of asymmetry for estimates of dS, dN, and ω (=dN/dS). In cases where the branch-specific estimate of dS is very small for either duplicate, an artifactually large asymmetry in dN/dS may result; so in this analysis we used only those cases for which branch-specific estimates of both dS and dN for both duplicates were >0.001. For the branches leading from the internal node to each duplicate, we used the absolute (unsigned) normalized difference in divergence as a measure of asymmetric evolution. For example, using the notation of Kim and Yi (2006), we quantified the asymmetry in synonymous evolution between duplicates as follows:

RS = |dS1 − dS2| / (dS1 + dS2),
where dS1 and dS2 are the synonymous divergences estimated for the branches leading from the internal node to duplicates 1 and 2, respectively. Thus RS values of 0.33 and 0.60 correspond to rate differences of 2-fold and 4-fold, respectively. Absolute normalized differences in nonsynonymous divergence (RN) and strength of selective constraint (Rω) were calculated similarly.
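The 2-fold and 4-fold correspondences quoted above follow if the normalized difference has the form |a − b| / (a + b). The sketch below assumes that form and uses hypothetical function names. It also converts between a fold difference and the corresponding R value.

```python
def normalized_difference(a: float, b: float) -> float:
    """Absolute normalized difference |a - b| / (a + b) for two branch estimates."""
    if a < 0 or b < 0:
        raise ValueError("divergence estimates must be non-negative")
    total = a + b
    if total == 0:
        raise ValueError("asymmetry is undefined when both estimates are zero")
    return abs(a - b) / total


def fold_to_r(fold: float) -> float:
    """R value implied by a given fold difference (fold >= 1)."""
    return (fold - 1.0) / (fold + 1.0)


def r_to_fold(r: float) -> float:
    """Fold difference implied by a given R value (0 <= r < 1)."""
    return (1.0 + r) / (1.0 - r)


# A 2-fold difference gives R = 1/3 and a 4-fold difference gives R = 0.6.
print(round(fold_to_r(2.0), 2), round(fold_to_r(4.0), 2))  # 0.33 0.6
```

Because R is bounded between 0 and 1, it compresses large fold differences: R = 0.6 already corresponds to a 4-fold difference, and R approaches 1 only as one branch estimate approaches zero.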

For distant duplicates for which we could discern the direction of transposition, we derived a measure of signed (directional) asymmetry in nonsynonymous divergence (Kim and Yi 2006),

SRN = (dNr − dNs) / (dNr + dNs),
where dNs and dNr refer to nonsynonymous substitutions on the terminal branches leading to the static and relocated duplicates, respectively. Thus, when SRN > 0, the relocated duplicate has accelerated at the amino acid level compared with its paralog at the ancestral locus.
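A signed counterpart of the normalized difference can be sketched in the same way. The form used below, (relocated − static) / (relocated + static), is an assumption chosen so that positive values indicate acceleration of the relocated copy, as stated above. The function names are hypothetical.

```python
def signed_asymmetry(static: float, relocated: float) -> float:
    """Signed normalized difference; > 0 when the relocated copy is faster."""
    total = static + relocated
    if total <= 0:
        raise ValueError("at least one estimate must be positive")
    return (relocated - static) / total


def relocated_to_static_ratio(sr: float) -> float:
    """Rate ratio (relocated / static) implied by a signed asymmetry value."""
    if not -1.0 < sr < 1.0:
        raise ValueError("signed asymmetry must lie strictly between -1 and 1")
    return (1.0 + sr) / (1.0 - sr)
```

For a single pair, a signed value of 0.5 means the relocated copy has a nonsynonymous divergence 3 times that of the static copy, and a value of −0.5 means the reverse.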

Prevalence of Significantly Asymmetric Sequence Divergence
We examined the prevalence of significantly asymmetric sequence divergence using a likelihood-based approach. For each pair of duplicates, we tested whether a model of unconstrained evolution on the branches leading to each duplicate gave a significantly better fit to the data than a null model in which the duplicates were constrained to evolve symmetrically. We used like-tri-test (Conant and Wagner 2003) to test 3 null models representing symmetry between duplicates with respect to synonymous divergence (dS1 = dS2), nonsynonymous divergence (dN1 = dN2), and strength of selective constraint (ω1 = ω2). For each of these tests, we compared the likelihoods of the alternative models of constrained and unconstrained evolution. When twice the difference in log likelihoods exceeded 3.84 (χ² test, P ≤ 0.05) the null model of symmetric divergence was rejected and duplicate gene divergence was classed as asymmetric. Otherwise, the divergence of the duplicates was designated symmetric for that measure. The purpose of this analysis was to calculate the relative prevalence of asymmetry between different types of duplicate and not to determine whether sequence divergence was significantly asymmetric for an individual pair of duplicates. Therefore, we did not perform a multiple testing correction. Because this approach relies on the likelihood ratio test to assign duplicates as either symmetric or asymmetric but does not quantify the magnitude of sequence asymmetry, we did not impose the filter used in the previous section that excludes duplicates with branch-specific values of dS and dN < 0.001.
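The decision rule can be illustrated with a short sketch. It assumes 1 degree of freedom (each null model imposes a single equality constraint), which is consistent with the 3.84 threshold. The function names are hypothetical and the log likelihoods would come from a program such as like-tri-test.

```python
import math


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Twice the log-likelihood difference between alternative and null models."""
    return 2.0 * (lnl_alt - lnl_null)


def lrt_p_value(lnl_null: float, lnl_alt: float) -> float:
    """P value from a chi-square distribution with 1 degree of freedom."""
    stat = max(lrt_statistic(lnl_null, lnl_alt), 0.0)
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def is_asymmetric(lnl_null: float, lnl_alt: float, critical: float = 3.84) -> bool:
    """Reject the symmetric null model when the statistic exceeds the threshold."""
    return lrt_statistic(lnl_null, lnl_alt) > critical
```

For example, log likelihoods of −1000.0 (symmetric) and −997.0 (unconstrained) give a statistic of 6.0, so that pair would be classed as asymmetric for the measure being tested.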

Gene Expression Information
For each recent duplicate in mouse, we looked for evidence of expression by using the predicted transcript as a MegaBlast (Zhang et al. 2000) query to mouse expressed sequence tags (ESTs) and cDNAs. We did not study gene expression in rat duplicates because this species has much lower overall EST coverage than mouse. Starting with all hits with E < 1 × 10^−20, any ESTs with >75% of their sequence aligned with >97% nucleotide identity to only one of the 2 duplicates were assigned to that duplicate. Any ESTs matching both duplicates by these criteria were aligned to them using ClustalW. We then considered diagnostic sites at which the EST sequence shares an identical base with only one of the 2 duplicates and where all 3 sequences are well aligned (i.e., no gap occurs within 2 nt). Only if all diagnostic sites group the EST with the same duplicate did we assign the EST to that gene.
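The diagnostic-site rule for ESTs that match both duplicates can be sketched as below. The sketch assumes the 3 sequences are already aligned to equal length with `-` as the gap character. The function name and the exact handling of the 2-nt gap window are assumptions.

```python
from typing import Optional


def assign_est(est: str, dup1: str, dup2: str,
               gap: str = "-", flank: int = 2) -> Optional[int]:
    """Return 1 or 2 if every diagnostic site supports that duplicate, else None."""
    if not (len(est) == len(dup1) == len(dup2)):
        raise ValueError("sequences must be aligned to the same length")
    supported = set()
    for i, (e, a, b) in enumerate(zip(est, dup1, dup2)):
        if a == b:
            continue  # not informative: the duplicates agree at this site
        lo, hi = max(0, i - flank), min(len(est), i + flank + 1)
        # Skip sites with a gap in any sequence within `flank` nt.
        if gap in est[lo:hi] + dup1[lo:hi] + dup2[lo:hi]:
            continue
        if e == a:
            supported.add(1)
        elif e == b:
            supported.add(2)
    # Assign only when at least one diagnostic site exists and all agree.
    return supported.pop() if len(supported) == 1 else None
```

An EST with conflicting diagnostic sites, or with none at all, is left unassigned. That is why only a minority of very similar duplicate pairs end up with independent expression evidence for both copies.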

We assigned ESTs to tissues using the TissueInfo database (Skrabanek and Campagne 2001), discarding ESTs from cancerous sources and keeping only those from normal unpooled tissues. For each tissue, we quantified a gene's expression frequency using the count of its ESTs from that tissue expressed as a fraction of all ESTs derived from that tissue. We then used the highest expression frequency for a gene among all tissues to represent its peak expression. We quantified the asymmetry in peak expression between a given pair of duplicates using the absolute (unsigned) normalized difference in peak expression, RP = |P1 − P2| / (P1 + P2), where P1 and P2 are the peak expression levels of each duplicate. For unlinked duplicates for which the direction of (retro)transposition could be determined, we also quantified the direction of change in expression using the signed normalized difference in expression peak, SRP = (Pr − Ps) / (Pr + Ps), where Ps and Pr are the peak expression levels of the static and relocated duplicate, respectively. Similarly, we defined expression breadth (B) as the number of distinct tissues represented among the ESTs assigned to the gene. Because retrogenes are sparsely sampled with ESTs (see Results), we could not reliably quantify the expression breadth of individual retrogenes. Thus, if a retrogene is expressed ubiquitously but at a very low level its expression may appear tissue specific purely as a consequence of low EST coverage. Although this prevented us from estimating asymmetry in expression breadth for individual pairs of retroduplicates (analogous to the measures of asymmetry in expression peak, RP and SRP), we were able to test whether the expression breadth of retrogenes as a group is significantly different to that of their static progenitor paralogs (see Results).
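The expression measures can be sketched from per-EST tissue labels. The data layout is an assumption: a list of tissue names (one per EST assigned to the gene) and a dictionary of total EST counts per tissue. The function names and the normalized-difference form of the asymmetry are likewise assumptions.

```python
from collections import Counter
from typing import Dict, List


def peak_expression(est_tissues: List[str], tissue_totals: Dict[str, int]) -> float:
    """Highest per-tissue expression frequency (gene ESTs / all ESTs in tissue)."""
    counts = Counter(est_tissues)
    return max((n / tissue_totals[t] for t, n in counts.items()), default=0.0)


def expression_breadth(est_tissues: List[str]) -> int:
    """Number of distinct tissues represented among the gene's ESTs."""
    return len(set(est_tissues))


def peak_asymmetry(p1: float, p2: float) -> float:
    """Absolute normalized difference in peak expression (assumed form)."""
    return abs(p1 - p2) / (p1 + p2)


# Hypothetical example: 2 liver ESTs of 1000 and 1 brain EST of 100.
tissues = ["liver", "liver", "brain"]
totals = {"liver": 1000, "brain": 100}
print(peak_expression(tissues, totals), expression_breadth(tissues))
```

In this toy case the peak comes from brain even though more ESTs were sampled from liver, because each count is scaled by the total number of ESTs sequenced from that tissue.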


    Results
We measured the magnitude of asymmetric sequence divergence among a set of 147 pairs of recent rodent duplicates (postdating the rat–mouse divergence) that show at least minimal sequence divergence (branch-specific dS > 0.001 and dN > 0.001). We classified these pairs as either local (with <5 intervening genes, n = 62 pairs) or distant duplicates (n = 85 pairs). Where possible, we categorized the mechanism of gene duplication as either DNA-based duplication (62 pairs) or retrotransposition (36 pairs). For many (n = 54) of the distant pairs, we were able to identify which gene copy was at the ancestral location and which was at a novel (transposed) location by comparison to the other rodent species (fig. 1A). In addition, we used a likelihood approach to investigate the prevalence of significant sequence asymmetry in a larger set of 81 local and 200 distant duplicate pairs (without the requirement dS > 0.001 and dN > 0.001, see Methods). This set included 91 DNA-based duplications and 110 retroduplications.

Asymmetry in dN is Greater among Relocated Duplicates and Duplicates Created by Retrotransposition
We first examined whether gene relocation and duplication mechanism are each individually related to asymmetric sequence divergence. Both variables are significantly associated with asymmetry in nonsynonymous evolution (RN). Distant duplicates show a more than 2-fold increase in RN compared with local duplicates (table 1). Duplication by retrotransposition is associated with a similarly large increase in RN relative to duplication by DNA-based transposition. Thus, on average, duplication by retrotransposition precipitates a more than 4-fold difference in rate between duplicates (median RN = 0.619). In contrast, for DNA-based duplicates there is a less than 2-fold difference in rates (median RN = 0.316). To determine whether the asymmetry of dN reflects imbalanced selective constraint between duplicates, we considered the relative asymmetry in ω. This measure (Rω) is similarly increased among distant duplicates and among duplicates created by retrotransposition (table 1).


Table 1. Magnitude of Relative Asymmetry in dS (RS), dN (RN), and ω (Rω) between Diverged (dS > 0.001 and dN > 0.001) Rodent Duplicates Categorized by Duplicate Location and Duplication Mechanism

 
Notably, we found no similar association between asymmetry in synonymous divergence (RS) and either duplicate relocation or duplication mechanism (table 1). Therefore, the increase in RN associated with relocation and retrotransposition cannot be explained as resulting from mutational differences between duplicates. Moreover, this result suggests that if some of the gene duplicates have been created by transposition between different isochores, equilibration of silent sites in the transposed duplicate to local GC content has not led to a general increase in synonymous asymmetry.

In mammals, pairwise dS provides an approximate measure of divergence time. We noticed that distant duplicates tend to be younger than tandem pairs created by local duplication (table 1). This age difference alone might underlie the result above if a high degree of nonsynonymous asymmetry is a characteristic of the initial stages of duplicate gene differentiation. In order to exclude this possibility, we used a subset of the oldest relocated duplicates whose median age matched that of the set of local duplicates (median pairwise dS = 0.09). Comparing these age-matched categories confirms our initial observation of increased asymmetry among distant compared with local duplicates (table 1).

The impact of relocation on significantly asymmetric divergence as determined by the likelihood ratio test (table 2) confirms our observations based on normalized difference measures of the magnitude of sequence asymmetry. Relocated duplicates more frequently show significant asymmetry in nonsynonymous divergence and selective constraint than local duplicates, but relocation is not associated with more frequent significant synonymous asymmetry.


Table 2. Likelihood Ratio Test: Prevalence of Statistically Significant Asymmetry in dS, dN, and ω between All Rodent Duplicates (without the requirement dS > 0.001 and dN > 0.001) Categorized by Duplicate Location and Duplication Mechanism

 
Similarly, relating the mechanism of duplication to the occurrence of statistically significant asymmetry broadly supports the results from normalized difference measures. Retrotransposition leads more frequently to significant asymmetry in nonsynonymous divergence and selective constraint than does DNA-based duplication (although only for selective constraint is the difference significant; table 2). Interestingly, we found a weak tendency for significant asymmetry in synonymous divergence to occur more often following DNA-based duplication than following retrotransposition (P > 0.05).

Separating Relocation from Retrotransposition
We were able to confidently discern the mechanism of gene duplication as either DNA mediated or RNA mediated for 98 (67%; table 1) of the 147 duplicate pairs with branch-specific dS > 0.001 and dN > 0.001 (for which we could quantify RN). This classification revealed a tight association between relocation and retrotransposition: two-thirds of the distant duplicates were formed by retrotransposition whereas none of the local ones were (fig. 1B). This raises the question of whether gene duplication mechanism and genomic relocation exert independent effects on rate asymmetry between duplicates.

To test this, we partitioned the data set in figure 1B by distinguishing distant from local duplicates. This partition revealed a 138% increase in RN among distant duplicates (n = 54) compared with local duplicates (n = 44), similar to the results in the larger data set in the upper part of table 1. We then tested whether an additional partition of the data based on the mechanism of duplication could explain any further variation in rate asymmetry. We introduced this second partition by comparing local duplicates created by DNA-based transposition (n = 44) with distant duplicates created by retrotransposition (n = 36). This revealed a 145% increase in RN in the latter category. We tested whether the P value associated with this comparison (reflecting the significance of the impact of relocation on asymmetry in combination with the effect of retrotransposition) was more significant than a random partition of 36 genes derived from the set of distant duplicates (reflecting the significance of the impact of relocation alone on asymmetry). The observed P value (for the comparison "local DNA transpositions" vs. "distant retrotranspositions") was lower than the equivalent P values in 92.5% of comparable random partitions. Thus, the increase in the magnitude of rate asymmetry between duplicates owing to retrotransposition is marginally significant (P = 0.075) once relocation has been accounted for. Using the same approach, we found a significant effect of genomic relocation on rate asymmetry independent of the mechanism of gene duplication (P = 0.026).
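The resampling logic can be sketched as follows. Two points are assumptions of this sketch rather than statements from the text: the P value for each comparison is taken from a two-sided Wilcoxon rank-sum test (here a normal approximation without tie correction), and the function names are hypothetical.

```python
import math
import random
from typing import List, Sequence


def rank_sum_p(x: Sequence[float], y: Sequence[float]) -> float:
    """Two-sided rank-sum P value (normal approximation, no tie correction)."""
    combined = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    n, i, rank_sum_x = len(combined), 0, 0.0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j for tied values
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    z = (u - n1 * n2 / 2.0) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return math.erfc(abs(z) / math.sqrt(2.0))


def resampled_p(local: List[float], distant_retro: List[float],
                distant_all: List[float], n_iter: int = 1000, seed: int = 0) -> float:
    """Fraction of random distant subsets giving a P value <= the observed one."""
    rng = random.Random(seed)
    observed = rank_sum_p(local, distant_retro)
    k = len(distant_retro)
    hits = sum(rank_sum_p(local, rng.sample(distant_all, k)) <= observed
               for _ in range(n_iter))
    return hits / n_iter
```

With `local` holding the RN values of the 44 local DNA-based pairs, `distant_retro` the 36 distant retroduplicates, and `distant_all` all 54 distant pairs, a returned value near 0.075 would correspond to the marginal significance reported above.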

Directional Sequence Asymmetry: Retrogenes Accelerate Relative to Their Paralogs
The above results show that the magnitude of rate asymmetry between duplicated genes is strongly affected by the distance of duplication (distant vs. local) and only marginally affected by the mechanism (RNA-based vs. DNA-based duplication). We hypothesized that the duplication mechanism should, however, affect the direction of the asymmetry: for retrotransposed genes, we would expect that the decoupling of the gene from its original promoter would cause altered (probably lower and narrower) expression and make the relocated copy (the retrocopy) more likely to accelerate than its static paralog. In contrast, for DNA-based duplication we might not expect sequence acceleration to be consistently associated with either the static or the relocated paralog. To investigate this hypothesis, we introduced signed measures of sequence asymmetry that take account of the direction of transposition in distant duplicates.

For DNA-mediated duplicates, we found no consistent tendency for either copy to accelerate; the median value of SRN = –0.002 for these duplicates was not significantly different from zero (Wilcoxon signed-rank test: P = 0.96, n = 12). In contrast, following duplication by retrotransposition there is a highly significant tendency for the relocated retrogene to accelerate relative to its static paralog (median SRN = 0.52, Wilcoxon signed-rank test: P = 0.001, n = 36) (fig. 2). Relocated retrogenes also exhibited relaxed selective constraint relative to their static paralogs (median SRω = 0.43; Wilcoxon signed-rank test: P = 0.004, n = 36), whereas transposed DNA-based duplicates showed no such tendency (median SRω = 0.19; Wilcoxon signed-rank test: P = 0.57, n = 12).


FIG. 2. Signed nonsynonymous sequence asymmetry (SRN) among distant duplicates for which the direction of transposition is known. Duplicates were categorized as created by DNA-based transposition (n = 12) or RNA-based retrotransposition (retroduplicates, n = 36). A subset of retroduplicates under selective constraint ("constrained retroduplicates" with pairwise ω < 0.5, n = 16) is enriched for putatively functional retrogenes and is likely to be depleted of retropseudogenes. The left and right edges of each box depict the first and third quartiles, respectively, and the vertical line within each box corresponds to the median. The left and right whiskers extend to the most extreme data point within 1.5 times the interquartile range of the first and third quartiles, respectively. The height of each box is proportional to the square root of each sample size.

 
One possible artifact that might affect the preceding result is the accidental inclusion of retropseudogenes, which would show strong and directional acceleration of sequence divergence due to their loss of selective constraint. Although all of the retrogenes in our data set have intact open reading frames, this is not in itself unequivocal evidence that they are functional (Marques et al. 2005). Therefore, we attempted to enrich our data set for duplicate pairs subject to relatively strong purifying selection, reasoning that these are likely to be functional. We made use of the fact that in a pairwise comparison of putatively protein-coding sequences, ω < 0.5 indicates that selective constraint is operating on both sequences (Emerson et al. 2004). After application of this conservative filter to the set of 36 retrotranspositions with directional information, a total of 16 pairs of duplicates remained. Accelerated evolution of retrogenes was still seen in this subset of the data, as revealed by values of median SRN (0.51, Wilcoxon signed-rank test: P = 0.009, n = 16) and median SRω (0.50, Wilcoxon signed-rank test: P = 0.03, n = 16) that are similar to those in the unfiltered data set (fig. 2).

Greater Expression Asymmetry among Distant Duplicates Due to Lowering and Narrowing of Retrogene Expression
Both breadth of tissue distribution and peak gene expression level are known predictors of a gene's evolutionary rate (Duret and Mouchiroud 2000; Pal et al. 2001; Subramanian and Kumar 2004; Zhang and Li 2004; Drummond et al. 2006). We therefore tested whether the increased rate asymmetry of distant gene duplicates reflects changes in expression associated with genomic relocation. For each pair of gene duplicates, we used a stringent approach to assign a given EST or cDNA uniquely to a single paralog in each pair (see Methods). The low level of sequence divergence between duplicates meant that only a minority of duplicate pairs had independent expression evidence for both members of the pair.

Compared with local duplicates (N = 25), distant duplicates (N = 29) showed a marginally significant increase in asymmetry in peak expression (median RP: local duplicates = 0.55, distant duplicates = 0.73, and Wilcoxon rank-sum test P = 0.066). Thus, distant gene pairs are more asymmetric in both sequence divergence and peak expression than local gene pairs. A large part of this disparity in expression asymmetry may be a consequence of the overrepresentation of retrogenes among distant duplicates. The hypothesis outlined earlier predicts that retrogenes should show lower and narrower expression relative to their static progenitor paralogs. We studied this using a signed measure of peak expression asymmetry (SRP, see Methods). Among 18 retrotranspositions SRP is significantly less than zero (median SRP = –0.69, Wilcoxon signed-rank test: P = 0.037). Because of the sparse EST sampling of retrogenes, we were unable to derive a measure of asymmetry in expression breadth analogous to SRP for individual duplicate pairs created by retrotransposition. However, we found that retrogenes, collectively, show significantly narrower expression breadth compared with their static paralogs (data not shown).

We found a pronounced negative correlation between the signed measure of asymmetry in nonsynonymous rate (SRN) and in peak expression (SRP) (r² = 0.29, P = 0.02, n = 18). Thus nearly 30% of the rate acceleration of retrotransposed duplicates is explained by the decrease in their peak expression. Because expression peak and breadth are strongly correlated (Subramanian and Kumar 2004), we expect part of this association to be mediated by narrowing of their expression breadth. However, for the reason outlined above, we could not estimate the magnitude of this effect and therefore could not exclude the possibility that asymmetry in expression breadth (rather than asymmetry in peak) is the more important determinant of rate asymmetry.
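
The step from the reported coefficient of determination to "nearly 30% explained" can be checked with a short sketch. It assumes a Pearson correlation, which the text does not specify; `pearson_r` is a hypothetical helper and the paired values are invented. Only the value 0.29 comes from the text.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Reported value: the squared correlation is 0.29 and the correlation is negative.
reported_r2 = 0.29
implied_r = -math.sqrt(reported_r2)
percent_explained = 100 * reported_r2

# Invented signed asymmetries: lower peak expression paired with faster evolution.
peak_asymmetry = [-0.9, -0.7, -0.4, -0.1, 0.2, 0.5]
rate_asymmetry = [0.8, 0.3, 0.6, 0.1, 0.4, -0.2]
r = pearson_r(peak_asymmetry, rate_asymmetry)
print(round(implied_r, 2), round(percent_explained), round(r ** 2, 2))
```

Under this assumption, an r² of 0.29 corresponds to a correlation of about -0.54, which leaves roughly 70% of the variance in rate asymmetry unaccounted for by peak expression asymmetry.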


    Discussion
 
In this study, we tested the validity of the conventional view that gene duplication gives rise to redundant and functionally interchangeable paralogs (Ohno 1970). Our results demonstrate that the uncoupling of the fates of duplicated gene pairs is facilitated by both their physical relocation and the alteration of gene structure and regulation that follows retrotransposition. Moreover, because more than 60% of rodent duplicates with significantly asymmetric dN are generated by retrotransposition (table 2), this implies that previous reports of frequent sequence asymmetry might be inflated as a result of simple violation of the dogma of "equality at birth" of duplicated genes (Conant and Wagner 2003) rather than representing the functional divergence of sister duplicates that are identical twins. Furthermore, the fact that roughly one-third of recent rodent duplicates are retrotranspositions suggests that the assumption of equality is frequently violated.

Although it is difficult to disentangle the effects of retrotransposition from those of relocation, our results indicate that gene relocation (by any mechanism) has a strong impact on the asymmetry of protein evolutionary rates. This is consistent with the observation that the nonsynonymous evolution of linked genes occurs at similar rates (Williams and Hurst 2000). Genes relocated by retrotransposition show only marginally more rate asymmetry than those relocated by DNA-mediated duplication, but the retrogenes show changes in the expected direction (resulting in narrower expression profiles and accelerated sequence evolution), whereas DNA-mediated distant duplications do not show any consistent direction of rate asymmetry (fig. 2).

The effect of duplicate relocation can be appreciated by considering the likely effect of shared genomic context among tandemly duplicated genes. If the span of duplicated DNA includes entire promoters, then local tandem duplicates will initially share all their cis-regulatory elements (Katju and Lynch 2003). Moreover, local duplicates should share the same distal regulatory elements (e.g., locus control regions) in addition to residing in the same chromatin domain and gene neighborhood. We therefore expect this shared genomic environment to result in coregulation of local duplicates either as a consequence of selection for coexpression of functionally related genes (Lercher et al. 2002; Singer et al. 2005) or as a neutral consequence of proximity to the same regulatory elements (Spellman and Rubin 2002; Semon and Duret 2006). Conversely, because relocated DNA-based gene duplicates differ in their chromosomal environments, they are expected to show a limited degree of coregulation mediated only by their shared core promoters.

Our results can be interpreted as illustrating the impact of the "scope" of gene duplication (i.e., the degree to which duplication conserves the structure of exons, introns, and promoter regions) on the magnitude of sequence asymmetry between duplicates. This echoes the observation that, at a broader scale, asymmetry in expression is greater among small-scale compared with large-scale (whole-genome) duplicates (Casneuf et al. 2006). This is likely to reflect the fact that whole-genome duplicates exemplify the concept of equality at birth by not only preserving the structure of gene duplicates but also maintaining their neighborhood of flanking genes and regulatory sequences. However, it is worth noting that asymmetric divergence in sequence and expression is not exclusive to imperfect small-scale gene duplications but has also been observed among gene duplicates created by polyploidisation (Adams et al. 2003).

The effect of duplication mechanism on sequence asymmetry appears to be mediated by the regulatory changes associated with the process of retrotransposition. Strikingly, we found that for retrogenes nearly 30% of the variation in rate acceleration can be explained by lowering of expression peak. It is likely that the relationship between asymmetry in rate and in peak expression is partly due to narrowing of retrogene expression, although we could not quantify the contribution of expression breadth asymmetry directly. These results are consistent with accumulating evidence that the level and breadth of gene expression are among the most important determinants of the rate of protein evolution (Duret and Mouchiroud 2000; Drummond et al. 2006) and echo a similar correlation between sequence asymmetry and expression divergence observed in a genome-wide analysis of yeast duplicates (Kim and Yi 2006). The altered expression of a retrogene compared with its paralog may make it a more permissive target for the fixation of mutations because its lower (and narrower) expression renders some of these mutations less deleterious.

The difficulty in deriving unique expression evidence for each duplicate in combination with the variability in EST coverage between different tissues precludes a detailed in silico investigation of duplicate expression profiles. However, we found a general trend for retrogenes collectively to be expressed in a more limited range of tissues than their progenitor paralogs, in broad agreement with previous observations (Marques et al. 2005). Thus, relocated retrogenes show a reduction in both expression level and breadth compared with their static paralogs. These results are consistent with a loss of complex gene regulation accompanying retrotransposition. The survival of a functional retroduplicate is likely to be contingent on its expression. This can happen by acquiring a promoter de novo, coopting a neighboring gene's promoter, or through fortuitous integration into a transcribed region. Moreover, because genes giving rise to retrocopies may tend to have exceptionally broad expression (Goncalves et al. 2000) this will magnify the disparity in breadth. It has been suggested that the expression pattern of retrogenes broadens as they evolve more complex regulatory sequences (Vinckenbosch et al. 2006). If this is correct, we would expect a gradual equalization of nonsynonymous rates between duplicates as the retrogene's expression profile broadens.

It has recently been suggested that following duplicative transposition in bacteria roughly one-third of cases show "inconsistent" acceleration of the static paralog relative to the transposed copy (Notebaart et al. 2005). We note that the observation of Notebaart et al. is not based on an assessment of statistically significant asymmetry. However, our results based on the likelihood ratio test for statistically significant asymmetry in rodent duplicates show some support for the proposal that rate acceleration is not always consistent with relocation by transposition. Among 11 rodent DNA-based duplicate pairs with significant rate asymmetry (and with an assigned direction of transposition), 3 pairs (27%) showed inconsistent acceleration of the static paralog. A weaker trend is seen following retrotransposition: among 51 retroduplicate pairs with significant rate asymmetry, 7 pairs (14%) showed inconsistent acceleration of the static paralog. Cases of significant acceleration of the static paralog following retrotransposition may reflect rare cases of functional displacement by a retrogene of its static paralog (Krasnov et al. 2005; Marques et al. 2005).
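
The percentages quoted in this paragraph follow directly from the stated counts, as the short check below shows. The counts are taken from the text; `percent` is a hypothetical helper name, and the complementary "consistent" counts are simple subtraction rather than figures reported by the authors.

```python
def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to the nearest integer."""
    return round(100 * part / whole)


# Counts of significantly asymmetric pairs with an assigned direction (from the text).
dna_total, dna_inconsistent = 11, 3
retro_total, retro_inconsistent = 51, 7

dna_percent = percent(dna_inconsistent, dna_total)
retro_percent = percent(retro_inconsistent, retro_total)

# Complementary counts: pairs in which the relocated copy is the accelerated one.
dna_consistent = dna_total - dna_inconsistent
retro_consistent = retro_total - retro_inconsistent

print(dna_percent, retro_percent, dna_consistent, retro_consistent)
```

With only 11 DNA-based pairs, a shift of a single pair changes the DNA-based percentage by about 9 points, so the difference between the two classes should be read as a trend.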

Natural selection can seize the opportunity for evolutionary exploration afforded by gene duplication only if the business of maintaining the ancestral gene function is assumed by one of the duplicates. Our results suggest that this division of labor is conservative: the daughter that inherits most of the ancestral gene features (exon–intron structure, regulatory elements, and chromosomal neighborhood) is likely to take on the parental role by default, whereas the positional and structural modification of its prodigal twin (in particular by retrotransposition) qualify it to take on the mantle of evolutionary entrepreneur.


    Acknowledgements
 
We thank Marie Semon for assistance with the Homolens data set and Meg Woolfit for critical reading of the manuscript. This study was supported by Science Foundation Ireland.


    Footnotes
 
Michael Nachman, Associate Editor


    References
 

    Adams KL, Cronn R, Percifield R, Wendel JF. (2003) Genes duplicated by polyploidy show unequal contributions to the transcriptome and organ-specific reciprocal silencing. Proc Natl Acad Sci USA 100:4649–4654.

    Boer P, Adra C, Lau Y, McBurney M. (1987) The testis-specific phosphoglycerate kinase gene pgk-2 is a recruited retroposon. Mol Cell Biol 7:3107–3112.

    Casneuf T, De Bodt S, Raes J, Maere S, Van de Peer Y. (2006) Nonrandom divergence of gene expression following gene and genome duplications in the flowering plant Arabidopsis thaliana. Genome Biol 7:R13.

    Conant GC and Wagner A. (2003) Asymmetric sequence divergence of duplicate genes. Genome Res 13:2052–2058.

    Drummond DA, Raval A, Wilke CO. (2006) A single determinant dominates the rate of yeast protein evolution. Mol Biol Evol 23:327–337.

    Dufayard JF, Duret L, Penel S, Gouy M, Rechenmann F, Perriere G. (2005) Tree pattern matching in phylogenetic trees: automatic search for orthologs or paralogs in homologous gene sequence databases. Bioinformatics 21:2596–2603.

    Duret L and Mouchiroud D. (2000) Determinants of substitution rates in mammalian genes: expression pattern affects selection intensity but not mutation rate. Mol Biol Evol 17:68–74.

    Emerson JJ, Kaessmann H, Betran E, Long M. (2004) Extensive gene traffic on the mammalian X chromosome. Science 303:537–540.

    Force A, Lynch M, Pickett FB, Amores A, Yan YL, Postlethwait J. (1999) Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151:1531–1545.

    Gayral P, Caminade P, Boursot P, Galtier N. (2006) The evolutionary fate of recently duplicated retrogenes in mice. J Evol Biol doi: 10.1111/j.1420-9101.2006.01245.x.

    Goncalves I, Duret L, Mouchiroud D. (2000) Nature and structure of human genes that generate retropseudogenes. Genome Res 10:672–678.

    Katju V and Lynch M. (2003) The structure and early evolution of recently arisen gene duplicates in the Caenorhabditis elegans genome. Genetics 165:1793–1803.

    Katju V and Lynch M. (2006) On the formation of novel genes by duplication in the Caenorhabditis elegans genome. Mol Biol Evol 23:1056–1067.

    Kellis M, Birren BW, Lander ES. (2004) Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae. Nature 428:617–624.

    Kim SH and Yi SV. (2006) Correlated asymmetry of sequence and functional divergence between duplicate proteins of Saccharomyces cerevisiae. Mol Biol Evol 23:1068–1075.

    Kondrashov FA, Rogozin IB, Wolf YI, Koonin EV. (2002) Selection in the evolution of gene duplications. Genome Biol 3: RESEARCH0008.

    Krasnov AN, Kurshakova MM, Ramensky VE, Mardanov PV, Nabirochkina EN, Georgieva SG. (2005) A retrocopy of a gene can functionally displace the source gene in evolution. Nucleic Acids Res 33:6654–6661.

    Lercher MJ, Urrutia AO, Hurst LD. (2002) Clustering of housekeeping genes provides a unified model of gene order in the human genome. Nat Genet 31:180–183.

    Lynch M and Katju V. (2004) The altered evolutionary trajectories of gene duplicates. Trends Genet 20:544–549.

    Marques AC, Dupanloup I, Vinckenbosch N, Reymond A, Kaessmann H. (2005) Emergence of young human genes after a burst of retroposition in primates. PLoS Biol 3:e357.

    Notebaart RA, Huynen MA, Teusink B, Siezen RJ, Snel B. (2005) Correlation between sequence conservation and the genomic context after gene duplication. Nucleic Acids Res 33:6164–6171.

    Ohno S. (1970) Evolution by gene duplication (Springer Verlag, New York).

    Pal C, Papp B, Hurst LD. (2001) Highly expressed genes in yeast evolve slowly. Genetics 158:927–931.

    Semon M and Duret L. (2006) Evolutionary origin and maintenance of coexpressed gene clusters in mammals. Mol Biol Evol 23:1715–1723.

    Singer GA, Lloyd AT, Huminiecki LB, Wolfe KH. (2005) Clusters of co-expressed genes in mammalian genomes are conserved by natural selection. Mol Biol Evol 22:767–775.

    Skrabanek L and Campagne F. (2001) TissueInfo: high-throughput identification of tissue expression profiles and specificity. Nucleic Acids Res 29:e102.

    Soares MB, Schon E, Henderson A, Karathanasis SK, Cate R, Zeitlin S, Chirgwin J, Efstratiadis A. (1985) RNA-mediated gene duplication: the rat preproinsulin I gene is a functional retroposon. Mol Cell Biol 5:2090–2103.

    Spellman P and Rubin G. (2002) Evidence for large domains of similarly expressed genes in the Drosophila genome. J Biol 1:5.

    Subramanian S and Kumar S. (2004) Gene expression intensity shapes evolutionary rates of the proteins encoded by the vertebrate genome. Genetics 168:373–381.

    Thompson JD, Higgins DG, Gibson TJ. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22:4673–4680.

    Vinckenbosch N, Dupanloup I, Kaessmann H. (2006) Evolutionary fate of retroposed gene copies in the human genome. Proc Natl Acad Sci USA 103:3220–3225.

    Williams EJ and Hurst LD. (2000) The proteins of linked genes evolve at similar rates. Nature 407:900–903.

    Zhang P, Gu Z, Li WH. (2003) Different evolutionary patterns between young duplicate genes in the human genome. Genome Biol 4:R56.

    Zhang L and Li WH. (2004) Mammalian housekeeping genes evolve more slowly than tissue-specific genes. Mol Biol Evol 21:236–239.

    Zhang Z, Schwartz S, Wagner L, Miller W. (2000) A greedy algorithm for aligning DNA sequences. J Comput Biol 7:203–214.

Accepted for publication December 5, 2006.

