MBE Advance Access originally published online on November 12, 2007
Molecular Biology and Evolution 2008 25(1):101-110; doi:10.1093/molbev/msm247
Research Articles |
Cis and Trans Regulatory Effects Contribute to Natural Variation in Transcriptome of Drosophila melanogaster
* Section of Evolution and Ecology, University of California, Davis
Department of Molecular Genetics and Microbiology, University of Florida
University of Florida Genetics Institute, University of Florida
Department of Zoology, University of Florida
E-mail: amgenissel{at}ucdavis.edu.
| Abstract |
|---|
The dissection of intraspecific variation in transcriptome is a central theme of many recent quantitative genomic analyses. Transcript level variation has been attributed to factors at the gene itself (cis) and elsewhere in the genome (trans). Previous analyses of Drosophila intraspecific transcriptome variation pointed toward a larger contribution of trans factors. However, data from other genera, and from interspecific comparisons within Drosophila, are more consistent with a major role for cis factors. We investigated the relative amount of cis and trans variation in Drosophila melanogaster, using whole-genome expression from an oligonucleotide microarray in the 2 extensively studied genotypes Ore and 2b3, and 6 recombinant inbred (RI) lines derived from these parents. We examined 2 types of models to decompose cis and trans contributions to genetic variation in transcript level: 1) an infinitesimal model assuming that the transcription variation is highly polygenic and due to many small effects and 2) contrast models assuming that a few large effects contribute to the transcriptional variation. We explicitly fitted cis-by-trans interactions and extended our analyses to consider regulation of alternatively spliced transcripts. We estimated that approximately 10% of the transcriptome was differentially regulated among the lines. We were able to identify cis and trans effects that contribute to this differential regulation for 1,340 genes. Our analyses revealed numerous cis effects (90%) but far fewer trans effects, perhaps due to reduced power of detection for trans effects. In addition, we identified 15 genes that have alternative splice variants differentially regulated in cis.
Key Words: Drosophila cis trans genomics transcription
| Introduction |
|---|
There is abundant genetic variation in transcript level in nearly all organisms studied (Gibson and Weir 2005
Intraspecific transcriptome variation has been thoroughly studied in many organisms to determine the relative importance of cis and trans regulation. Researchers have mapped expression Quantitative Trait Loci (eQTL) in yeast (Brem et al. 2002; Yvert et al. 2003), mouse (Doss et al. 2005), maize (Schadt et al. 2003), and humans (Morley et al. 2004). In yeast, the proportion of genes with variation in cis was found to be 32% in Brem et al. (2002) and 25% in Yvert et al. (2003); for mouse, 71% (Schadt et al. 2003); for maize, 80% (Schadt et al. 2003); for humans, 19% (Morley et al. 2004); and for Arabidopsis, 7% (Kiekens et al. 2006). Another approach, based on the relative expression of each of the 2 alleles at a locus in heterozygous individuals, showed differences in allele-specific expression in mice (Cowles et al. 2002) and humans (Yan et al. 2002; Bray et al. 2003; Lo et al. 2003; Pastinen et al. 2004). These experiments have found that 20–55% of the genes surveyed show differential, allele-specific expression. Because transcript levels were assayed in a common cellular environment, such that trans effects were assumed to be constant, these differences in the abundance of allele-specific transcripts were interpreted as being due to cis-regulatory changes.
Knowledge of cis–trans decomposition is crucial because it may influence the primary focus of research on the genetic variation of complex traits of evolutionary, medical, and agricultural value. In the field of medical science, differential expression of candidate genes has been observed between healthy and diseased individuals (Cheung and Spielman 2002). If cis effects are the most important sources of heritable transcriptional variation, then cis-regulatory sequences at the candidate genes should be the primary focus to search for causal genetic variants. On the other hand, if trans effects contribute more to the phenotypic differences, association studies in the context of transcriptional networks should become the priority (Yoder and Carroll 2006).
Cis regulation accounts for most of the observed divergence between D. melanogaster and Drosophila simulans (Wittkopp et al. 2004). Interspecific expression differences were not caused by trans-regulatory variants with widespread effects but rather by many cis-acting variants throughout the genome (Davis et al. 2004; Landry et al. 2005). The "case studies" of single genes in a developmental context also suggest that cis divergence translates into morphological differences (Sucena and Stern 2000; Gibert and Simpson 2003; Romano and Wray 2003; Gompel et al. 2005). More recently, a comparison of chromosomal substitution strains from the cosmopolitan M and Zimbabwe races of D. melanogaster attributed the predominantly intrachromosomal effects on transcription to cis variation (Osada et al. 2006).
The genetics of intraspecific transcriptome variation in Drosophila is less well studied. Wayne et al. (2004) estimated cis and trans contributions using a round-robin cross between isogenic lines to generate F1 heterozygous males of D. simulans. Results showed that both cis and trans factors contribute to the intraspecific transcript level variation, but the additive genetic variance was 6.8% on the X chromosome versus 8.7% on the autosomes, suggesting that most variation is due to trans effects. Hughes et al. (2006) analyzed the effects of wild third chromosome substitutions on transcript level variation of genes on both the third chromosome itself and on the other chromosomes of the lines Ore and 2b3 and found that cis effects were dominant. Harbison et al. (2005) found "more trans effects" than cis effects with mapping approaches. Brown and Feder (2005) sequenced the proximal regulatory regions of genes whose transcript levels were compared between these lines and their F1 reciprocal progeny by Gibson et al. (2004) and reached the conclusion that if transcription variation is due to cis, regulatory regions must be much more extensive than the regions sequenced (ca. 1 kb); or, more parsimoniously, intraspecific transcriptome variation is typically due to trans factors.
We undertook a cis and trans decomposition of transcription in D. melanogaster using a quantitative genetic framework. In order to detect cis and trans segregating effects, whole-genome microarray data of the stocks Ore and 2b3 and 6 recombinant inbred (RI) lines derived from these 2 parents were associated with the roo TE insertion map detailed in Nuzhdin et al. (1997). We analyzed the data in 2 ways: 1) assuming that the transcriptional regulation in trans is highly polygenic and epistatic, we used an infinitesimal model that can detect many small random effects throughout the genome; and 2) assuming that a few trans factors contribute to large phenotypic differences, we performed a contrast analysis between genotypes.
| Materials and Methods |
|---|
Drosophila Strains
Experiments were conducted on the strains Oregon R (Lindsley and Zimm 1992
RNA Sample Preparation
Flies were maintained at 25 °C with a 12:12 h light/dark cycle. To minimize random environmental effects, each of the 8 lines was grown in 4 separate replicates of small cohorts (10 females and 10 males), with adults removed after 3 days. Twenty virgin males and females were collected within 24 h from each replicate, transferred separately to fresh vials, maintained for 3 days in the same conditions, and snap-frozen in liquid nitrogen for total RNA extraction. We extracted each RNA sample from 20 whole adult flies using Trizol reagent (Invitrogen, Carlsbad, CA) according to the manufacturer's instructions. RNA was purified using the RNEasy Kit (Qiagen, Valencia, CA). RNA concentration was determined using a NanoDrop spectrophotometer (NanoDrop Technologies, Wilmington, DE), and sample quality was examined using the Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA). A total of 500 ng of RNA from each sample was used for the microarray experiment. Synthesis and purification of cDNA labeled with Cyanine 3 and Cyanine 5 followed the standard protocol described in the Agilent Fluorescent direct labeling kit (http://www.chem.agilent.com/).
Roo Element Map and Genotype Assignment of Transcripts on the Array
The markers used were 92 polymorphic roo TE inserts with an average spacing of 3.2 cM (Nuzhdin et al. 1997), exhibiting fixed differences between the parental lines. We attributed genotypes at every transcribed gene in every RI line based upon the presence/absence of the nearby roo elements. When a gene was located at a cytological position flanked by 2 roo markers, we assigned a genotype only if the flanking markers were inherited from the same parent.
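The flanking-marker rule described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `assign_genotype`, the marker positions, and the choice to leave genes without two flanking markers unassigned are all assumptions made for the example.

```python
from bisect import bisect_left

def assign_genotype(gene_pos, marker_pos, marker_geno):
    """Return the parental genotype at gene_pos, or None if it is ambiguous.

    marker_pos  -- sorted marker positions on one chromosome (hypothetical units)
    marker_geno -- parental origin of each marker in one RI line ("Ore" or "2b3")
    """
    i = bisect_left(marker_pos, gene_pos)
    if i < len(marker_pos) and marker_pos[i] == gene_pos:
        return marker_geno[i]                 # gene coincides with a marker
    if i == 0 or i == len(marker_pos):
        return None                           # assumption: unflanked genes stay unassigned
    left, right = marker_geno[i - 1], marker_geno[i]
    return left if left == right else None    # assign only if both flanking markers agree

# Hypothetical markers for one RI line
positions = [10.0, 13.5, 17.0, 20.2]
genotypes = ["Ore", "Ore", "2b3", "2b3"]
print(assign_genotype(12.0, positions, genotypes))  # Ore
print(assign_genotype(15.0, positions, genotypes))  # None (flanking markers disagree)
```

A gene lying between markers of different parental origin is left unassigned because a recombination breakpoint falls somewhere in that interval.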
Microarray Design and Synthesis
The design of the microarray is reported by McIntyre et al. (2006) and is available at http://www.genomics.purdue.edu/services/droschip. It contains 22,574 probes, each 60 nt long: 12,994 homologous to single transcript genes, 7,207 probes for 2,768 alternatively transcribed genes, and 566 probes for 177 gene families (supplementary table S1, Supplementary Material online). Gene families are sets of genes for which transcripts with multiple CG numbers have been identified. We grouped both alternatively transcribed genes and gene families into a multiple transcript category for analyses. Oligonucleotides were synthesized on an Agilent Technologies platform (AMADID 012798, http://www.agilent.com). Because there are multiple probes on the chip for single CGs, we need to distinguish between analyses performed for individual probes and those performed for multiple probes at the level of CG or gene family. Therefore, we performed 2 levels of analysis, one for single probes and one for clusters of transcripts.
Microarray hybridizations were performed at the Interdisciplinary Center for Biotechnology Research Microarray Core, University of Florida. Four independent biological replicates for each line and sex combination were hybridized in a dye-swap design. Hybridization occurred for 17 h at 60 °C in accordance with the manufacturer's instructions, and arrays were scanned using an Agilent Microarray Scanner. There were 7 technical failures due to low signal-to-noise ratio, leaving 25 successful hybridizations (details available upon request to AG). Additionally, Agilent reported a manufacturing error that affected 2,310 spots on each chip, including 153 of the 503 negative controls. The failed chips and defective spots were removed from further consideration.
Images were analyzed using Imagene software version 6.0 at the Purdue University Genomics Database Facility. The mean intensity for each quantified spot and the corresponding mean background signal was exported into a ".csv" file. Individual files were collated for analysis at the Purdue University Genomics Database Facility. Transcript abundance was estimated as the natural log of the spot mean minus the mean of the local background. A sample was said to have hybridized to a probe if the spot intensity at the probe position was greater than the intensity of 95% of the negative controls for that slide and dye combination. Probes were considered to be detected for a particular line and sex combination if 50% or more of the replicates were detected, and probes that were not detected in at least one treatment were considered uninformative and not analyzed further. A total of 13,874 probes were detected in at least one treatment.
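The abundance estimate and the two detection rules in this paragraph can be illustrated as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, and returning `None` for spots whose background-subtracted intensity is not positive is our choice, since the text does not say how such spots were handled.

```python
import math

def abundance(spot_mean, background_mean):
    """Natural log of the background-subtracted spot mean (None if not positive)."""
    diff = spot_mean - background_mean
    return math.log(diff) if diff > 0 else None

def is_hybridized(spot_intensity, negative_controls, fraction=0.95):
    """True if the spot is brighter than at least `fraction` of the negative
    controls for the same slide and dye."""
    n_below = sum(1 for c in negative_controls if spot_intensity > c)
    return n_below >= fraction * len(negative_controls)

def is_detected(replicate_calls):
    """A probe is detected for a line-by-sex treatment if 50% or more of
    its replicates hybridized."""
    return sum(replicate_calls) >= 0.5 * len(replicate_calls)

# Hypothetical negative controls with intensities 1..100
controls = list(range(1, 101))
print(is_hybridized(96, controls))               # True: brighter than 95 of 100 controls
print(is_hybridized(95, controls))               # False: brighter than only 94
print(is_detected([True, True, False, False]))   # True: 2 of 4 replicates
```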
Genetic Variation
We localized the transcript on the genome using the cytological position given in FlyBase. For genes where both genotypes (Ore and 2b3) were present in the offspring at this cytological position, the individual probes were then analyzed for differences among genotypes using the following model: $Y_{ijkn} = \mu + d_i + l_j + s_k + (ls)_{jk} + \varepsilon_{ijkn}$. $Y_{ijkn}$ is the transcript abundance for the $n$th replicate, $\mu$ is the overall mean, $d$ is the effect of dye ($i = 1, 2$), $l$ is the effect of line ($j = 1, \ldots, 8$), $s$ is the effect of sex ($k = 1, 2$), $ls$ is the interaction between line and sex, and $\varepsilon$ is the error. As type I and type II errors are inversely related, we decided to minimize the chance of making a type II error in this stage of analysis at the increased risk of having probes included that were in fact type I errors. Accordingly, all probes significant at a false discovery rate (FDR) of 0.2 (P = 0.0029 in this case) for a line effect were retained for further analysis (Benjamini and Hochberg 1995; Verhoeven et al. 2005). For the genes that had multiple transcripts, if any probes showed the evidence for a line effect, we included all probes with the same CG number ("computed gene" identifier) in subsequent analyses for a total of 3,685 probes representing 2,728 genes for further analysis (supplementary table S1, Supplementary Material online).
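For readers unfamiliar with FDR control, the Benjamini and Hochberg (1995) step-up procedure cited above can be sketched as below. The analyses in this paper were run in SAS; this Python function and its example P values are purely illustrative.

```python
def benjamini_hochberg(pvalues, fdr=0.2):
    """Return a list of booleans marking the P values rejected at the given FDR."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])   # indices by ascending P
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose P value falls under its step-up threshold
        if pvalues[idx] <= fdr * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected

# Hypothetical P values for five probes
pvals = [0.001, 0.04, 0.03, 0.5, 0.2]
print(benjamini_hochberg(pvals))  # [True, True, True, False, False]
```

With a lenient FDR of 0.2, about one in five retained probes is expected to be a false positive, which is the trade-off accepted here in exchange for fewer type II errors.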
Infinitesimal Model
We first assumed an infinitesimal model, where the transcriptional variation is under control of numerous regulatory effects. Some of them might be at the gene location (cis effect), and others (the composite of them being a trans effect) are spread at random across the genome. If a cis effect alone is contributing to the transcript abundance variation of a particular gene, an RI line will then resemble the parental line from which it inherited the allele at that gene. Likewise, if trans effects are prevalent, an RI line that inherited a "larger fraction" of the Ore genome will more likely resemble the transcription pattern of the Ore parent because it has more trans effects in common with this parental line. In this model, cis effects are approximated by the allele at each locus (Ore or 2b3) and trans effects are approximated by the proportion of the genome outside that locus that is from the Ore parent. The infinitesimal model is a foundation of most quantitative genetic analyses (Falconer and Mackay 1996). It assumes that a phenotype is influenced by numerous factors randomly spread over chromosomes with individually small effects. The majority of these factors will not be gene specific, and the effects of these "trans factors" will be relevant to very large groups of genes and thus may "blend" in the genetic background that we estimated as the fraction of recombinant line genome inherited from 1 parental line. Only transcript levels from the RI lines were included in this analysis.
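To make the two predictors concrete, the sketch below derives a cis and a trans covariate for one gene in one RI line. It assumes, for illustration only, that the Ore fraction of the genome is approximated by the fraction of markers of Ore origin outside the markers flanking the gene; the text does not state how the fraction was computed (it could, for example, be weighted by map distance), and `cis_trans_predictors` is a hypothetical name.

```python
def cis_trans_predictors(marker_geno, locus_markers):
    """Return (cis allele, Ore fraction of the remaining markers) for one RI line.

    marker_geno   -- parental origin ("Ore" or "2b3") of each marker in the line
    locus_markers -- indices of the markers flanking the gene (assumed to agree)
    """
    cis = marker_geno[locus_markers[0]]
    # Trans proxy: markers outside the locus (assumption: unweighted marker count)
    rest = [g for i, g in enumerate(marker_geno) if i not in locus_markers]
    trans = sum(g == "Ore" for g in rest) / len(rest)
    return cis, trans

# Hypothetical six-marker RI line; the gene lies between markers 2 and 3
line = ["Ore", "Ore", "2b3", "2b3", "Ore", "2b3"]
print(cis_trans_predictors(line, (2, 3)))  # ('2b3', 0.75)
```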
For all probes, we fit the following model: $Y_{ijkln} = \mu + d_i + c_j + t_k + s_l + (ct)_{jk} + \varepsilon_{ijkln}$. For the genes with multiple transcripts, we extended the model to include the potential probe effect as follows: $Y_{ijklmn} = \mu + d_i + c_j + t_k + s_l + p_m + (ct)_{jk} + (cp)_{jm} + (tp)_{km} + \varepsilon_{ijklmn}$. In both models, $Y$ is the transcript abundance for dye $i$, the cis effect of allele $j$, the trans effect for line $k$, sex $l$, and replicate $n$. In the multiple transcript model, we included the effect of the probe $m$ and interactions between the probe and cis ($cp$) and trans ($tp$) effects. The parameter $\mu$ is the overall mean of the transcript abundance. In both models, the effects of dye and sex were considered fixed whereas other effects were considered random. To calculate the nominal P value for the tests of random effects, we used a permutation approach (empirical P values were calculated from 1,000 permutations). A multiple test correction (FDR 0.2; Benjamini and Hochberg 1995; Verhoeven et al. 2005) was then applied to the nominal P values. The average phenotypic effect of an allelic substitution (for cis effect), and of the genome background differences (for trans effect), was estimated by the covariance parameter estimate in our mixed model.
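The logic of an empirical P value from 1,000 permutations can be illustrated with a deliberately simplified statistic. The sketch below is not the mixed-model test used here: it swaps in the absolute difference in mean abundance between the two cis alleles as the test statistic, and the add-one correction in the final line is a common convention that we assume rather than take from the text.

```python
import random

def permutation_pvalue(values, labels, n_perm=1000, seed=0):
    """Empirical P value for a difference in means between two allele classes."""
    rng = random.Random(seed)

    def stat(lab):
        a = [v for v, l in zip(values, lab) if l == "Ore"]
        b = [v for v, l in zip(values, lab) if l != "Ore"]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = stat(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)             # break the allele-to-abundance link
        if stat(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)      # assumption: add-one correction

# Hypothetical log abundances for 4 Ore-allele and 4 2b3-allele samples
values = [1.0, 1.1, 0.9, 1.2, 3.0, 3.1, 2.9, 3.2]
labels = ["Ore"] * 4 + ["2b3"] * 4
print(permutation_pvalue(values, labels))
```

Because labels are shuffled rather than resampled, each permuted data set keeps the original number of samples per allele.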
Contrast Models
Although an infinitesimal assumption is a reasonable first approach, this model assumes numerous small effects and thus could substantially underestimate trans contributions whenever they are represented only by a few genes rather than by many, especially if their effects are in opposite directions. In a second approach, we inferred the regulatory effects in a fixed effect model by using contrasts among lines. We fit the model $Y_{ijkl} = \mu + d_i + g_j + s_k + (gs)_{jk} + \varepsilon_{ijkl}$, where $Y_{ijkl}$ is the transcript abundance, $d$ is the effect of dye, $g$ is the effect of allele $j$ at the transcribed locus, $s$ is the effect of sex $k$, $gs$ is the interaction between allele and sex, and $\varepsilon$ is the error. The parameter $\mu$ is the overall mean of the transcript abundance for that probe. We examined the model for conformation to the assumption of normality using a Shapiro–Wilk test. If we found evidence of departure from normality (P < 0.05), we permuted the data to obtain an empirical P value (Edgington 1995; Good 2006). Multiple test corrections were performed separately for each contrast (FDR threshold at 0.2). Contrasts were performed between offspring carrying a different cis allele (oOre vs. o2b) and between parents and offspring sharing the same cis allele (2 contrasts: pOre vs. oOre and p2b vs. o2b). As only 3 contrasts were independent, we omitted a fourth possible comparison between the parents. We then classified the results into cis, trans, or combined cis and trans categories as follows: if the offspring carrying different parental alleles also exhibit significant differences in transcript level, this is evidence for cis effects. If the offspring carrying the same allele as one of the parents have a statistically significantly different mean from that parent, this is evidence for trans effects (table 2). Intuitively, this analysis should perform the best when a few segregating trans effects are not in linkage disequilibrium with cis effects.
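The classification rule can be written as a small decision function. This is our reading of the rule, offered as a sketch: in particular, treating a significant result in either parent-versus-offspring contrast as sufficient evidence for trans is an assumption, and the function and argument names are hypothetical.

```python
def classify_probe(offspring_sig, ore_parent_sig, b2_parent_sig):
    """Classify a probe from the significance of its 3 contrasts.

    offspring_sig  -- oOre vs. o2b significant (evidence for cis)
    ore_parent_sig -- pOre vs. oOre significant (evidence for trans)
    b2_parent_sig  -- p2b vs. o2b significant (evidence for trans)
    """
    cis = offspring_sig
    trans = ore_parent_sig or b2_parent_sig   # assumption: either contrast suffices
    if cis and trans:
        return "cis and trans"
    if cis:
        return "cis"
    if trans:
        return "trans"
    return "none"

print(classify_probe(True, False, False))   # cis
print(classify_probe(False, True, False))   # trans
print(classify_probe(True, False, True))    # cis and trans
```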
Following this initial analysis, we noted that many probes we included did not show evidence for cis or trans effects. This could be due to 1) limitation of detecting segregating variants from our genetic map, 2) power limitations, or 3) heteroscedasticity of variance among lines or heteroscedasticity due to transgressive effects of regulatory factors in the recombinants. To address the issue of heteroscedasticity of variance among lines, we expanded the fixed effect model above to include a new term, line nested within allelic effect ($l(g)$), as follows: $Y_{ijkl} = \mu + d_i + g_j + s_k + (gs)_{jk} + l(g_j) + \varepsilon_{ijkl}$. We then performed the contrasts as described above, using the same inferences to categorize genetic effects on transcript abundance as previously. We refer to this model as the nested contrast model (as opposed to the simple contrast model described above) throughout the manuscript. This model was expanded to account for multiple transcripts as follows: for each set of probes represented by a single gene, the model $Y_{ijklm} = \mu + d_i + g_j + s_k + (gs)_{jk} + l(g_j) + p_m + (pg)_{mj} + \varepsilon_{ijklm}$ was fitted, where the terms are defined as above with the addition of the probe effect $p$ for $m$ probes and the interaction between the probe and the genotype. From both models, we tested the contrasts as described above. To see if there was any lack of power to detect trans effects, we compared differences among pairs of lines for lines carrying the same allele at the target locus using an extremely lenient nominal threshold of 0.2. We deliberately disregard type I error in this analysis in order to determine whether there is any evidence for trans effects.
Distribution of Regulatory Effects
We tested for correlations between the pattern of recombination rate in the Drosophila genome and the nature of regulatory effects, using the overall recombination rate estimate of D. melanogaster calculated from all publicly available data, following the same methodology as in Comeron et al. (1999). Briefly, recombination rates were estimated after obtaining the polynomial curves as a function of the quantity of DNA estimated from optical density of polytene chromosome (Sorsa 1988), in each division along each chromosome, relative to the change of the cytogenetic map position (http://flybase.bio.indiana.edu/data/maps/cytotable.txt). A single polynomial curve was obtained for the third chromosome, whereas 2 independent curves were obtained for chromosomes X and II to better fit the observed pattern of recombination rates. The correlation between recombination rate and the type of regulatory effects was calculated using Kendall's rank correlation τ for each chromosome arm, excluding chromosome 4, which does not recombine. A permutation testing approach was used to determine the experiment-wise significance threshold at P < 0.05 and P < 0.01 from 10,000 permutations.
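As an illustration of the rank-correlation step, the sketch below computes Kendall's τ and a permutation P value for a single pair of variables. Two simplifications are ours: it uses the tie-uncorrected τ-a variant (the text does not say which variant was used), and it returns a per-test P value rather than the experiment-wise threshold described above. All names and data are hypothetical.

```python
import random
from itertools import combinations

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    s = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        prod = (x1 - x2) * (y1 - y2)
        s += (prod > 0) - (prod < 0)       # +1 concordant, -1 discordant, 0 tied
    n = len(x)
    return s / (n * (n - 1) / 2)

def tau_permutation_pvalue(x, y, n_perm=10000, seed=0):
    """Two-sided permutation P value for tau, shuffling y relative to x."""
    rng = random.Random(seed)
    observed = abs(kendall_tau_a(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(kendall_tau_a(x, y_perm)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical recombination rates and counts of cis-regulated genes per interval
rate = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
n_cis = [12, 10, 9, 7, 4, 2]
print(kendall_tau_a(rate, n_cis))  # -1.0: counts fall as recombination rises
```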
All statistical analyses were performed using SAS software version 9.1 (SAS Institute, Cary, NC).
| Results |
|---|
Screen for Genotypic Differences Associated with Transcriptional Variation
We used the marker data generated for the RI lines by Nuzhdin et al. (1997)
There were 13,784 probes detected above background in at least one treatment. There were 3,685 probes (2,728 genes) with both parental alleles at the gene position (Ore and 2b) that showed at least one probe for the gene with evidence for variation among the lines (fig. 1, supplementary table S2 [Supplementary Material online]).
Detecting Small Trans Effects Dispersed Throughout the Genome Using an Infinitesimal Model
We tested cis effects, trans effects, and the interaction between cis and trans effects for each of the probes selected as having genetic variation for transcript abundance. At the FDR threshold of 0.2, 32 probes showed a significant cis effect. We found only 3 probes significant for trans effects, and 10 probes were significant for cis-by-trans interactions (among the 10 probes that were significant for cis-by-trans interactions, 2 of the probes belonged to the same gene, CG33092; see supplementary table S3 [Supplementary Material online]). Although more cis-regulated genes than trans or cis-by-trans regulated genes were detected, this analysis detects only trans effects that fit an infinitesimal approximation and thus is likely an underestimate of the overall contribution of trans effects. Altogether, we classified 44 probes into the categories cis and/or trans effects (1 probe for CG13091 was significant for cis and for cis-by-trans effects).
"Biological processes" were unknown for 43% of these genes. Interestingly, we found 2 genes known to be regulators of Pol II with a cis effect (tim and Hnf4). We also found that the categories proteolysis and peptidolysis (5/26 vs. 92/2049, P = 0.006), chymotrypsin activity (3/21 vs. 10/1955, P = 0.02), and receptor activity (2/21 vs. 19/1955, P = 0.02) were overrepresented among genes significant for cis effects. The functional category "metabolism," formerly found to be overrepresented in another panel of flies (Hughes et al. 2006
), did not show significant enrichment for cis or trans effects in our sample.
As this model is for random effects, we estimated the variance component associated with each effect using restricted maximum likelihood. On average, the variance component of cis effects was smaller than trans or cis-by-trans effects (1.3 ± 0.58 in cis, 10.7 ± 5.5 in trans, and 10.5 ± 6.7 in cis-by-trans) (see supplementary fig. S1, Supplementary Material online). However, because confidence intervals on each estimate are large (particularly for trans effects) and we only detected a few genes in each category (3 genes for trans and 9 genes for cis-by-trans effects), more experiments would be needed to conclusively demonstrate that trans effects are truly of larger magnitude than cis effects (supplementary table S3, Supplementary Material online). Note that the relatively large intervals for trans and cis-by-trans interactions are evidence for reduced power to detect these effects.
As our microarray reliably detects multiple transcripts per gene, we can potentially study how cis and trans effects contribute to genetic variation at the level of alternative splicing (genes significant for this component of variation were described by McIntyre et al. [2006] for the data set presented here). In our model, the cis-by-probe interaction term can be interpreted as a cis contribution to genetic variation in alternative splicing and the trans-by-probe interaction term as a trans contribution. We tested 306 multiple transcript genes for cis-by-probe and trans-by-probe interactions, and we found no gene that was significant after correcting for multiple testing in this infinitesimal model. This is likely due to a lack of power.
Detecting "Large" Trans Effects Using Contrasts among Genotypes
We used the contrast models, detailed in table 2, to detect which probes were differentially transcribed due to cis effects, trans effects, or a combination of the 2. Transcript level was analyzed in 2 ways: simple or nested contrast models (see Materials and Methods). Out of 3,685 probes tested, we found 1,943 significant probes overall, representing 598 probes for the simple contrast model and 1,345 probes for the nested contrast model at FDR = 0.2. When the results from both models are compared, they are largely concordant with only one probe apparently switching roles from cis effects to trans effects. Due to increased power in the nested contrast model, some genes significant for either cis (49) or trans (15) in the simple contrast model are also significant for both cis and trans effects in the nested contrast model. The list of detected probes and genes from each model is annotated in supplementary table S4 (Supplementary Material online).
The biological function category "regulation of transcription" was overrepresented with 19 out of 29 genes (P = 0.0011); for example, for the category "negative regulation of the Pol II promoter," there are 8 genes in total on our chip, 7 of which are significant for either cis or trans effects (P = 0.0033). Similarly, transcription factor activity (21/40, P = 0.046) was also overrepresented. In addition, 47 of the 97 genes in the "proteolysis and peptidolysis" classification are significant, which accounts for 6.5% of the significant genes (vs. 3% of genes that are not significant, P = 0.0044). "Phosphate metabolism" (8/12, P = 0.028), "muscle development" (7/10, P = 0.0385), and mitochondrial electron transport (nicotinamide adenine dinucleotide to ubiquinone; 13/20, P = 0.017) are also overrepresented among significant genes overall (any effect, cis or trans).
As with the infinitesimal model, we found that trans effects were associated with larger transcriptional differences than cis effects (on average 0.54 ± 0.044 vs. 0.31 ± 0.006), with cis-by-trans effects being intermediate (0.41 ± 0.02) (fig. 2). The coefficient of variation is higher for trans than for cis, potentially signaling reduced power for the detection of trans effects.
In the nested contrast model, we identified 15 genes with a probe-by-cis interaction term (see supplementary table S4 [Supplementary Material online], genes are highlighted in yellow), possibly indicating genetic variation in the regulation of alternative transcripts.
Overall, we classified 1,476 genes as affected by cis- or trans-regulatory factors and by their interactions. Figure 3 summarizes the number of probes and genes significant from the infinitesimal and contrast models for each category of regulatory effect. Of the 2 approaches, the infinitesimal model detected the fewest genes. It is, however, interesting to note that for the infinitesimal model, when the multiple test correction is calculated only for the probes involved in single transcripts (n = 2,421), the number of significant genes increases substantially (200 genes for cis, 78 for trans, and 62 for cis-by-trans). The nested contrast model detected far more genes than the other models (53% of the genes are uniquely represented in cis, 33% in trans, and 57% in cis and trans). Consistent with our assumptions, no genes significant for trans effects overlap between the infinitesimal and contrast models.
In order to determine whether trans effects were perhaps underdetected in these models due to lack of power, we calculated the pairwise differences among lines with the same genotype at the transcriptional locus. We found 1,081 probes with evidence for trans effects. Additional comparisons showed 1,212 probes with evidence for cis effects (691 overlapping). Although it is important to note that this analysis was done at a nominal level of 0.2, the results for cis are almost identical to those of the previous nested contrast model. In contrast, we detect more trans effects in this analysis, indicating that trans effects may indeed be present.
Genome-Wide Distribution of Regulatory Effects
We looked at the correlation of gene distribution within each chromosome with the overall recombination rate of the genome of D. melanogaster. Our objective was to see if the physical distribution of genes under regulatory effects "captured" in our sample is nonrandom and how it relates to recombination history. Note that whereas genes under cis control and the causal cis variants are cosegregating, the sequence variants causing trans effects will not be located in our analysis, but the genes subject to trans regulation will be. Overall, the physical distribution of regulatory effects from the contrast model was significantly correlated with recombination rate for chromosomes X and 3 (figs. 4A and B). For the X chromosome, the correlation was significant after 10,000 permutations (P < 0.05), with more genes clustered in regions with reduced recombination (fig. 4A). Similarly, more cis effects were detected in regions with low recombination rate on both arms of the third chromosome (P < 0.01). However, one simple explanation for this pattern is the increased power to detect cis effects in these regions due to linkage disequilibrium in our sample of RI lines. Cis and trans (or cis-by-trans) regulation was more frequently detected in regions of intermediate recombination rate. While interpreting these data, we acknowledge that they also reflect the specific nature of the genome in the 6 RI lines (e.g., the proximal half of the X chromosome is inherited from the Ore parent only; therefore, we have no power to detect any variation in that region, certainly biasing the results; see figs. 4A and B, top panel of each graph).
| Discussion |
|---|
We studied transcript levels using a new whole genome, alternative-splicing microarray (McIntyre et al. 2006
Different statistical approaches were used to categorize the nature of regulatory variation: an infinitesimal model and 2 contrast models. These analyses were employed to detect qualitatively different kinds of trans effects: epistatic effects with many loci interacting with each other (either in trans or in cis-by-trans) in the infinitesimal model, and fewer loci of large average effect in the contrast models. Indeed, we saw no overlap of trans-regulated genes between the 2 approaches. Estimates from the infinitesimal model and the contrast models both suggest that trans effects were of larger magnitude than cis effects, with a larger coefficient of variation. The increased coefficient of variation in trans effects may reduce statistical power and may explain why we detected more cis effects. It is also suggestive of multiple epistatic effects. To date, most analyses of multiple model species have pointed to cis effects as making a significant contribution, perhaps of similar magnitude to the trans contribution (Ranz and Machado 2006). Two similar conclusions appear in eQTL studies (Schadt et al. 2003; Morley et al. 2004). First, eQTL found in cis typically had larger magnitude than those found in trans. Second, when a genomic region was found to be a QTL for one gene, it generally affected transcript levels of other genes, suggesting the presence of master trans regulators. Although caution is required in interpreting these patterns, it is not surprising that eQTL detected in these studies appear to behave like Mendelian large-effect modifiers of transcript level. Such a conclusion is expected from the Beavis effect alone (Beavis 1994). Unfortunately, back calculation of true QTL effects from the estimated ones is nearly impossible (Otto and Jones 2000). Although several yeast transcriptome papers took careful account of most of these confounding effects (see Yvert et al. 2003), for higher organisms they remain unresolved.
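The Beavis effect mentioned above can be illustrated with a small simulation: when power is low, the effects that pass a significance threshold are, on average, overestimates of the true effect. The sketch below is not part of the analyses in this study; the true effect size, sample size, threshold, and function name are arbitrary assumptions chosen only to make the bias visible.

```python
import random
import statistics

def beavis_simulation(true_effect=0.5, n_per_group=10, n_reps=2000,
                      t_threshold=2.5, seed=1):
    """Compare the mean estimated effect over all replicates with the
    mean over only those replicates that pass a significance threshold."""
    rng = random.Random(seed)
    all_estimates = []
    detected_estimates = []
    for _ in range(n_reps):
        # Two genotype classes differing by a small true effect (unit variance).
        group_a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        group_b = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        estimate = statistics.fmean(group_b) - statistics.fmean(group_a)
        std_err = ((statistics.variance(group_a)
                    + statistics.variance(group_b)) / n_per_group) ** 0.5
        all_estimates.append(estimate)
        if abs(estimate / std_err) > t_threshold:
            detected_estimates.append(estimate)
    mean_all = statistics.fmean(all_estimates)
    mean_detected = (statistics.fmean(detected_estimates)
                     if detected_estimates else float("nan"))
    power = len(detected_estimates) / n_reps
    return mean_all, mean_detected, power

mean_all, mean_detected, power = beavis_simulation()
print(f"true effect: 0.5, mean of all estimates: {mean_all:.2f}")
print(f"mean of detected estimates: {mean_detected:.2f}, power: {power:.2f}")
```

Averaged over all replicates the estimate is close to the true effect, whereas the detected subset is biased upward, which is why effects reported for significant eQTL tend to look large.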
Although we detected relatively few trans effects, 3 important points need to be considered. First, we find that the trans effects have a higher coefficient of variation than the cis effects. If this pattern extends to effects that are undetected, it implies that the power to detect trans effects will be lower. Second, cis effects are regulatory mutations at the gene or in its proximity, whereas trans effects are associated with polymorphic sites in other genes upstream in transcriptional and developmental cascades. Note that in the current study, our cis effects may include all "nearby" trans effects. Lastly, when pairs of lines with the same allele at the locus of interest are examined for differences at a lenient threshold, we find evidence for trans effects. As an interesting point of comparison, Harbison et al. (2005) found a majority of trans effects between Ore and 2b. However, among the 65 genes considered by both studies, the number of genes identified as having trans effects in both was small. This may be due to the detection of different subsets of trans factors from a large overall number with relatively high variance.
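One way to see why a higher coefficient of variation lowers power is the following simplified relation. It assumes a single effect with mean μ and standard deviation σ estimated from n replicates; these symbols and this simplification are illustrative assumptions and do not appear in the models fitted in this study.

```latex
\[
\mathrm{CV} = \frac{\sigma}{|\mu|},
\qquad
\mathrm{E}[\,|z|\,] \approx \frac{|\mu|}{\sigma/\sqrt{n}} = \frac{\sqrt{n}}{\mathrm{CV}}
\]
```

Under this approximation, at a fixed number of replicates, doubling the coefficient of variation halves the expected standardized test statistic, so an effect of a given mean size becomes harder to detect.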
Overall, our results reveal a large number, on the order of a thousand, of cis-regulatory variants from a few genotypes. Because classical models predict that every cis variant will result in multiple trans modifications (unless compensated by the robustness of transcriptional and developmental networks), this has implications for the study of natural variation in transcriptional networks.
| Supplementary Material |
|---|
|
|
|---|
Supplementary figure S1 and tables S1–S4 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
| Acknowledgements |
|---|
|
|
|---|
The authors are grateful to J.P. Comeron for estimating recombination rates in the D. melanogaster genome, L.M. Bono for statistical analyses, and B. Foley and anonymous reviewers for valuable comments on the manuscript. This work was supported by the National Institutes of Health (NIH)-GLUE 5R24GM065513 and the NIH 1R01GM077618-01A1 grants.
| Footnotes |
|---|
Douglas Crawford, Associate Editor
| References |
|---|
|
|
|---|
Benjamini Y, Hochberg Y. Controlling the false discovery rate—a practical approach to multiple testing. J R Stat Soc Ser B (1995) 57:289–300.
Beavis WD. The power and deceit of QTL experiments: lessons from comparative QTL studies. In: Wilkinson DB, editor. 49th Annual Corn and Sorghum Research Conference (1994) Washington (DC): American Seed Trade Association. 250–266.
Bray NJ, Buckland PR, Owen MJ, O'Donovan MC. Cis-acting variation in the expression of a high proportion of genes in human brain. Hum Genet (2003) 113:149–153.
Brem RB, Kruglyak L. The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc Natl Acad Sci USA (2005) 102:1572–1577.
Brem RB, Yvert G, Clinton R, Kruglyak L. Genetic dissection of transcriptional regulation in budding yeast. Science (2002) 296:752–755.
Brown RP, Feder ME. Reverse transcriptional profiling: non-correspondence of transcript level variation and proximal promoter polymorphism. BMC Genomics (2005) 6:110.
Butte AJ, Kohane IS. Creation and implications of a phenome-genome network. Nat Biotechnol (2006) 24:55–62.
Cheung VG, Spielman RS. The genetics of variation in gene expression. Nat Genet (2002) 32:522–525.
Coffman CJ, Wayne ML, Nuzhdin SV, Higgins LA, McIntyre LM. Identification of co-regulated transcripts affecting male body size in Drosophila. Genome Biol (2005) 6:R53.
Comeron JP, Kreitman M, Aguadé A. Natural selection on synonymous sites is correlated with gene length and recombination in Drosophila. Genetics (1999) 151:239–249.
Cowles CR, Hirschhorn JN, Altschuler D, Lander ES. Detection of regulatory variation in mouse genes. Nat Genet (2002) 32:432–437.
Davis GK, Wittkopp PJ, Stern DL. The evolution and regulation of Ubx in the pupal legs of Drosophila. Integr Comp Biol (2004) 44:687–687.
Doss S, Schadt EE, Drake TA, Lusis AJ. Cis-acting expression quantitative trait loci in mice. Genome Res (2005) 15:681–691.
Edgington ES. Randomization tests (1995) New York: Marcel Dekker, Inc.
Falconer DS, Mackay TFC. Introduction to quantitative genetics (1996) Essex (UK): Longman Group Ltd.
Gibert J-M, Simpson P. Evolution of cis-regulation of the proneural genes. Int J Dev Biol (2003) 47:643–651.
Gibson G, Riley R, Harshman L, Kopp A, Vacha S, Nuzhdin S, Wayne M. Extensive sex-specific non-additivity of gene expression in Drosophila melanogaster. Genetics (2004) 167:1791–1799.
Gibson G, Weir B. The quantitative genetics of transcription. Trends Genet (2005) 21:616–623.
Gompel N, Prud'homme B, Wittkopp PJ, Kassner VA, Carroll SB. Chance caught on the wing: cis-regulatory evolution and the origin of pigment patterns in Drosophila. Nature (2005) 433:481–487.
Good P. Resampling methods (2006) 3rd ed. Boston: Birkhauser.
Harbison ST, Chang S, Kamdar KP, Mackay TFC. Quantitative genomics of starvation stress resistance in Drosophila. Genome Biol (2005) 6:R36.
Henikoff S, Furuyama T, Ahmad K. Histone variants, nucleosome assembly and epigenetic inheritance. Trends Genet (2004) 20:320–326.
Hughes KA, Ayroles JE, Reedy MM, Drnevich JM, Rowe KC, Ruedi EA, Caceres CE, Paige KN. Segregating variation in the transcriptome: cis regulation and additivity effects. Genetics (2006) 173:1347–1355.
Jansen RC, Nap JP. Genetical genomics: the added value from segregation. Trends Genet (2001) 17:388–391.
Kiekens R, Vercauteren A, Moerkerke B, Goetghebeur E, Van Den Daele H, Sterken R, Kuiper M, van Eeuwijk F, Vuylsteke M. Genome-wide screening for cis-regulatory variation using a classical diallel crossing scheme. Nucleic Acids Res (2006) 34:3677–3686.
Landry CR, Wittkopp PJ, Taubes CH, Ranz JM, Clark AG, Hartl DL. Compensatory cis-trans evolution and the dysregulation of gene expression in interspecific hybrids of Drosophila. Genetics (2005) 171:1813–1822.
Lindsley DL, Zimm G. The genome of Drosophila melanogaster (1992) San Diego (CA): Academic Press, Inc.
Lo HS, Wang ZN, Hu Y, Yang HH, Gere S, Buetow KH, Lee MP. Allelic variation in gene expression is common in the human genome. Genome Res (2003) 13:1855–1862.
McIntyre LM, Bono LM, Genissel A, Westerman R, Junk D, Telonis-Scott M, Harshman L, Wayne ML, Kopp A, Nuzhdin SV. Sex-specific expression of alternative transcripts in Drosophila. Genome Biol (2006) 7:R79.
Morley M, Molony CM, Weber TM, Devlin JL, Ewens KG, Spielman RS, Cheung VG. Genetic analysis of genome-wide variation in human gene expression. Nature (2004) 430:743–747.
Nuzhdin SV, Pasyukova EG, Mackay TFC. Sex-specific quantitative trait loci affecting longevity in Drosophila melanogaster. Proc Natl Acad Sci USA (1997) 94:9734–9739.
Osada M, Kohn MH, Wu C-I. Genomic inference of cis-regulatory nucleotide polymorphism underlying gene expression differences between Drosophila melanogaster mating races. Mol Biol Evol (2006) 23:1585–1591.
Otto SP, Jones CD. Detecting the undetected: estimating the total number of loci underlying a quantitative trait. Genetics (2000) 156:2093–2107.
Pastinen T, Sladek R, Gurd S, et al, (20 co-authors). A survey of genetic and epigenetic variation affecting human gene expression. Physiol Genomics (2004) 16:184–193.
Pasyukova EG, Nuzhdin SV. Doc and copia instability in an isogenic Drosophila melanogaster stock. Mol Gen Genet (1993) 240:302–306.
Ranz JM, Machado CA. Uncovering evolutionary patterns of gene expression using microarrays. Trends Ecol Evol (2006) 21:29–37.
Rockman MV, Wray GA. Abundant raw material for cis-regulatory evolution in humans. Mol Biol Evol (2002) 19:1991–2004.
Romano LA, Wray GA. Conservation of Endo16 expression in sea urchins despite evolutionary divergence in both cis and trans-acting components of transcriptional regulation. Development (2003) 130:4187–4199.
Schadt EE, Monks SA, Drake TA, et al, (14 co-authors). Genetics of gene expression surveyed in maize, mouse and man. Nature (2003) 422:297–302.
Sorsa V. Chromosome maps of Drosophila (1988) Boca Raton (FL): CRC Press.
Sucena E, Stern DL. Divergence of larval morphology between Drosophila sechellia and its sibling species caused by cis-regulatory evolution of ovo/shaven-baby. Proc Natl Acad Sci USA (2000) 97:4530–4534.
Verhoeven KJF, Simonsen KL, McIntyre LM. Implementing false discovery rate control: increasing your power. Oikos (2005) 108:643–64.
Wayne ML, Pan YJ, Nuzhdin SV, McIntyre LM. Additivity and trans-acting effects on gene expression in male Drosophila simulans. Genetics (2004) 168:1413–1420.
Whitehead A, Crawford DL. Variation within and among species in gene expression: raw material for evolution. Mol Ecol (2006) 15:1197–1211.
Wittkopp PJ, Haerum BK, Clark AG. Evolutionary changes in cis and trans gene regulation. Nature (2004) 430:85–88.
Yan H, Yuan W, Velculescu VE, Vogelstein B, Kinzler KW. Allelic variation in human gene expression. Science (2002) 297:1143.
Yoder JH, Carroll SB. The evolution of abdominal reduction and the recent origin of distinct Abdominal-B transcript classes in Diptera. Evol Dev (2006) 8:241–251.
Yvert G, Brem RB, Whittle J, Akey JM, Foss E, Smith EN, Mackelprang R, Kruglyak L. Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nat Genet (2003) 35:57–64.