MBE Advance Access originally published online on June 29, 2006
Molecular Biology and Evolution 2006 23(9):1652-1655; doi:10.1093/molbev/msl048
© The Author 2006. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org

Letter

Processed Pseudogenes Are More Abundant in Human and Mouse X Chromosomes than in Autosomes

Guy Drouin

Département de Biologie et Centre de Recherche Avancée en Génomique Environnementale, Université d'Ottawa, Ottawa, Ontario, Canada

E-mail: gdrouin@science.uottawa.ca.


    Abstract
Two different hypotheses have been proposed to explain the observation that some genomes contain more processed pseudogenes than others. One predicts that processed pseudogene abundance is inversely proportional to the substrate specificity of the reverse transcriptase that generates processed pseudogenes. The other predicts that the number of processed pseudogenes found in genomes is proportional to the length of oogenesis. Here, we test the oogenesis hypothesis by analyzing the data from 6 studies that described the number of pseudogenes on different chromosomes of the human and/or mouse genomes. Our results show a significant overabundance of processed pseudogenes in the X chromosomes of the human and mouse genomes and a significant underrepresentation of processed pseudogenes in the Y chromosome of the human genome. These observations support the hypothesis that the number of processed pseudogenes is proportional to the length of oogenesis.

Key Words: processed pseudogenes • X chromosomes • Y chromosomes • oogenesis • human genome • mouse genome

Processed pseudogenes are the result of the random integration of reverse-transcribed mature RNA molecules into genomes (Vanin 1985; Weiner et al. 1986). They are characterized by a lack of introns, the presence of a poly(A) tail, and the presence of flanking direct repeats. Because mature RNA molecules do not contain promoter sequences, processed sequences are usually not expressed; they quickly accumulate frameshifts and/or premature stop codons and, in the vast majority of cases, become pseudogenes (Brosius 1999).

Processed pseudogenes are common in mammalian species but are much less abundant in other animal species. Whereas thousands of processed pseudogenes are present in the mouse and human genomes (Gonçalves et al. 2000; Zhang et al. 2002, 2003, 2004; Ohshima et al. 2003; Torrents et al. 2003; Khelifi et al. 2005; Bischof et al. 2006), the Caenorhabditis elegans genome contains only 208 processed pseudogenes, the chicken genome contains at most 51 processed pseudogenes, and the Drosophila melanogaster genome contains at most 34 processed pseudogenes (Harrison et al. 2001, 2003; Misra et al. 2002; International Chicken Genome Sequencing Consortium 2004). Recent results indicate that part of these differences in processed pseudogene abundance is likely due to the substrate specificity of the reverse transcriptase that generates processed pseudogenes (International Chicken Genome Sequencing Consortium 2004). However, differences in oogenesis are also likely to be responsible for these differences because the number of processed pseudogenes in animal species is proportional to the length of the lampbrush stage of these species (Weiner et al. 1986). One way to test this hypothesis is to examine whether processed pseudogenes are more abundant in X chromosomes, and less abundant in Y chromosomes, than in autosomes (Weiner et al. 1986; Graur and Li 2000). Whereas autosomes spend half their time in males and half in females, X chromosomes spend two-thirds of their time in females and only one-third of their time in males, and Y chromosomes are always in males (Miyata et al. 1987). Therefore, according to the oogenesis hypothesis, processed pseudogenes should be roughly 33% more abundant in X chromosomes than in autosomes (two-thirds divided by one-half) and not present in Y chromosomes. Here, we tested these predictions using data from 6 studies (Ohshima et al. 2003; Torrents et al. 2003; Zhang et al. 2003, 2004; Khelifi et al. 2005; Bischof et al. 2006).
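The 33% expectation follows directly from the fractions of time each chromosome type spends in females. The short Python sketch below reproduces that arithmetic; the function name and dictionary are illustrative and not part of the original study, and the calculation assumes that processed pseudogenes are generated only in the female germ line.

```python
from fractions import Fraction

# Fraction of time each chromosome type spends in females,
# as stated in the text (Miyata et al. 1987).
TIME_IN_FEMALES = {
    "autosome": Fraction(1, 2),
    "X": Fraction(2, 3),
    "Y": Fraction(0, 1),
}


def expected_relative_abundance(chromosome: str) -> Fraction:
    """Expected processed pseudogene density relative to autosomes.

    Assumption (illustrative): insertions occur only during oogenesis,
    so density scales with the time spent in females.
    """
    return TIME_IN_FEMALES[chromosome] / TIME_IN_FEMALES["autosome"]


for name in ("autosome", "X", "Y"):
    ratio = expected_relative_abundance(name)
    change = float(ratio - 1) * 100
    print(f"{name}: {float(ratio):.2f} x autosomal density ({change:+.0f}%)")
```

Under these assumptions the X chromosome ratio is 4/3, that is, about one-third more than autosomes, and the Y chromosome ratio is zero.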

Whereas both mouse genome studies identified about 4,000 processed pseudogenes, the numbers of human processed pseudogenes identified by different groups are very different (table 1). In the human genome, these numbers range from 3,664 to 17,759, a difference of almost 5-fold. As discussed in Zhang and Gerstein (2004), these differences result from the criteria used by individual research groups to define processed pseudogenes. Groups using more stringent criteria identified fewer processed pseudogenes than those using less stringent criteria. For example, the much larger number of processed pseudogenes identified by Torrents et al. (2003) is due mainly to the fact that these authors did not limit the minimal size of processed pseudogenes (Zhang and Gerstein 2004).


Table 1 Number and Density of Pseudogene Sequences in Human and Mouse Chromosomes

 
Table 1 shows that processed pseudogenes are significantly more abundant in X chromosomes than in autosomes. All but 1 of the 11 processed pseudogene data sets show a significant excess of processed pseudogenes in the human and mouse X chromosomes (table 1). The only exception is the data set of human ribosomal protein processed pseudogenes, where such sequences are significantly less abundant than expected. Interestingly, the human genome data of Bischof et al. (2006) and the human and mouse data of Khelifi et al. (2005) all show a 28–32% excess of processed pseudogenes on the X chromosome compared with what would be expected given the density of processed pseudogenes in autosomes. This is very close to the 33% expectation. The human data of Bischof et al. (2006) are particularly interesting because they contain an internal control. These authors identified not only processed pseudogenes but also nonprocessed pseudogenes. Interestingly, there is no significant excess of nonprocessed pseudogenes on the X chromosome. The fact that the X chromosome bias is limited to processed pseudogenes supports the oogenesis hypothesis. The human data set of Zhang et al. (2003) also shows a 26% excess of processed pseudogenes on the X chromosome if one considers only processed pseudogenes that are not derived from ribosomal protein–coding genes. Both this observation and the significant underrepresentation of ribosomal protein processed pseudogenes on the human X chromosome might be the consequence of the fact that Zhang et al. (2003) treated ribosomal protein processed pseudogenes differently from other processed pseudogenes. In fact, because previous experimental studies had suggested that each ribosomal protein was coded by a single functional gene in the human genome, they counted all intronless ribosomal protein sequences as being processed pseudogenes. On the other hand, accounting for this possible bias does not explain why the excess of processed pseudogenes on the mouse X chromosome is much greater than expected.
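The percent excess discussed above compares the observed X chromosome count with the count expected from the autosomal density. A minimal Python sketch of that comparison follows; the function name, parameters, and all numbers in the usage example are hypothetical placeholders chosen for illustration, not values from table 1.

```python
def percent_excess(observed_x: int, x_length_mb: float,
                   autosomal_count: int, autosomal_length_mb: float) -> float:
    """Percent excess of pseudogenes on the X chromosome.

    The expected X count is the autosomal density (pseudogenes per Mb)
    multiplied by the X chromosome length in Mb.
    """
    autosomal_density = autosomal_count / autosomal_length_mb
    expected_x = autosomal_density * x_length_mb
    return (observed_x / expected_x - 1) * 100


# Hypothetical placeholder values, NOT data from the studies analyzed here.
excess = percent_excess(observed_x=195, x_length_mb=150.0,
                        autosomal_count=3000, autosomal_length_mb=3000.0)
print(f"Excess on X relative to autosomal density: {excess:.1f}%")
```

A value near 33 would match the prediction of the oogenesis hypothesis, and a value near 0 would indicate no X chromosome bias.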

Table 1 also shows that processed pseudogenes are significantly less abundant in Y chromosomes than in autosomes. The data from the studies of Ohshima et al. (2003), Khelifi et al. (2005), and Bischof et al. (2006) all show a clear and significant underrepresentation of processed pseudogenes on the human Y chromosome. The data from the study of Zhang et al. (2003) also support an underrepresentation of processed pseudogenes on the Y chromosome, but this support is entirely due to ribosomal protein processed pseudogene sequences. Again, the data of Bischof et al. (2006) are particularly interesting because of their internal control. In contrast to processed pseudogenes, there is a significant excess of nonprocessed pseudogenes on the Y chromosome. The fact that only processed pseudogenes are underrepresented on the Y chromosome supports the oogenesis hypothesis. The presence of processed pseudogenes on the Y chromosome, even though the oogenesis hypothesis predicts they should be absent from this chromosome, is likely due to the transfer of processed pseudogenes from X chromosomes to Y chromosomes by recombination in the pseudoautosomal region (Graur and Li 2000; Skaletsky et al. 2003). Finally, it is not clear why the data from Torrents et al. (2003) do not support the underrepresentation of processed pseudogenes on the human Y chromosome. However, the fact that this study did not identify any nonprocessed pseudogenes on either the X or the Y chromosome may indicate that their data are of limited usefulness for testing the oogenesis hypothesis.

In conclusion, data from 6 studies on the number of pseudogenes show a significant overabundance of processed pseudogenes in the X chromosomes of the human and mouse genomes and a significant underrepresentation of processed pseudogenes in the Y chromosome of the human genome. These observations support the hypothesis that the number of processed pseudogenes found in genomes is proportional to the length of their oogenesis.


    Methods
The data on human processed and nonprocessed pseudogenes of Bischof et al. (2006) were retrieved from http://genome.uiowa.edu/pseudogenes. The data on human and mouse processed pseudogenes of Khelifi et al. (2005) were retrieved from the Hoppsigen database using the WWW-Query query engine at http://pbil.univ-lyon1.fr/databases/hoppsigen.html. Only sequences homologous to coding regions were used (by using the CDE key word in our queries). The human processed pseudogene data of Ohshima et al. (2003) were retrieved from their Table 3. The data on human processed and nonprocessed pseudogenes of Torrents et al. (2003) were retrieved from http://www.bork.embl.de/Docu/Human_Pseudogenes/build34/stat.html. The data on human processed pseudogenes of Zhang et al. (2003) were retrieved from their Table 1. Only the numbers corresponding to the "True (RP)" processed pseudogenes described in this table were used. The data on mouse processed pseudogenes of Zhang et al. (2004) were retrieved from their Table 1. Only the numbers corresponding to the "Type 1 (RP)" processed pseudogenes described in this table were used. Statistical tests were performed using Excel (Microsoft Corporation, Redmond, WA).
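The text does not state which statistical test was used. As one plausible illustration only, the Python sketch below runs a chi-square goodness-of-fit test (1 degree of freedom) of whether pseudogene counts on the X chromosome and the autosomes are proportional to their lengths. The function name, the choice of test, and all numbers are assumptions made for this example and are not taken from the studies analyzed.

```python
import math


def chi_square_x_vs_autosomes(x_count: int, x_length_mb: float,
                              autosomal_count: int,
                              autosomal_length_mb: float):
    """Chi-square goodness-of-fit test with 1 degree of freedom.

    Null hypothesis (assumed for illustration): pseudogenes are spread
    over the X chromosome and autosomes in proportion to their lengths.
    Returns the chi-square statistic and its p-value.
    """
    total_count = x_count + autosomal_count
    total_length = x_length_mb + autosomal_length_mb
    expected = [total_count * x_length_mb / total_length,
                total_count * autosomal_length_mb / total_length]
    observed = [x_count, autosomal_count]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom the upper tail probability of the
    # chi-square distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical placeholder values, NOT data from the studies analyzed here.
chi2, p_value = chi_square_x_vs_autosomes(195, 150.0, 3000, 3000.0)
print(f"chi-square = {chi2:.2f}, p = {p_value:.4g}")
```

A statistic above 3.84 corresponds to p < 0.05 at 1 degree of freedom, the conventional threshold for calling an excess or deficit significant.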


    Acknowledgements
This work was supported by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada.


    Footnotes
 
Dan Graur, Associate Editor


    References

    Bischof JM, Chiang AP, Scheetz TE, Stone EM, Casavant TL, Sheffield VC, Braun TA. 2006. Genome-wide identification of pseudogenes capable of disease-causing gene conversion. Hum Mutat 27:545–52.

    Brosius J. 1999. RNAs from all categories generate retrosequences that may be exapted as novel genes or regulatory elements. Gene 238:115–34.

    Gonçalves I, Duret L, Mouchiroud D. 2000. Nature and structure of human genes that generate retropseudogenes. Genome Res 10:672–8.

    Graur D, Li W-H. 2000. Fundamentals of molecular evolution. 2nd ed. Sunderland, MA: Sinauer Associates.

    Harrison PM, Echols N, Gerstein MB. 2001. Digging for dead genes: an analysis of the characteristics of the pseudogene population in the Caenorhabditis elegans genome. Nucleic Acids Res 29:818–30.

    Harrison PM, Milburn D, Zhang Z, Bertone P, Gerstein MB. 2003. Identification of pseudogenes in the Drosophila melanogaster genome. Nucleic Acids Res 31:1033–7.

    International Chicken Genome Sequencing Consortium. 2004. Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature 432:695–716.

    Khelifi A, Duret L, Mouchiroud D. 2005. HOPPSIGEN: a database of human and mouse processed pseudogenes. Nucleic Acids Res 33:D59–66.

    Misra S, Crosby MA, Mungall CJ, et al. (30 co-authors). 2002. Annotation of the Drosophila melanogaster euchromatic genome: a systematic review. Genome Biol 3:research0083.

    Miyata T, Hayashida H, Kuma K, Mitsuyasu K, Yasunaga T. 1987. Male-driven molecular evolution: a model and nucleotide sequence analysis. Cold Spring Harbor Symp Quant Biol 52:863–7.

    Ohshima K, Hattori M, Yada T, Gojobori T, Sakaki Y, Okada N. 2003. Whole-genome screening indicates a possible burst of formation of processed pseudogenes and Alu repeats by particular L1 subfamilies in ancestral primates. Genome Biol 4:R74.

    Skaletsky H, Kuroda-Kawaguchi T, Minx PJ, et al. (40 co-authors). 2003. The male-specific region of the human Y chromosome is a mosaic of discrete sequence classes. Nature 423:825–37.

    Torrents D, Suyama M, Zdobnov E, Bork P. 2003. A genome-wide survey of human pseudogenes. Genome Res 13:2559–67.

    Vanin EF. 1985. Processed pseudogenes: characteristics and evolution. Annu Rev Genet 19:253–72.

    Weiner AM, Deininger PL, Efstratiadis A. 1986. Nonviral retroposons: genes, pseudogenes, and transposable elements generated by the reverse flow of genetic information. Annu Rev Biochem 55:631–61.

    Zhang Z, Carriero N, Gerstein MB. 2004. Comparative analysis of processed pseudogenes in the mouse and human genomes. Trends Genet 20:62–7.

    Zhang Z, Gerstein MB. 2004. Large-scale analysis of pseudogenes in the human genome. Curr Opin Genet Dev 14:328–35.

    Zhang Z, Harrison P, Gerstein MB. 2002. Identification and analysis of over 2000 ribosomal protein pseudogenes in the human genome. Genome Res 12:1466–82.

    Zhang Z, Harrison PM, Liu L, Gerstein MB. 2003. Millions of years of evolution preserved: a comprehensive catalog of the processed pseudogenes in the human genome. Genome Res 13:2541–58.

Accepted for publication June 23, 2006.

