MBE Advance Access originally published online on June 17, 2008
Molecular Biology and Evolution 2008 25(9):1897-1907; doi:10.1093/molbev/msn135
Research Articles
Similar Numbers but Different Repertoires of Olfactory Receptor Genes in Humans and Chimpanzees

* Department of Organismic and Evolutionary Biology, Harvard University
Department of Biosystems Science, Graduate University for Advanced Studies (Sokendai), Hayama, Kanagawa, Japan
Laboratory for Biodiversity (Global COE Project), Graduate School of Science, Kyoto University, Inuyama, Aichi, Japan
Department of Bioinformatics, Medical Research Institute, Tokyo Medical and Dental University, Bunkyo-ku, Tokyo, Japan
E-mail: yasuhirogo{at}gmail.com.
| Abstract |
|---|
Animals recognize their external world through the detection of tens of thousands of chemical odorants. Olfactory receptor (OR) genes encode proteins for detecting odorant molecules and form the largest multigene family in mammals. It is known that humans have fewer OR genes and a higher fraction of OR pseudogenes than mice or dogs. To investigate whether these features are human specific or common to all higher primates, we identified nearly complete sets of OR genes from the chimpanzee and macaque genomes and compared them with the human OR genes. In contrast to previous studies, here we show that the number of OR genes (approximately 810) and the fraction of pseudogenes (51%) in chimpanzees are very similar to those in humans, though macaques have considerably fewer OR genes. The pseudogenization rates and the numbers of genes affected by positive selection are also similar between humans and chimpanzees. Moreover, the most recent common ancestor between humans and chimpanzees had a larger number of functional OR genes (>500) and a lower fraction of pseudogenes (41%) than its descendants, suggesting that the OR gene repertoires are in a phase of deterioration in both lineages. Interestingly, despite the close evolutionary relationship between the 2 species, approximately 25% of their functional gene repertoires are species specific due to massive gene losses. These findings suggest that the tempo of evolution of OR genes is similar between humans and chimpanzees, but the OR gene repertoires are quite different between them. This difference might be responsible for the species-specific ability of odor perception.
Key Words: olfactory receptor gene, human evolution, chimpanzee, macaque, multigene family, gene gain and loss
| Introduction |
|---|
Olfaction, the sense of smell, is essential for the survival of most animals. Versatile odorant molecules in the environment are detected by olfactory receptors (ORs) that are mainly expressed in sensory neurons of the main olfactory epithelium. It is known that OR genes form a very large multigene family, and there are approximately 1,000 OR genes in mammalian genomes (Buck and Axel 1991).
Due to recent progress in whole-genome sequencing, the entire repertoires of OR genes from various species have been identified, including humans (Glusman et al. 2001; Zozulya et al. 2001; Niimura and Nei 2003), mice (Young et al. 2002; Zhang and Firestein 2002; Godfrey et al. 2004; Zhang et al. 2004; Niimura and Nei 2005a), dogs (Quignon et al. 2003, 2005; Olender et al. 2004; Niimura and Nei 2007), and other vertebrates (Niimura and Nei 2005b, 2007). These studies have revealed wide variations in the number of OR genes among different species. The number of functional OR genes ranges from <100 in some fishes and <400 in humans to >1,200 in rats, and the fraction of OR pseudogenes ranges from <20% in opossums and approximately 25% in mice or dogs to >50% in humans or platypuses (Niimura and Nei 2005b, 2007).
It is generally thought that higher primates are vision-oriented animals and that their olfactory abilities have therefore retrogressed. Gilad et al. (2004) sequenced approximately 100 OR genes that were randomly chosen from each of 19 primate species, including apes, Old World Monkeys (OWMs), New World Monkeys (NWMs), and prosimians (see Gilad et al. [2007] for correction). They found that the fractions of OR pseudogenes in apes and OWMs are significantly higher than those in most NWMs and prosimians. Interestingly, there was one exceptional NWM species, the howler monkey, whose pseudogene fraction was as high as those of apes and OWMs. From these observations, they hypothesized that the loss of OR genes in primates coincides with the acquisition of full trichromatic vision. In the same paper, they also argued that the fraction of OR pseudogenes in humans is much higher still than that of apes or OWMs. In another paper (Gilad et al. 2003), the same group sequenced 50 orthologous OR genes from chimpanzees, gorillas, and orangutans and compared them with human OR genes, reporting that the accumulation of mutations disrupting the coding regions is much faster in humans than in chimpanzees and other apes. In Gilad et al. (2005), they identified 899 OR gene candidates from the first version of the chimpanzee genome sequences (4x coverage). By analyzing these data, they again concluded that humans have a significantly higher fraction of OR pseudogenes than chimpanzees. Therefore, these studies have suggested that the OR gene repertoire has been declining particularly in the human lineage after the separation from our closest relative, chimpanzees.
In this paper, we ask whether or not the extent of the deterioration of OR genes is truly higher in humans than in chimpanzees. This is an important question for understanding human-specific characteristics and the evolution of the sensory system in primates. For this purpose, we used an improved version of the chimpanzee genome sequences with 6x coverage and macaque OR genes as outgroups for human–chimpanzee comparisons. As a result, in contrast to the previous assertion, we did not find any evidence of significant differences between humans and chimpanzees in the number of functional OR genes, the fraction of OR pseudogenes, or the extent of positive selection. We did, however, find that the repertoires of OR genes differ substantially between the 2 species.
| Materials and Methods |
|---|
OR Gene Identification
The assembled genome sequences of chimpanzees (panTro2, released in March 2006) and macaques (rheMac2, released in January 2006) were retrieved from the FTP server of the University of California Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu). The method to identify functional OR genes and pseudogenes was described in detail in a previous paper (Niimura and Nei 2007).
H* pseudogenes (see Results) were identified by a method similar to that in Niimura and Nei (2005a). We conducted BlastP searches (Altschul et al. 1997) against all functional OR genes in humans by using translated amino acid sequences of OR pseudogenes in humans, chimpanzees, and macaques as queries. We defined a pseudogene whose best hit was HsOR19.2.14 as an H* pseudogene. H* pseudogenes identified by this method are almost identical to 7E pseudogenes in Gilad et al. (2005) and those in the Human Olfactory Receptor Data Exploratorium (HORDE), Build 42 (http://bioportal.weizmann.ac.il/HORDE/).
The chimpanzee OR genes identified in this study were compared with the previous data. The sequences of 100 and 899 chimpanzee OR genes in Gilad et al. (2004, 2005, 2007) were obtained from the Gilad laboratory Web site (http://giladlab.uchicago.edu/) and HORDE, Build 42, respectively. Each of the 100 sequences in Gilad et al. (2004) was compared with the translated amino acid sequences in Gilad et al. (2005) and those in this study by BlastX searches (Altschul et al. 1997), and the gene showing the highest sequence identity was identified. When the nucleotide sequence identity between 2 sequences was >98%, they were regarded as the same gene.
Identification of Orthologous Gene Sets
Orthologous OR gene sets among humans, chimpanzees, and macaques were determined in the following way.
- (i) To identify orthologous relationships, we first constructed a phylogenetic tree using OR genes that are ≥600 bp long. There are 2,038 such sequences in total (749, 730, and 559 sequences for humans, chimpanzees, and macaques, respectively). A multiple sequence alignment was made from these sequences together with an outgroup sequence (a zebrafish OR gene; GenBank accession number AF283560) by ClustalW (Thompson et al. 1994) using a GAPOPEN parameter of 20 and GAPEXT of 8. From the alignment, we reconstructed a phylogenetic tree by MEGA3 (Kumar et al. 2004) using the Neighbor-Joining method (Saitou and Nei 1987). The nucleotide p distance and the pairwise deletion option were used. From the tree obtained, we identified monophyletic clades containing genes from at least 2 species among humans, chimpanzees, and macaques as candidate orthologous gene sets. The genes contained in each of the clades were aligned by ClustalW. When the nucleotide sequence identity between human and chimpanzee genes in a clade was ≥96%, the genes were regarded as orthologous. Similarly, when the nucleotide identity between human and macaque genes or that between chimpanzee and macaque genes in a clade was ≥90%, they were regarded as orthologous. Almost all the nucleotide identities for human–chimpanzee and human–macaque comparisons were ≥96% and ≥90%, respectively, after excluding outliers; for this reason, these cutoff values were chosen. Most (>90%) of the phylogenetic clades for orthologous gene sets identified here were supported with a 100% bootstrap value.
- (ii) The sequences that were <600 bp long (53, 83, and 47 sequences for humans, chimpanzees, and macaques, respectively) were treated in the following way. By using the 53 human sequences as queries, we conducted BlastN searches (Altschul et al. 1997) against all chimpanzee and macaque OR genes. A given query sequence and the best-hit sequence were aligned by ClustalW, and the nucleotide sequence identity between them was calculated. When the nucleotide identity was ≥96%, they were assumed to be orthologous to each other. Similarly, when the nucleotide identity between the best-hit sequence in macaques and the human query sequence was ≥90%, they were assumed to be orthologous. Moreover, by using the 83 chimpanzee sequences and the 47 macaque sequences as queries, BlastN searches were carried out against all human and macaque OR genes and against all human and chimpanzee OR genes, respectively. Orthologous genes were identified by using the same cutoff values of nucleotide sequence identity as above (96% for human–chimpanzee orthologs and 90% for human–macaque and chimpanzee–macaque orthologs).
- (iii) After processes (i) and (ii), there were several sequences (41, 71, and 91 sequences for humans, chimpanzees, and macaques, respectively) for which no orthologous genes were identified. These sequences may contain paralogous genes that were generated by lineage-specific gene duplications. To identify such sequences, we conducted intraspecies BlastN searches. By using the 41 human OR genes, self-against-self BlastN searches were performed. A given query sequence and the best-hit sequence (excluding the query sequence itself) were aligned by ClustalW, and the nucleotide sequence identity between them was calculated as above. When the nucleotide identity between the query and the best-hit sequences was ≥96%, they were assumed to have been generated by gene duplication in the human lineage after the divergence from chimpanzees. When the nucleotide identity was between 90% and 96%, it was assumed that gene duplication occurred after the divergence from macaques and before the divergence from chimpanzees. Furthermore, we calculated the nucleotide sequence identity between the paralogous gene sets identified above and merged them as genes generated by lineage-specific multiplication using the same criteria of nucleotide identity. We also conducted self-against-self BlastN searches for the chimpanzee sequences and identified paralogous genes in the same manner as human genes. As for macaque sequences, when the nucleotide identity between the query and the best-hit sequences was ≥90%, they were assumed to have been generated by macaque-specific gene duplication.
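The identity cutoffs used in steps (i) to (iii) can be illustrated with a minimal Python sketch. It is not the original pipeline: the function names are hypothetical, the handling of gap columns in `percent_identity` is an assumption (the text does not state how gaps were counted), and the alignment itself is taken as given rather than produced by ClustalW.

```python
# Illustrative sketch only; names and gap handling are assumptions.

def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gap columns are ignored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Cutoffs stated in the text: 96% for human-chimpanzee pairs,
# 90% for any pair involving macaque.
ORTHOLOG_CUTOFF = {
    frozenset({"human", "chimpanzee"}): 96.0,
    frozenset({"human", "macaque"}): 90.0,
    frozenset({"chimpanzee", "macaque"}): 90.0,
}


def is_orthologous(species_a: str, species_b: str, identity: float) -> bool:
    """Apply the species-pair cutoff to a percent identity."""
    return identity >= ORTHOLOG_CUTOFF[frozenset({species_a, species_b})]


def classify_hominid_paralog(identity: float) -> str:
    """Time a duplication in the human (or chimpanzee) lineage, as in step (iii)."""
    if identity >= 96.0:
        return "after human-chimpanzee divergence"
    if identity >= 90.0:
        return "after macaque divergence, before human-chimpanzee divergence"
    return "unassigned"
```

Keying the cutoff table on an unordered species pair keeps the rule symmetric, so the query and best-hit species can be passed in either order.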
All orthologous relationships of OR genes are listed in supplementary table 1 (Supplementary Material online). The values of the ratio of nonsynonymous to synonymous substitutions (ω = dN/dS) for orthologous OR genes shown in supplementary figure 1 (Supplementary Material online) were calculated by the modified Nei–Gojobori method (Zhang et al. 1998) with the transition/transversion ratio of 2.0.
Analysis of Gene Gains and Gene Losses
The number of OR genes in each ancestral node and those of gene gains and gene losses in each lineage were estimated by using the orthologous relationships determined above (supplementary table 1, Supplementary Material online).
To estimate the numbers of functional OR genes and pseudogenes in the most recent common ancestor (MRCA) of humans and chimpanzees, we inferred whether each of the OR genes in the MRCA was a functional gene or a pseudogene in the following way. When at least one of the human and chimpanzee orthologous genes was functional, the MRCA gene was assumed to be functional. Even when both the human and chimpanzee orthologous genes were pseudogenes, if they did not share any disruptive mutations, the gene in the MRCA was assumed to be functional. On the other hand, when both the human and chimpanzee orthologous genes were pseudogenes and they shared at least one disruptive mutation, we assumed that the MRCA gene was a pseudogene. When a pseudogene in one species did not have an orthologous counterpart in the other species, we regarded the MRCA gene as a pseudogene for simplicity. Note that this assumption may cause the fraction of pseudogenes in the MRCA to be overestimated. However, this does not affect the conclusion that the MRCA has a lower fraction of pseudogenes than the extant humans or chimpanzees (see Results).
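The parsimony rule described above can be written as a small decision function. The following Python sketch is only an illustration of that rule; the function name, the string labels, and the use of `None` for a missing ortholog are hypothetical conventions, and truncated genes are not modeled.

```python
from typing import Optional

FUNCTIONAL = "functional"
PSEUDOGENE = "pseudogene"


def infer_mrca_state(
    human: Optional[str],
    chimp: Optional[str],
    shared_disruptive_mutation: bool = False,
) -> str:
    """Infer the state of an OR gene in the human-chimpanzee MRCA.

    `human` and `chimp` are FUNCTIONAL, PSEUDOGENE, or None (no ortholog found).
    """
    states = (human, chimp)
    for state in states:
        if state not in (FUNCTIONAL, PSEUDOGENE, None):
            raise ValueError(f"unknown state: {state!r}")
    if human is None and chimp is None:
        raise ValueError("at least one descendant gene is required")
    # At least one functional descendant implies a functional ancestor.
    if FUNCTIONAL in states:
        return FUNCTIONAL
    # A pseudogene without a counterpart is treated as ancestral, for simplicity.
    if None in states:
        return PSEUDOGENE
    # Both are pseudogenes: only a shared disruptive mutation implies
    # that the ancestral gene was already a pseudogene.
    return PSEUDOGENE if shared_disruptive_mutation else FUNCTIONAL
```

Because the unpaired-pseudogene case always returns a pseudogene, the sketch inherits the conservative bias noted in the text: the ancestral pseudogene fraction can only be overestimated by this rule.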
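The probability of the occurrence of i independently pseudogenized orthologous pseudogenes (IPOPs; see Results) under the assumption that pseudogenization events occurred randomly in the human and chimpanzee lineages is given by the hypergeometric distribution

$$P(i) = \frac{\binom{n_H}{i}\binom{N - n_H}{n_C - i}}{\binom{N}{n_C}},$$

where $N$ is the number of functional OR genes in the MRCA and $n_H$ and $n_C$ are the numbers of those genes that were pseudogenized in the human and chimpanzee lineages, respectively.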
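As a rough check of the hypergeometric calculation, the tail probability can be evaluated directly with exact binomial coefficients. The sketch below uses the counts reported in Results (510 functional MRCA genes, 113 and 109 pseudogenized in the human and chimpanzee lineages, 43 IPOPs); the function names are hypothetical.

```python
from math import comb


def ipop_pmf(i: int, n_mrca: int, n_human: int, n_chimp: int) -> float:
    """Hypergeometric probability of exactly i shared pseudogenization targets."""
    return (
        comb(n_human, i)
        * comb(n_mrca - n_human, n_chimp - i)
        / comb(n_mrca, n_chimp)
    )


def ipop_tail(observed: int, n_mrca: int, n_human: int, n_chimp: int) -> float:
    """Probability of `observed` or more IPOPs under random pseudogenization."""
    upper = min(n_human, n_chimp)
    return sum(ipop_pmf(i, n_mrca, n_human, n_chimp) for i in range(observed, upper + 1))


# Counts taken from the Results section.
N_MRCA, N_HUMAN, N_CHIMP, N_IPOP = 510, 113, 109, 43

expected_ipops = N_HUMAN * N_CHIMP / N_MRCA
tail_probability = ipop_tail(N_IPOP, N_MRCA, N_HUMAN, N_CHIMP)
print(f"expected IPOPs: {expected_ipops:.1f}")
print(f"P(i >= {N_IPOP}): {tail_probability:.2e}")
```

The expectation is simply the product of the two lineage counts divided by the number of ancestral functional genes, which is where the value of about 24 quoted in Results comes from.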
Selection Test
To identify the genes that are subject to lineage-specific positive or negative selection, we conducted maximum likelihood analyses using PAML (Yang and Nielsen 2000; Yang et al. 2000). Orthologous OR gene sets among the 3 species were chosen in the following way. In our data set, there are 266 one-to-one orthologous pairs of functional OR genes between humans and chimpanzees (see table 2). Among them, 219 pairs have orthologous genes in macaques, whether they are functional or not. There were several cases in which 2 or more orthologous genes were present in macaques because of macaque-specific gene duplication. In these cases, when functional genes were present in the macaque orthologs, all functional genes were used; otherwise, all pseudogenes were used.
For each of the orthologous gene sets identified above, we applied 2 kinds of statistical tests. In the branch test, 5 different models were examined. In Model I, the same ω (dN/dS) value is assumed along all 3 branches. Model II allows a different ω value in the human lineage from the other lineages. Similarly, Models III and IV allow a different ω value in the chimpanzee and macaque lineages, respectively. Model V allows different ω values in the 3 lineages. The best of the 5 models was determined in the following way. Because the numbers of parameters in Models II, III, and IV are the same, the model showing the highest likelihood score among the 3 models was chosen first and hereafter is called Model X. We then conducted a likelihood ratio test to compare Model X (the alternative hypothesis) with Model I (the null hypothesis) by using the χ² distribution with 1 degree of freedom (df). When Model I was not rejected, it was assumed to be the best model. Otherwise, a similar statistical test was performed to compare Model V (the alternative hypothesis) with Model X (the null hypothesis) by using the χ² distribution with 1 df. When Model X was rejected, Model V was regarded as the best model. Otherwise, Model X was chosen as the best model.
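The two-stage model choice in the branch test can be sketched as follows. This is an illustration, not the original analysis: the log-likelihoods would come from PAML runs and are supplied here as a plain dictionary, the function names are hypothetical, and the significance level of 0.05 is an assumption for this test.

```python
from math import erfc, sqrt


def chi2_sf_1df(statistic: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    if statistic <= 0.0:
        return 1.0
    return erfc(sqrt(statistic / 2.0))


def select_branch_model(lnl: dict, alpha: float = 0.05) -> str:
    """Pick the best of Models I-V from their maximized log-likelihoods.

    `lnl` maps "I", "II", "III", "IV", "V" to log-likelihood values.
    """
    # Models II-IV have the same number of parameters, so compare them directly.
    model_x = max(("II", "III", "IV"), key=lambda name: lnl[name])
    # Stage 1: Model X (alternative) against Model I (null), 1 df.
    if chi2_sf_1df(2.0 * (lnl[model_x] - lnl["I"])) >= alpha:
        return "I"
    # Stage 2: Model V (alternative) against Model X (null), 1 df.
    if chi2_sf_1df(2.0 * (lnl["V"] - lnl[model_x])) < alpha:
        return "V"
    return model_x
```

For 1 df the χ² tail has a closed form in terms of the complementary error function, so the sketch needs no statistics library.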
In the branch-site test, 3 different models were considered (Zhang et al. 2005). In Model I, all codon sites along all branches were assumed to be under negative (ω0 < 1) or neutral (ω1 = 1) selection. The fractions of codon sites under negative and neutral selection are p0 and p1, respectively. In Model II, some codons (pH) in the human lineage were allowed to evolve under positive selection (ωH > 1). Similarly, in Model III, some codons (pC) in the chimpanzee lineage were allowed to evolve under positive selection (ωC > 1). The model that showed a higher likelihood score between Models II and III was chosen and that model (the alternative hypothesis) was compared with Model I (the null hypothesis) by using the χ² distribution with 2 df.
| Results |
|---|
OR Genes in Humans, Chimpanzees, and Macaques
Table 1 shows the numbers of OR genes and their chromosomal locations in humans, chimpanzees, and macaques. The numbers of functional OR genes and OR pseudogenes in humans and chimpanzees are similar to each other, although macaques have approximately 25% fewer OR genes than humans and chimpanzees (Niimura and Nei 2007
50%). To compare the fractions of pseudogenes among different species, some caution is necessary because a functional gene embedded in a low-quality genome sequence may appear to be a pseudogene. We therefore separately counted the truncated genes that were at the contig ends and that did not contain any nonsense or frameshift mutations. The truncated genes could become functional upon completion of the genome sequence. However, the estimated fractions of pseudogenes did not substantially change even when all truncated genes were assumed to be functional or were assumed to be pseudogenes (table 1).
Gilad et al. (2005)
To examine the quality of the OR genes identified here in more detail, we compared our data and the Gilad2005 data with the 100 genes in Gilad et al. (2004, 2007) that were experimentally confirmed. We call this set of 100 genes the Gilad2004 data. We found that 19 out of the 100 genes in the Gilad2004 data were not present in the Gilad2005 data, whereas only 4 genes in the Gilad2004 data were not identified in this study (supplementary table 2, Supplementary Material online). Moreover, 8 genes were regarded as pseudogenes in the Gilad2005 data but were annotated as functional in both the Gilad2004 data and in this study. These observations suggest that the quality of chimpanzee OR genes in this study is greatly improved compared with that in the Gilad2005 data.
Humans have a group of pseudogenes named 7E (Newman and Trask 2003) or H* pseudogenes (Niimura and Nei 2005a). Most of these pseudogenes appear to have been generated by gene duplications after they were pseudogenized (Niimura and Nei 2005a). Thus, they apparently did not contribute to olfactory ability in the ancestral species. For this reason, we examined the fractions of pseudogenes after excluding H* pseudogenes. We identified 84, 85, and 8 H* pseudogenes from humans, chimpanzees, and macaques, respectively. The corrected fractions of pseudogenes without H* pseudogenes are very similar among the 3 species (table 1).
Menashe et al. (2006) developed an algorithm to detect candidates of intact OR pseudogenes based on deviations from a functionally crucial consensus (classifier for olfactory receptor pseudogenes [CORP]). We applied the CORP algorithm to the 387 and 380 functional genes in humans and chimpanzees, respectively. As a result, 162 human genes and 156 chimpanzee genes were identified as candidate intact pseudogenes. The fractions of putative intact pseudogenes are also similar between the 2 species (41.9% and 41.1% for humans and chimpanzees, respectively).
Orthologous Relationships of OR Genes among 3 Primates
By conducting phylogenetic analyses and homology searches, we identified 689 human–chimpanzee and 470 human–macaque orthologous gene sets (table 2). Most of the orthologous relationships are one-to-one (647 out of 689 human–chimpanzee orthologous gene sets and 404 out of 470 human–macaque orthologous gene sets), suggesting that gene duplication was not frequent in the evolution of OR genes in the 3 species. We also identified 443 orthologous gene sets among humans, chimpanzees, and macaques (supplementary table 1, Supplementary Material online). Among them, 371 orthologous relationships are one-to-one-to-one trios. The organization of genomic clusters of OR genes is generally well conserved among the 3 species, though there are several cases in which genomic rearrangements have occurred in the regions of OR gene clusters (supplementary fig. 2, Supplementary Material online). Orthologous relationships were identified for almost all OR gene clusters between humans and chimpanzees (supplementary table 1, Supplementary Material online).
We found 266 one-to-one orthologous pairs of functional OR genes between humans and chimpanzees. For all cases, the orthologous genes are located on syntenic chromosomes between these species. The mean of the nucleotide sequence identities between orthologous genes among the 266 pairs is 98.77% (table 2). This value is much lower than the mean (99.41%) of the nucleotide identities among 13,454 orthologous gene pairs between the 2 species and is exactly the same as that in the whole-genome sequences containing noncoding regions (The Chimpanzee Sequencing and Analysis Consortium 2005). We identified 157 one-to-one functional orthologous gene pairs between humans and macaques. The mean nucleotide identity among them is 95.82%, which is considerably lower than the mean in the entire coding sequences (97.5%) but is higher than that in the whole-genome sequences (93.54%) (The Rhesus Macaque Genome Sequencing and Analysis Consortium 2007). There are 140 one-to-one-to-one functional orthologous gene trios.
We calculated the ratio of nonsynonymous to synonymous substitutions (ω = dN/dS) for each of the 266 human–chimpanzee orthologous gene pairs and the 157 human–macaque orthologous gene pairs. The mean of the ω values for the human–chimpanzee pairs is 0.937 (median 0.713), which is much larger than that for the human–macaque pairs (0.442, median 0.407; table 2). The distributions of ω values are significantly different between human–chimpanzee and human–macaque comparisons (P < 2.2 × 10^-16, Mann–Whitney U test; supplementary fig. 1, Supplementary Material online). Moreover, the fraction of gene pairs with ω > 1 is much higher for the human–chimpanzee pairs (90/266) than for the human–macaque pairs (2/157). It is known that the mutation rate of CpG dinucleotides is >10 times faster than that of the other sites (Sved and Bird 1990). For this reason, we examined whether or not the presence of CpG dinucleotides distorts the distribution of ω. However, we found no significant correlation between the ω values and the fractions of CpG dinucleotides in OR genes (data not shown).
As mentioned above, it was previously suggested that the pseudogenization rate of OR genes is higher in the human lineage than in the chimpanzee lineage (Gilad et al. 2003, 2004, 2005). As shown in table 2, among the 647 one-to-one orthologous pairs between humans and chimpanzees, 66 gene pairs contain a human functional gene and a chimpanzee pseudogene (or a truncated gene) and 70 gene pairs contain a human pseudogene and a chimpanzee functional gene. To exclude the possibility of sequencing errors in the chimpanzee genome, we examined the Phred quality score at each disruptive mutation in chimpanzee pseudogenes. However, we did not find any sites showing a Phred score <40 (99.99% base-call accuracy). It therefore appears that pseudogenization rates are similar between the human and chimpanzee lineages. Even when all truncated genes in chimpanzees were assumed to be functional, the null hypothesis of an equal rate of pseudogenization between the 2 lineages could not be rejected (P = 0.21, χ² test). For 28 and 39 genes out of the 66 human-specific and 70 chimpanzee-specific pseudogenes, respectively, their orthologs were regarded as intact pseudogenes by the CORP algorithm (Menashe et al. 2006; see above).
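To illustrate how an equal-rate hypothesis can be examined, the sketch below applies a χ² goodness-of-fit test against a 1:1 expectation to the 66 and 70 lineage-specific counts given above. The form of the test is an assumption, the function name is hypothetical, and the resulting P value is not the P = 0.21 quoted in the text, which refers to counts obtained after reclassifying truncated chimpanzee genes.

```python
from math import erfc, sqrt


def equal_rate_chi2(count_a: int, count_b: int) -> tuple:
    """Chi-square goodness-of-fit test (1 df) against a 1:1 expectation.

    Returns the test statistic and its upper-tail P value.
    """
    expected = (count_a + count_b) / 2.0
    statistic = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    p_value = erfc(sqrt(statistic / 2.0)) if statistic > 0.0 else 1.0
    return statistic, p_value


# 66 chimpanzee-lineage and 70 human-lineage pseudogenization events (table 2).
statistic, p_value = equal_rate_chi2(66, 70)
print(f"chi-square = {statistic:.3f}, P = {p_value:.2f}")
```

With counts this close, the statistic is far below the 1-df critical value of 3.84, which is consistent with the conclusion that the two lineages do not differ detectably in pseudogenization rate.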
Wang et al. (2006) reported 80 human-specific nonprocessed pseudogenes, including 36 OR genes, by conducting whole-genome analyses. Almost all (33) of these 36 genes are contained in the 70 human-specific OR pseudogenes identified above. As for the remaining 3 genes, their chimpanzee orthologs were annotated as pseudogenes in our data set.
Though the numbers of functional genes are similar between humans and chimpanzees, we found that the repertoires of OR genes differ considerably between the 2 species. Interestingly, 98 of 387 human functional genes are not orthologous to chimpanzee functional genes, and 94 of 380 functional genes in chimpanzees are nonorthologous to those in humans (fig. 1A). In other words, approximately 25% of the functional OR gene repertoires in humans and chimpanzees are species specific. This observation suggests that the MRCA between the 2 species had a larger number of functional OR genes than humans or chimpanzees, and many genes were lost in a lineage-specific manner (see below). Among the 98 human-specific and 94 chimpanzee-specific genes, 40 and 47 genes, respectively, were classified as intact pseudogenes by the CORP algorithm (Menashe et al. 2006; see above). We also investigated the phylogenetic distribution of these species-specific functional genes. The result showed that some species-specific genes are unevenly distributed in the phylogenetic tree (supplementary fig. 3, Supplementary Material online). For example, clade E in supplementary figure 3 (Supplementary Material online) contains 7 human-specific genes but does not contain any chimpanzee-specific genes.
Evolutionary Trend of Gene Gains and Gene Losses in Primates
To investigate the evolutionary trends of gains and losses of OR genes in primates, we counted the numbers of gene gains (gene duplication) and gene losses (elimination from the genome) that occurred in each lineage by using orthologous relationships (supplementary table 1, Supplementary Material online). From these numbers, we estimated the numbers of OR genes in the ancestral species in the evolution of 3 primate species (fig. 1B). Note that the numbers in this figure include both functional genes and pseudogenes. The result showed that there are more genes in the ancestral species than in each of the 3 current species, suggesting that gene losses were the major trend in the evolution of OR genes in these species. In the macaque lineage, it was estimated that only 32 gene gains have occurred after the divergence from humans/chimpanzees, whereas 258 genes were deleted from the genome in the same period. The human and chimpanzee lineages also lost many genes; the estimated numbers of gene losses are 2 or 3 times larger than those of gene gains (fig. 1B). The number of gene losses in the human (81) and chimpanzee (88) lineages during approximately 6 Myr is nearly the same as that (82) in the human/chimpanzee lineage before the separation of the 2 species during approximately 24 Myr, suggesting that the rate of gene loss has recently accelerated in the human/chimpanzee lineage. To examine the evolutionary change in the fraction of pseudogenes, we estimated the numbers of functional OR genes and pseudogenes in the MRCA between humans and chimpanzees. For this purpose, we inferred whether an OR gene in the MRCA was functional or nonfunctional by using the parsimony principle (see Materials and Methods). As figure 1C suggests, the MRCA had a much larger number (510) of functional genes and a significantly lower fraction (40.5%) of pseudogenes than do humans and chimpanzees. We also estimated that among the 510 functional genes in the MRCA, 113 became pseudogenes and 19 were deleted from the genome in the human lineage, whereas 109 were pseudogenized and 23 were eliminated in the chimpanzee lineage. These observations again support the notion of rapid deterioration of OR genes in both human and chimpanzee lineages and similar rates of gene loss between the 2 lineages.
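The acceleration in gene loss mentioned above can be made concrete by converting the loss counts into rates per million years. The short sketch below does this with the figures from the text; the branch durations of 6 and 24 Myr are the approximate values quoted there, and the variable names are illustrative.

```python
# (number of gene losses, approximate branch duration in Myr), from the text.
losses = {
    "human": (81, 6.0),
    "chimpanzee": (88, 6.0),
    "human/chimpanzee ancestor": (82, 24.0),
}

# Losses per million years on each branch.
loss_rates = {lineage: count / myr for lineage, (count, myr) in losses.items()}

for lineage, rate in loss_rates.items():
    print(f"{lineage}: {rate:.1f} losses per Myr")

# Fold increase of the human rate over the ancestral rate.
acceleration = loss_rates["human"] / loss_rates["human/chimpanzee ancestor"]
print(f"fold increase (human vs. ancestor): {acceleration:.1f}")
```

Because the branch durations are approximate, the fold increase should be read as an order-of-magnitude indication rather than a precise estimate.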
As mentioned above, similar numbers of pseudogenization events occurred in the human and chimpanzee lineages. We then asked whether or not the sets of genes that became pseudogenes are similar between the 2 species. For this purpose, we counted the functional genes in the MRCA that were independently pseudogenized in both the human and chimpanzee lineages (see Materials and Methods). We refer to the pseudogenes derived from such MRCA genes as IPOPs. We found that, among the 113 and 109 genes that were pseudogenized in the human and chimpanzee lineages, respectively, 43 genes are IPOPs. The expected number of IPOPs under the assumption that pseudogenization events occurred randomly in the human and chimpanzee lineages is 24, and the probability of the occurrence of 43 or more IPOPs, P(i ≥ 43), is <10^-5 (see Materials and Methods). Therefore, the same functional gene in the MRCA tends to have become a pseudogene independently in the 2 lineages, although we cannot exclude the possibility that some of these genes were already dysfunctional in the MRCA but had not yet acquired any disruptive mutations. This observation is not surprising because some common environmental factors in humans and chimpanzees should have affected the dispensability of genes. However, given that >60% of the pseudogenization events were lineage specific, the effects of environmental factors on OR gene pseudogenization in these lineages appear not to be large.
One might object that the fraction of OR pseudogenes in the MRCA calculated above is underestimated because some of the OR pseudogenes in the MRCA might have been eliminated from both the human and chimpanzee genomes. To consider this possibility, we investigated the genes that were inferred to have existed in the MRCA among humans, chimpanzees, and macaques but that have no surviving descendant genes in either humans or chimpanzees. There are 82 such genes (supplementary table 1, Supplementary Material online). These 82 genes were eliminated from the genome in the human/chimpanzee lineage at some time between 30 MYA and the present. As an extreme case, it is possible to assume that all the 82 genes were present as pseudogenes in the MRCA genome. Even in that case, the fraction of pseudogenes in the MRCA is calculated to be 45.7% ([348 + 82]/[858 + 82]), which is still considerably smaller than the fractions in humans and chimpanzees. In reality, some of the 82 genes should have remained functional in the MRCA, and others should already have been eliminated from the MRCA genome. Therefore, our conclusion of the lower fraction of OR pseudogenes in the MRCA than in humans and chimpanzees does not change.
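The bounding argument above is a short piece of arithmetic, sketched here in Python with the counts given in the text (348 pseudogenes among 858 inferred MRCA genes, and 82 genes lost in both descendants). The function and variable names are hypothetical.

```python
def pseudogene_fraction(pseudogenes: int, total_genes: int) -> float:
    """Fraction of OR genes that are pseudogenes."""
    return pseudogenes / total_genes


MRCA_PSEUDOGENES = 348   # inferred pseudogenes in the MRCA
MRCA_TOTAL = 858         # inferred OR genes in the MRCA (functional + pseudogenes)
LOST_IN_BOTH = 82        # MRCA genes with no surviving human or chimpanzee descendant

# Estimate from surviving descendants only.
baseline = pseudogene_fraction(MRCA_PSEUDOGENES, MRCA_TOTAL)

# Extreme case: every gene lost in both lineages was already a pseudogene.
upper_bound = pseudogene_fraction(
    MRCA_PSEUDOGENES + LOST_IN_BOTH, MRCA_TOTAL + LOST_IN_BOTH
)

print(f"baseline fraction:    {100 * baseline:.1f}%")
print(f"upper-bound fraction: {100 * upper_bound:.1f}%")
```

Adding the same count to numerator and denominator can only raise a fraction below 1, so the extreme case is a genuine upper bound on the ancestral pseudogene fraction under this accounting.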
Selective Forces on OR Genes
Several genome-scale studies have suggested that OR genes may have experienced positive selection in either the human or chimpanzee lineage or both lineages (Clark et al. 2003; Bustamante et al. 2005; Nielsen et al. 2005; Arbiza et al. 2006; Voight et al. 2006). However, some of these studies may be inaccurate because they used distantly related species (e.g., mice or rats) as outgroups (Clark et al. 2003; Arbiza et al. 2006) or did not use any outgroup species (Nielsen et al. 2005). Here we used macaque OR genes as the outgroup and investigated the selective forces to which human and chimpanzee OR genes were subject. We conducted 2 kinds of statistical tests, the branch test (fig. 2A) and the branch-site test (fig. 2B), on the basis of the maximum likelihood method. One-to-one orthologous gene pairs of functional genes between humans and chimpanzees and their orthologous genes in macaques were used for the analyses. We used 219 orthologous gene sets among the 3 species for the following analysis.
In the branch test, we examined 5 different models (Models I–V in fig. 2A). Using PAML (Yang and Nielsen 2000
values for all these 29 cases are more than 1 (supplementary table 3, Supplementary Material online), possibly suggesting that they were under positive selection. However, it is also possible that these genes are in the process of pseudogenization. We found no significant differences in the number of genes that were under accelerated (13 vs. 16) or decelerated (1 vs. 3) evolution between the human and chimpanzee lineages (fig. 2C). We also corrected the effect of multiple comparisons by controlling the false discovery rate (FDR) (see details in Benjamini and Hochberg 1995
Because positive selection affects only a subset of codons within a coding region, we carried out the branch-site test, in which the ω value is allowed to vary among codon sites. In this test, we examined 3 models (Models I–III in fig. 2B). In Model I, it is assumed that all sites along all branches are under either negative or neutral selection, whereas in Models II and III, some codons in the human and chimpanzee lineages, respectively, are allowed to evolve under positive selection. The result showed that the numbers of orthologous gene pairs supporting Models II and III are 12 and 7, respectively (P < 0.05 in the likelihood ratio test; fig. 2D and supplementary table 4 [Supplementary Material online]). With an FDR level of 0.05, we found only one case supporting Model II and one case supporting Model III. Therefore, there is little difference in the extent of positive selection between the 2 species. Moreover, these results suggest that the contribution of positive selection to the evolution of OR genes is not large in the human and chimpanzee lineages (Gimelbrant et al. 2004).
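The drop from a dozen nominally significant cases to a single case after FDR control follows from how the two steps work. The Python sketch below is a minimal illustration and not the PAML implementation: it assumes a chi-square null distribution with 1 degree of freedom for the likelihood ratio statistic (the appropriate null depends on the models compared), applies the Benjamini and Hochberg (1995) step-up procedure, and uses hypothetical log-likelihood values and function names.

```python
import math

def lrt_pvalue_df1(lnl_null, lnl_alt):
    """P value of a likelihood ratio test, ASSUMING a chi-square null with 1 df."""
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))

def benjamini_hochberg(pvalues, q=0.05):
    """Return booleans marking the tests rejected at FDR level q (step-up rule)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose P value is below rank * q / m.
        if pvalues[idx] <= rank * q / m:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected

# Hypothetical (null, alternative) log-likelihoods for four gene sets.
toy_lnl = [(-1000.0, -990.0), (-1200.0, -1197.5), (-900.0, -899.0), (-1100.0, -1100.0)]
pvals = [lrt_pvalue_df1(null, alt) for null, alt in toy_lnl]

nominal = [p < 0.05 for p in pvals]
after_fdr = benjamini_hochberg(pvals, q=0.05)
print(nominal)     # [True, True, False, False]
print(after_fdr)   # [True, False, False, False]
```

In this toy input the second gene set passes the nominal 0.05 threshold (P is about 0.025) but fails the rank-adjusted threshold of 2 × 0.05 / 4 = 0.025, which is the same kind of attrition reported above.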
| Discussion |
|---|
Comparison with the Previous Studies
In this study, we showed that the fractions of OR pseudogenes are similar between humans and chimpanzees. This result is in sharp contrast to the previous studies (Gilad et al. 2003
54%. They then compared the Gilad2005 data with the 30 intact OR genes that were experimentally determined in Gilad et al. (2003)
41%. We should note that this estimate is subject to considerable sampling error because it is based on only 30 genes.
The second reason for the discrepancy is the difference in the fraction of pseudogenes in humans. Gilad et al. (2005) mentioned that, on the basis of HORDE (http://bioportal.weizmann.ac.il/HORDE/), the fraction of OR pseudogenes in humans after excluding 7E pseudogenes was ~51%. This value is considerably higher than our result (table 1). However, according to the latest version of HORDE (Build 42), which contains 391 functional genes and 464 pseudogenes including 85 7E pseudogenes in humans, the fraction of OR pseudogenes without 7E pseudogenes is calculated to be ~49%. This value is closer to, but still higher than, that in table 1. The smaller number of OR pseudogenes in our data than in HORDE might be due to the slightly conservative criteria for our OR gene identification (Niimura and Nei 2007), suggesting that the fraction of pseudogenes can vary depending on the gene identification method. The finding of similar fractions of pseudogenes between humans and chimpanzees was obtained by applying the same criteria to the genome sequences of the 2 species. Therefore, the result in this paper is expected to be more reliable than that in Gilad et al. (2005).
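The HORDE Build 42 figure quoted above can be reproduced from the three counts given in the text. This is a minimal Python sketch; the variable names are illustrative.

```python
# Counts for humans in HORDE Build 42, as quoted in the text.
functional = 391
pseudogenes = 464        # includes the 7E pseudogenes
pseudogenes_7e = 85

# Exclude 7E pseudogenes from both the numerator and the denominator.
non7e_pseudogenes = pseudogenes - pseudogenes_7e
non7e_total = functional + non7e_pseudogenes

fraction = non7e_pseudogenes / non7e_total
print(f"{non7e_pseudogenes}/{non7e_total} = {fraction:.1%}")
# prints "379/770 = 49.2%"
```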
In another paper, Gilad et al. (2004) also claimed that the fraction of OR pseudogenes is significantly higher in humans than in chimpanzees, based on a survey of 100 randomly chosen OR genes. There are a number of mistakes in that paper (see Gilad et al. 2007). After the correction of these mistakes, there are 38 pseudogenes in chimpanzees in the Gilad2004 data (supplementary table 2, Supplementary Material online). We should note, however, that the fraction obtained from these data (38%) does not represent the fraction of OR pseudogenes in the chimpanzee genome. The reason is that they used degenerate primers that covered only ~670 bp out of the ~1-kb coding region of an OR gene. Therefore, pseudogenes that carry disruptive nonsense or frameshift mutations only in the region not amplified by these primers were regarded as intact in the Gilad2004 data. In fact, we found 6 such cases (PC1_17_i, PC2_29_i, PC2_55_i, PC2_56_i, PC2_57_i, and PC2_69_i; see supplementary table 2, Supplementary Material online). If we use this number, the fraction of pseudogenes without 7E genes is estimated to be 44%, which is fairly close to the result in table 1.
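The effect of partial amplification on pseudogene calls can be illustrated with a toy example. The Python sketch below is not the procedure used in either study: the sequence, the window coordinates, and the function names are hypothetical, and only in-frame nonsense mutations are considered (frameshifts are ignored for brevity).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def premature_stops(cds):
    """Return 0-based start positions of in-frame stop codons before the final codon."""
    last_codon_start = len(cds) - len(cds) % 3 - 3
    return [start for start in range(0, last_codon_start, 3)
            if cds[start:start + 3] in STOP_CODONS]

def looks_intact(cds, window=None):
    """True if no premature stop is seen; `window` = (lo, hi) limits the surveyed region."""
    stops = premature_stops(cds)
    if window is not None:
        lo, hi = window
        # Only stops lying entirely inside the amplified window can be detected.
        stops = [pos for pos in stops if lo <= pos and pos + 3 <= hi]
    return not stops

# Hypothetical coding sequence with a nonsense mutation (TGA) at position 12.
toy_cds = "ATG" + "GCT" * 3 + "TGA" + "GCT" * 5 + "TAA"

print(premature_stops(toy_cds))               # [12]
print(looks_intact(toy_cds))                  # False: full-length survey finds the stop
print(looks_intact(toy_cds, window=(15, 33))) # True: the stop lies outside the window
```

A gene surveyed only within the window is scored as intact even though the full-length sequence carries a disruptive mutation, which is the situation described for the 6 genes listed above.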
Evolution of Human and Chimpanzee OR Genes
This study revealed that, although the numbers of functional genes and pseudogenes are similar between humans and chimpanzees, approximately 25% of their functional gene repertoires are nonorthologous to each other (fig. 1A). The difference in the functional OR gene repertoires between the 2 species might be responsible for the difference in the ability of odorant detection and may affect their behaviors. Niimura and Nei (2007) showed that, although the numbers of functional OR genes are similar (~1,000) among several mammalian species (mice, rats, dogs, cows, and opossums), their OR gene repertoires were generated after the occurrence of hundreds of gene gains and losses from their common ancestor. Therefore, it appears that OR gene repertoires in mammals are generally highly diverse among different species.
Olfactory information is first transmitted to the olfactory bulb, which is involved in the integration of signals from different ORs (Mori et al. 1999). It is therefore reasonable to assume that the size of the olfactory bulb is a good indicator of olfactory ability. Comparative anatomical studies showed that the olfactory bulb in chimpanzees is more than twice as large as that in humans (114 mm³ for humans vs. 257 mm³ for chimpanzees) (Stephan et al. 1981). The difference becomes even more prominent when the size of the olfactory bulb is normalized by body weight and brain volume (Schoenemann 1997). Moreover, it was reported that wild chimpanzees use several olfactory cues in both social and sexual contexts (Goodall 1986; Nishida 1997; Boesch and Boesch-Achermann 2000). These observations may suggest that the olfactory ability is better in chimpanzees than in humans. However, to our knowledge, there have been no quantitative studies comparing the olfactory abilities of humans and chimpanzees. It is also possible that the number of functional genes is not necessarily well correlated with olfactory ability in each species. For example, dogs are generally regarded as having a keen sense of olfaction, but they do not have a particularly large number of functional OR genes (~800) (Niimura and Nei 2007).
We should note, however, that the exact numbers of functional genes and pseudogenes in humans and chimpanzees cannot be determined in practice. Recently, Nozawa et al. (2007) investigated the copy number variation of OR genes among 270 humans. They found that the difference in the number of functional OR genes between 2 randomly chosen individuals is ~11 on average. Moreover, the extent of copy number variation is almost the same between functional genes and pseudogenes, suggesting that a small change in the number of OR genes is nearly neutral. It is also known that there are many segregating OR pseudogenes that display both functional and nonfunctional alleles in humans (Menashe et al. 2003, 2007).
The results in figure 1B and C clearly showed that the OR gene repertoires in the 3 primate species are in a phase of deterioration. These species may have lost OR genes because they possess trichromatic vision, as previously suggested (Gilad et al. 2004). Although a general evolutionary trend of OR genes in primates is the reduction of functional constraints, it is possible that some genes are under positive selection (Clark et al. 2003; Bustamante et al. 2005; Nielsen et al. 2005; Arbiza et al. 2006; Voight et al. 2006). We therefore examined whether or not OR genes were subject to positive selection by using PAML (Yang and Nielsen 2000; Yang et al. 2000). PAML is often used to detect lineage-specific or site-specific positive selection (Clark et al. 2003; Gilad et al. 2005; Arbiza et al. 2006; Yu et al. 2006), but this method is known to produce false positives under certain conditions (Suzuki and Nei 2002, 2004). However, our goal is not to count the number of genes that are subject to positive selection but to compare evolutionary forces between the human and chimpanzee lineages. Biases introduced by PAML should affect human and chimpanzee genes equally. As a result, we found no significant difference in the number of genes that were under accelerated evolution (fig. 2C and D). Because it is known that a closely related species used as an outgroup gives a more accurate estimation of lineage-specific changes (Yu et al. 2006; Bakewell et al. 2007), our use of macaques rather than mice as the outgroup is expected to improve the accuracy of the results.
By the branch-site test with an FDR level of 0.05, we found only one case (HsOR7.6.15) in which positive selection was suggested in the human lineage. This result is consistent with Gimelbrant et al. (2004); that study suggested that the effect of positive selection on the evolution of OR genes in the human lineage is not large. In the HsOR7.6.15 gene, 13 nonsynonymous substitutions and only 1 synonymous substitution were inferred to have occurred along the human lineage. On the other hand, in the chimpanzee lineage leading to the chimpanzee ortholog (PatrOR7.6.18), it was inferred that 4 nonsynonymous and 5 synonymous changes occurred. Therefore, HsOR7.6.15 is a candidate gene that has experienced human-specific positive selection. Among the 13 nonsynonymous substitutions that have taken place in the human lineage, 1 substitution (from methionine to threonine at position 259) occurred at a putative ligand-binding site (Katada et al. 2005), which might have altered ligand-binding affinity in a way that was advantageous to humans. However, further experimental studies should be conducted in order to elucidate how the difference in OR gene repertoires between the 2 species is responsible for the difference in physiological and behavioral traits.
| Supplementary Material |
|---|
Supplementary data sets 1 and 2, figures 1–3, and tables 1–4 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
| Acknowledgements |
|---|
We thank Pierre Fontanillas, Erik Dopman, Rob Kulathinal, and Trevor Bedford for their critical reading of the manuscript. We thank Doron Lancet and Idan Menashe for providing us with the Perl script of the CORP algorithm and Masafumi Nozawa for the Perl script to calculate
values. We also thank Masatoshi Nei and Daniel Hartl for stimulating discussions and comments on the manuscript. This study was supported by Japan Society for the Promotion of Science grants 17-02667 and 19-589 (Y.G.) and by Ministry of Education, Culture, Sports, Science and Technology, Japan, grants 17710162 and 20770192 (Y.N.).
| Footnotes |
|---|
Adriana Briscoe, Associate Editor
| References |
|---|
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped Blast and PSI-Blast: a new generation of protein database search programs. Nucleic Acids Res (1997) 25:3389–3402.
Arbiza L, Dopazo J, Dopazo H. Positive selection, relaxation, and acceleration in the evolution of the human and chimpanzee genome. PLoS Comput Biol (2006) 2:e38.
Bakewell MA, Shi P, Zhang J. More genes underwent positive selection in chimpanzee evolution than in human evolution. Proc Natl Acad Sci USA (2007) 104:7489–7494.
Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B (1995) 57:289–300.
Boesch C, Boesch-Achermann H. The chimpanzees of the Taï forest: behavioural ecology and evolution (2000) Oxford: Oxford University Press.
Buck L, Axel R. A novel multigene family may encode odorant receptors: a molecular basis for odor recognition. Cell (1991) 65:175–187.
Bustamante CD, Fledel-Alon A, Williamson S, et al, (14 co-authors). Natural selection on protein-coding genes in the human genome. Nature (2005) 437:1153–1157.
Clark AG, Glanowski S, Nielsen R, et al, (17 co-authors). Inferring nonneutral evolution from human-chimpanzee-mouse orthologous gene trios. Science (2003) 302:1960–1963.
Gilad Y, Man O, Glusman G. A comparison of the human and chimpanzee olfactory receptor gene repertoires. Genome Res (2005) 15:224–230.
Gilad Y, Man O, Pääbo S, Lancet D. Human specific loss of olfactory receptor genes. Proc Natl Acad Sci USA (2003) 100:3324–3327.
Gilad Y, Wiebe V, Przeworski M, Lancet D, Pääbo S. Loss of olfactory receptor genes coincides with the acquisition of full trichromatic vision in primates. PLoS Biol (2004) 2:e5.
Gilad Y, Wiebe V, Przeworski M, Lancet D, Pääbo S. Correction: loss of olfactory receptor genes coincides with the acquisition of full trichromatic vision in primates. PLoS Biol (2007) 5:e148.
Gimelbrant AA, Skaletsky H, Chess A. Selective pressures on the olfactory receptor repertoire since the human-chimpanzee divergence. Proc Natl Acad Sci USA (2004) 101:9019–9022.
Glusman G, Bahar A, Sharon D, Pilpel Y, White J, Lancet D. The olfactory receptor gene superfamily: data mining, classification, and nomenclature. Mamm Genome (2000) 11:1016–1023.
Glusman G, Yanai I, Rubin I, Lancet D. The complete human olfactory subgenome. Genome Res (2001) 11:685–702.
Godfrey PA, Malnic B, Buck LB. The mouse olfactory receptor gene family. Proc Natl Acad Sci USA (2004) 101:2156–2161.
Goodall J. The chimpanzees of Gombe (1986) Cambridge (MA): Harvard University Press.
Katada S, Hirokawa T, Oka Y, Suwa M, Touhara K. Structural basis for a broad but selective ligand spectrum of a mouse olfactory receptor: mapping the odorant-binding site. J Neurosci (2005) 25:1806–1815.
Kumar S, Tamura K, Nei M. MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief Bioinform (2004) 5:150–163.
Menashe I, Abaffy T, Hasin Y, Goshen S, Yahalom V, Luetje CW, Lancet D. Genetic elucidation of human hyperosmia to isovaleric acid. PLoS Biol (2007) 5:e284.
Menashe I, Aloni R, Lancet D. A probabilistic classifier for olfactory receptor pseudogenes. BMC Bioinformatics (2006) 7:393.
Menashe I, Man O, Lancet D, Gilad Y. Different noses for different people. Nat Genet (2003) 34:143–144.
Mombaerts P. Genes and ligands for odorant, vomeronasal and taste receptors. Nat Rev Neurosci (2004) 5:263–278.
Mori K, Nagao H, Yoshihara Y. The olfactory bulb: coding and processing of odor molecule information. Science (1999) 286:711–715.
Newman T, Trask BJ. Complex evolution of 7E olfactory receptor genes in segmental duplications. Genome Res (2003) 13:781–793.
Nielsen R, Bustamante C, Clark AG, et al, (13 co-authors). A scan for positively selected genes in the genomes of humans and chimpanzees. PLoS Biol (2005) 3:e170.
Niimura Y, Nei M. Evolution of olfactory receptor genes in the human genome. Proc Natl Acad Sci USA (2003) 100:12235–12240.
Niimura Y, Nei M. Comparative evolutionary analysis of olfactory receptor gene clusters between humans and mice. Gene (2005a) 346:13–21.
Niimura Y, Nei M. Evolutionary dynamics of olfactory receptor genes in fishes and tetrapods. Proc Natl Acad Sci USA (2005b) 102:6039–6044.
Niimura Y, Nei M. Evolutionary dynamics of olfactory and other chemosensory receptor genes in vertebrates. J Hum Genet (2006) 51:505–517.
Niimura Y, Nei M. Extensive gains and losses of olfactory receptor genes in mammalian evolution. PLoS ONE (2007) 2:e708.
Nishida T. Sexual behavior of adult male chimpanzees of the Mahale Mountains National Park, Tanzania. Primates (1997) 38:379–398.
Nozawa M, Kawahara Y, Nei M. Genomic drift and copy number variation of sensory receptor genes in humans. Proc Natl Acad Sci USA (2007) 104:20421–20426.
Olender T, Fuchs T, Linhart C, Shamir R, Adams M, Kalush F, Khen M, Lancet D. The canine olfactory subgenome. Genomics (2004) 83:361–372.
Quignon P, Giraud M, Rimbault M, et al, (11 co-authors). The dog and rat olfactory receptor repertoires. Genome Biol (2005) 6:R83.
Quignon P, Kirkness E, Cadieu E, Touleimat N, Guyon R, Renier C, Hitte C, Andre C, Fraser C, Galibert F. Comparison of the canine and human olfactory receptor gene repertoires. Genome Biol (2003) 4:R80.
Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol (1987) 4:406–425.
Schoenemann PT. An MRI study of the relationship between human neuroanatomy and behavioral ability [dissertation] (1997) [Berkeley (CA)]: University of California.
Stephan H, Frahm H, Baron G. New and revised data on volumes of brain structures in insectivores and primates. Folia Primatol (1981) 35:1–29.
Suzuki Y, Nei M. Simulation study of the reliability and robustness of the statistical methods for detecting positive selection at single amino acid sites. Mol Biol Evol (2002) 19:1865–1869.
Suzuki Y, Nei M. False-positive selection identified by ML-based methods: examples from the Sig1 gene of the diatom Thalassiosira weissflogii and the tax gene of a human T-cell lymphotropic virus. Mol Biol Evol (2004) 21:914–921.
Sved J, Bird A. The expected equilibrium of the CpG dinucleotide in vertebrate genomes under a mutation model. Proc Natl Acad Sci USA (1990) 87:4692–4696.
The Chimpanzee Sequencing and Analysis Consortium. Initial sequence of the chimpanzee genome and comparison with the human genome. Nature (2005) 437:69–87.
The Rhesus Macaque Genome Sequencing and Analysis Consortium. Evolutionary and biomedical insights from the rhesus macaque genome. Science (2007) 316:222–234.
Thompson JD, Higgins DG, Gibson TJ. ClustalW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res (1994) 22:4673–4680.
Voight BF, Kudaravalli S, Wen X, Pritchard JK. A map of recent positive selection in the human genome. PLoS Biol (2006) 4:e72.
Wang X, Grus W, Zhang J. Gene losses during human origins. PLoS Biol (2006) 4:e52.
Yang Z, Nielsen R. Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Mol Biol Evol (2000) 17:32–43.
Yang Z, Nielsen R, Goldman N, Pedersen AM. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics (2000) 155:431–449.
Young JM, Friedman C, Williams EM, Ross JA, Tonnes-Priddy L, Trask BJ. Different evolutionary processes shaped the mouse and human olfactory receptor gene families. Hum Mol Genet (2002) 11:535–546.
Yu XJ, Zheng HK, Wang J, Wang W, Su B. Detecting lineage-specific adaptive evolution of brain-expressed genes in human using rhesus macaque as outgroup. Genomics (2006) 88:745–751.
Zhang J, Nielsen R, Yang Z. Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Mol Biol Evol (2005) 22:2472–2479.
Zhang J, Rosenberg HF, Nei M. Positive Darwinian selection after gene duplication in primate ribonuclease genes. Proc Natl Acad Sci USA (1998) 95:3708–3713.
Zhang X, Firestein S. The olfactory receptor gene superfamily of the mouse. Nat Neurosci (2002) 5:124–133.
Zhang X, Rodriguez I, Mombaerts P, Firestein S. Odorant and vomeronasal receptor genes in two mouse genome assemblies. Genomics (2004) 83:802–811.
Zozulya S, Echeverri F, Nguyen T. The human olfactory receptor repertoire. Genome Biol (2001) 2:research0018.1–research0018.12.