MBE Advance Access published online on July 26, 2007
Molecular Biology and Evolution, doi:10.1093/molbev/msm149
Research Article
Context-Dependent Mutation Rates May Cause Spurious Signatures of a Fixation Bias Favoring Higher GC-Content in Humans

* Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14850, USA
Clinical Science, Cornell University, Ithaca, New York 14850, USA
Corresponding Author: Carlos D. Bustamante, Department of Biological Statistics and Computational Biology, 101 Biotechnology Building, Cornell University, Ithaca, NY 14850, phone (607)255-1640, fax (607)255-4698, email cdb28{at}cornell.edu
Received for publication March 15, 2007. Revision received June 30, 2007. Accepted for publication July 5, 2007.
Understanding the proximate and ultimate causes underlying the evolution of nucleotide composition in mammalian genomes is of fundamental interest to the study of molecular evolution. Comparative genomics studies have revealed that many more substitutions occur from G and C nucleotides to A and T nucleotides than the reverse, suggesting that mammalian genomes are not at equilibrium for base composition. Analysis of human polymorphism data suggests that mutations that increase GC-content tend to be at much higher frequencies than those that decrease or preserve GC-content when the ancestral allele is inferred via parsimony using the chimpanzee genome. These observations have been interpreted as evidence for a fixation bias in favor of G and C alleles due either to positive natural selection or to biased gene conversion. Here, we test the robustness of this interpretation to violation of the parsimony assumption using a data set of 21,488 non-coding SNPs discovered by the NIEHS SNPs project via direct resequencing of n = 95 individuals. Applying standard non-parametric and parametric population genetic approaches, we replicate the signatures of a fixation bias in favor of G and C alleles when the ancestral base is assumed to be the base found in the chimpanzee outgroup. However, upon taking into account the probability of misidentifying the ancestral state of each SNP using a context-dependent mutation model, the corrected distribution of SNP frequencies for GC-content-increasing SNPs is nearly indistinguishable from the patterns observed for other types of mutations, suggesting that the signature of fixation bias is a spurious artifact of the parsimony assumption.
Key Words: ancestral misidentification, biased gene conversion, context-dependence, GC-content, human, natural selection, single nucleotide polymorphism, site-frequency spectrum
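The effect of ancestral misidentification on the site-frequency spectrum can be illustrated with a minimal sketch. It is not the method used in the article: it assumes a single constant misidentification probability per mutation class (`e_ws` for AT-to-GC SNPs, `e_sw` for GC-to-AT SNPs), whereas the abstract describes a per-SNP probability derived from a context-dependent mutation model. The function names and the toy counts are hypothetical. The idea is that a misidentified GC-to-AT SNP with derived count n - i appears as an AT-to-GC SNP with derived count i, so the two observed spectra are a linear mixture of the true spectra that can be inverted.

```python
def apply_misidentification(true_ws, true_sw, e_ws, e_sw):
    """Forward model (an assumption of this sketch): spectra as they would
    appear under parsimony polarization. Index i-1 holds the number of SNPs
    with derived allele count i, for i = 1..n-1."""
    n = len(true_ws) + 1
    obs_ws, obs_sw = [], []
    for i in range(1, n):
        k = n - i  # derived count of the opposite-class SNP seen after a flip
        obs_ws.append((1 - e_ws) * true_ws[i - 1] + e_sw * true_sw[k - 1])
        obs_sw.append((1 - e_sw) * true_sw[i - 1] + e_ws * true_ws[k - 1])
    return obs_ws, obs_sw


def correct_spectra(obs_ws, obs_sw, e_ws, e_sw):
    """Invert the 2x2 mixture for each pair of counts (i, n - i)."""
    n = len(obs_ws) + 1
    det = 1 - e_ws - e_sw  # determinant of the mixing matrix
    if abs(det) < 1e-12:
        raise ValueError("error rates make the mixture non-invertible")
    true_ws, true_sw = [], []
    for i in range(1, n):
        k = n - i
        true_ws.append(((1 - e_sw) * obs_ws[i - 1] - e_sw * obs_sw[k - 1]) / det)
        true_sw.append(((1 - e_ws) * obs_sw[i - 1] - e_ws * obs_ws[k - 1]) / det)
    return true_ws, true_sw


# Toy counts for n = 5 chromosomes (illustrative only, not data from the study).
toy_ws = [10.0, 5.0, 3.0, 2.0]   # AT-to-GC SNPs
toy_sw = [30.0, 15.0, 9.0, 6.0]  # GC-to-AT SNPs, more numerous
seen_ws, seen_sw = apply_misidentification(toy_ws, toy_sw, e_ws=0.02, e_sw=0.05)
fixed_ws, fixed_sw = correct_spectra(seen_ws, seen_sw, e_ws=0.02, e_sw=0.05)
print("observed AT-to-GC spectrum: ", seen_ws)
print("corrected AT-to-GC spectrum:", fixed_ws)
```

In this toy setting the observed AT-to-GC spectrum gains high-frequency SNPs from misidentified low-frequency GC-to-AT SNPs, which mimics a fixation bias even though both true spectra have the same shape.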