MBE Advance Access originally published online on September 6, 2006
Molecular Biology and Evolution 2006 23(12):2355-2360; doi:10.1093/molbev/msl106
Research Articles
Robustness of Coalescent Estimators to Between-Lineage Mutation Rate Variation
Department of Genome Sciences, University of Washington, Seattle
E-mail: mkkuhner{at}gs.washington.edu.
Data from HIV and from human neoplastic cells can show substantial between-lineage mutation rate variation even within a single population. Such variation may affect estimators of population quantities such as Θ = 4Neµ. Using simulated DNA data, I measured the effect of rate variation on recovery of Θ by the summary-statistic estimator of Watterson (Watterson GA. 1975. On the number of segregating sites in genetical systems without recombination. Theor Popul Biol. 7:256–276) and the coalescent maximum likelihood algorithm LAMARC (Kuhner MK. 2006. LAMARC 2.0: maximum likelihood and Bayesian estimation of population parameters. Bioinformatics. Advance Access doi: 10.1093/bioinformatics/btk051). Watterson's estimator showed a downward bias, as expected, with high values of Θ. LAMARC's mean estimate was accurate for all tested values of Θ and rate variation except for a downward bias when rate variation was maximal (i.e., the slow rate was zero). LAMARC had consistently narrower confidence intervals (CIs) than Watterson's estimator. Both methods tended to reject the truth too often when rate variation was 8x or greater and independent among branches, as well as when variation was 4x or greater and correlated among branches. In the case of Watterson's estimate, this excess rejection was fully attributable to variation among genealogies in the amount of total branch length associated with the fast and slow rates. However, in the case of LAMARC, some excess rejection was still observed even when between-genealogy variation was taken into account. Both estimators are robust to modest rate variation; however, their use should be coupled with a statistical test to rule out extreme rate variation as the resulting CIs may not be reliable.
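For readers unfamiliar with the summary-statistic estimator named above, the following is a minimal Python sketch of Watterson's estimator, which divides the number of segregating sites S by the harmonic sum a_n = 1 + 1/2 + ... + 1/(n-1) for a sample of n sequences. It is an illustration only, not the code used in the study: the function names, the toy alignment, and the choice to treat every character (including gaps or ambiguity codes) as an ordinary state are all assumptions made here. Because a site is counted once however many mutations hit it, the sketch also suggests why a downward bias at high Θ would be expected.

```python
from typing import Sequence


def harmonic_number(n: int) -> float:
    """Return a_n = sum_{i=1}^{n-1} 1/i for a sample of n sequences."""
    return sum(1.0 / i for i in range(1, n))


def count_segregating_sites(seqs: Sequence[str]) -> int:
    """Count alignment columns that contain more than one distinct character."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def watterson_theta(seqs: Sequence[str], per_site: bool = True) -> float:
    """Watterson's estimate of theta = 4*Ne*mu from aligned sequences.

    Assumes equal-length, aligned sequences; hypothetical helper for
    illustration, not part of LAMARC or any published package.
    """
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    theta = count_segregating_sites(seqs) / harmonic_number(n)
    # Per-site scaling matches a per-site mutation rate mu.
    return theta / length if per_site else theta


# Toy alignment (made up for illustration): two columns are polymorphic.
toy_alignment = [
    "ACGTACGT",
    "ACGTACGA",
    "ACCTACGT",
    "ACGTACGT",
]
theta_per_locus = watterson_theta(toy_alignment, per_site=False)
theta_per_site = watterson_theta(toy_alignment, per_site=True)
```

The per-locus value is S / a_n; dividing by alignment length gives the per-site value, which is the scale usually compared with likelihood-based estimates of Θ.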
Key Words: coalescent; mutation rate variation; maximum likelihood