MBE Advance Access published online on June 26, 2008
Molecular Biology and Evolution, doi:10.1093/molbev/msn145
Research Article
Likelihood Based Clustering (LiBaC) for Codon Models, a method for grouping sites according to similarities in the underlying process of evolution
1 Department of Mathematics and Statistics, Dalhousie University, Halifax Nova Scotia, B3H 4J1, Canada
2 Department of Biology, Dalhousie University, Halifax Nova Scotia, B3H 4J1, Canada
Received for publication March 11, 2008. Revision received May 26, 2008. Accepted for publication May 30, 2008.
Models of codon evolution are useful for investigating the strength and direction of natural selection via a parameter for the nonsynonymous/synonymous rate ratio (ω = dN/dS). Different codon models are available to account for the diversity of evolutionary patterns among sites. Codon models that specify data partitions as fixed effects allow the most evolutionary diversity among sites, but require that site partitions be identifiable a priori. Models that use a parametric distribution to express the variability in the ω ratio across sites do not require a priori partitioning of sites, but they permit less among-site diversity in the evolutionary process. Simulation studies presented in this paper indicate that differences among sites in estimates of ω under an overly simplistic analytical model can reflect more than just natural selection pressure. We also find that the classic likelihood ratio tests (LRTs) for positive selection have a high false positive rate in some situations. In this paper we develop a new method for assigning codon sites to groups, where each group has a different model and the likelihood over all sites is maximized. The method, called Likelihood Based Clustering (LiBaC), can be viewed as a generalization of the family of Model Based Clustering (MBC) approaches to models of codon evolution. We report the performance of several LiBaC-based methods, and selected alternative methods, over a wide variety of scenarios. We find that LiBaC, under an appropriate model, can provide reliable parameter estimates when the process of evolution is very heterogeneous among groups of sites. Certain types of proteins, such as transmembrane proteins, are expected to exhibit such heterogeneity. A survey of genes encoding transmembrane proteins suggests that overly simplistic models could be producing a false signal of positive selection among such genes. In these cases, LiBaC-based methods offer an important addition to a "tool box" of methods, thereby helping to uncover robust evidence for the action of positive selection.
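The grouping idea described above can be illustrated with a generic model-based clustering step. The sketch below is not the LiBaC implementation, whose details are not given in this abstract. It assumes that per-site log-likelihoods under each candidate group model have already been computed elsewhere (for example, by a codon-model program), and the names `cluster_sites`, `site_loglik`, and `weights` are hypothetical. It computes posterior group probabilities for each site, assigns each site to its most probable group, and returns the mixture log-likelihood summed over sites.

```python
import math


def cluster_sites(site_loglik, weights):
    """Generic model-based clustering step (illustrative only, not LiBaC itself).

    site_loglik[i][k]: assumed precomputed log-likelihood of site i under group model k.
    weights[k]: assumed mixing proportion of group k (all positive, summing to 1).
    Returns (assignments, posteriors, total_loglik).
    """
    assignments = []
    posteriors = []
    total_loglik = 0.0
    for row in site_loglik:
        # log(weight_k * L_ik) for every group k
        terms = [math.log(w) + ll for w, ll in zip(weights, row)]
        # log-sum-exp for numerical stability
        top = max(terms)
        log_mix = top + math.log(sum(math.exp(t - top) for t in terms))
        # posterior probability that this site belongs to each group
        post = [math.exp(t - log_mix) for t in terms]
        posteriors.append(post)
        # assign the site to the group with the highest posterior
        assignments.append(max(range(len(post)), key=post.__getitem__))
        total_loglik += log_mix
    return assignments, posteriors, total_loglik


# Toy input: three sites, two group models (made-up numbers for illustration)
toy_loglik = [[-2.0, -5.0], [-6.0, -1.0], [-3.0, -3.0]]
groups, post, loglik = cluster_sites(toy_loglik, [0.5, 0.5])
```

In a full analysis, a step like this would typically alternate with re-estimating each group's model parameters and mixing proportions until the overall likelihood stops improving. That outer loop is omitted here because it depends on the codon model used.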
Key Words: Codon model; Likelihood Based Clustering; Bayes Error Rate; nonsynonymous/synonymous rate ratio; Positive Darwinian selection
Current address: Department of Biology, Dalhousie University, Halifax, Nova Scotia B3H 4J1, Canada. Phone: (902) 494-7844, Fax: (902) 494-3736, e-mail: j.bielawski{at}dal.ca