MBE Advance Access published online on December 8, 2006
Molecular Biology and Evolution, doi:10.1093/molbev/msl193
Research Article
Phylogenetic Methodology For Detecting Protein Interactions
1 Department of Statistics and Department of Biological Sciences, University of South Carolina, Columbia, SC 29208, USA, waddell{at}stat.sc.edu
2 Laboratory of Biometry and Bioinformatics, Graduate School of Agriculture and Life Sciences, University of Tokyo, 1-1-1 Yayoi Bunkyo-ku, Tokyo 113-8657, Japan, kishino{at}wheat.ab.a.u-tokyo.ac.jp
3 Allan Wilson Centre, Institute of Molecular BioSciences, Massey University, Palmerston North, New Zealand, otarissa{at}hotmail.com
# Current address 2, Communication preferred by email, please advise by email before using other forms of communication
* Corresponding author. Email: waddell{at}med.sc.edu
Detecting protein-protein interactions and assigning proteins to functional complexes are key challenges of modern biology. The rise of genomics has led to evidence that correlated patterns of presence/absence and/or fusing of proteins in any organism suggest these proteins interact. Unfortunately, methods based on such data work best with divergent genomes, whereas major sequencing efforts in vertebrates, for example, are yielding alignments of the same set of proteins sampled from the same set of taxa (species). Using vertebrate mitochondrial genomes to illustrate a novel method, we associate proteins based on vectors of their evolutionary tree edge (branch or internode) lengths. This approach is based on the expectation that molecular coevolution is greatest between proteins that interact in some way. mtDNA-encoded proteins are associated into groups largely consistent with the complexes they come from. This association is apparently not due to the tree structure or mutation processes, leaving coevolution as the best explanation.
We show that it is important that the tree used to derive the edge length vector is estimated accurately in terms of both topology and edge lengths. While more complex substitution models reduce systematic error, they also inflate stochastic error. This makes the use of less complex substitution models preferable in some circumstances. We describe a method to estimate correlations of pairwise evolutionary distances that adjusts for nonindependent correlations due to shared evolutionary history. Associations of proteins based on their edge length vectors are visualized and assessed using a variety of hierarchical clustering and multidimensional scaling methods. New formulae for estimating the fit of data-to-model, including the average percent standard deviation of distances on least squares trees, are presented. Use of edge length vectors is compared and contrasted with correlated distance methods, correlated rates methods, and site-specific evidence of coevolution.
Key Words: detecting correlated evolution; mitochondrial genome; metabolic complexes; molecular co-evolution; protein-protein interaction
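The core idea of associating proteins by their edge length vectors can be illustrated with a minimal sketch. It assumes each protein has one tree estimated on the same topology, so that every vector lists lengths for the same edges in the same order. The numeric values, the helper names `pearson` and `average_linkage`, and the choice of plain Pearson correlation with average-linkage clustering are all assumptions made for illustration; the sketch does not include the adjustment for shared evolutionary history described in the abstract.

```python
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_linkage(names, vectors):
    """Agglomerate proteins using dissimilarity 1 - r (average linkage).

    Returns a list of (cluster_a, cluster_b, dissimilarity) merges in order.
    """
    vec = dict(zip(names, vectors))
    clusters = [(name,) for name in names]

    def dist(a, b):
        pairs = [(p, q) for p in a for q in b]
        return sum(1.0 - pearson(vec[p], vec[q]) for p, q in pairs) / len(pairs)

    merges = []
    while len(clusters) > 1:
        # Find the closest pair of current clusters.
        d, i, j = min(
            (dist(a, b), i, j)
            for i, a in enumerate(clusters)
            for j, b in enumerate(clusters)
            if i < j
        )
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Hypothetical edge length vectors (five edges of a shared topology).
# The protein labels are real mtDNA-encoded proteins; the numbers are invented.
names = ["COX1", "COX2", "ND1", "ND2"]
vectors = [
    [0.10, 0.20, 0.05, 0.30, 0.15],
    [0.12, 0.22, 0.06, 0.28, 0.16],
    [0.30, 0.05, 0.20, 0.10, 0.25],
    [0.28, 0.07, 0.22, 0.09, 0.27],
]

merges = average_linkage(names, vectors)
for a, b, d in merges:
    print(a, "+", b, "at dissimilarity", round(d, 4))
```

With these invented values, proteins whose edge lengths rise and fall together across the tree are joined first, which is the pattern the abstract attributes to coevolution within a complex. On real data the raw correlations would also reflect shared tree structure and estimation error, which is why the authors stress accurate trees and an adjusted correlation estimate.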

