Skip Navigation

This Article
Right arrow FREE Full Text (PDF) Freely available
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in ISI Web of Science
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Add to My Personal Archive
Right arrow Download to citation manager
Right arrow Search for citing articles in:
ISI Web of Science (14)
Right arrowRequest Permissions
Google Scholar
Right arrow Articles by Sankoff, D.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Sankoff, D.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us  
What's this?

Molecular Biology and Evolution, Vol 7, 255-269, Copyright © 1990 by Society for Molecular Biology and Evolution


ORIGINAL ARTICLE

Designer invariants for large phylogenies

D Sankoff
Centre de Recherches Mathematiques, Univeriste de Montreal, Quebec, Canada.

The Cavender-Felsenstein edge-length invariants for binary characters on 4-trees provide the starting point for the development of "customized" invariants for evaluating and comparing phylogenetic hypotheses. The binary character invariants may be generalized to k- valued characters without losing the quadratic nature of the invariants as functions of the theoretical frequencies f(UVXY) of observable character configurations (U at organism 1, V at 2, etc.). The key to the approach is that certain sets of these configurations constitute events which are probabilistically independent from other such sets, under the symmetric Markov change models studied. By introducing more complex sets of configurations, we find the quadratic invariants for 5- trees in the binary model and for individual edges in 6-trees or, indeed, in any size tree. The same technique allows us to formulate invariants for entire trees, but these are cubic functions for 6-trees and are higher-degree polynomials for larger trees. With k-valued characters and, especially, with large trees, the types of configuration sets (events) used in the simpler examples are too rare (i.e., their predicted frequencies are too low) to be useful, and the construction of meaningful pairs of independent events becomes an important and nontrivial task in designing invariants suited to testing specific hypotheses. In a very natural way, this approach fits in with well-known statistical methodology for contingency tables. We explore use of events such as "only transitions occur for character i (i.e., position i in a nucleic acid sequence) in subtree a" in analyzing a set of data on ribosomal RNA in the context of the controversy over the origins of archaebacteria, eubacteria, and eukaryotes.
Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us    What's this?




Disclaimer:
Please note that abstracts for content published before 1996 were created through digital scanning and may therefore not exactly replicate the text of the original print issues. All efforts have been made to ensure accuracy, but the Publisher will not be held responsible for any remaining inaccuracies. If you require any further clarification, please contact our Customer Services Department.