MBE Advance Access originally published online on April 14, 2004
Mol. Biol. Evol. 21(7):1401-1408. 2004
DOI: 10.1093/molbev/msh138
© 2004 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038
Research Article
Genome Phylogenetic Analysis Based on Extended Gene Contents

* Department of Genetics, Development, and Cell Biology
Center for Bioinformatics and Biological Statistics, Iowa State University
Department of Mathematics and Statistics, University of West Florida
E-mail: xgu{at}iastate.edu
Abstract
With the rapid growth of entire genome data, whole-genome approaches such as gene content have become popular for genome phylogeny inference, including the tree of life. However, the underlying model for genome evolution is unclear, and the proposed (ad hoc) genome distance measure may violate additivity. In this article, we formulate a stochastic framework for genome evolution, which provides a basis for defining an additive genome distance. However, we show that it is difficult to utilize the typical gene content data (i.e., the presence or absence of gene families across genomes) to estimate the genome distance. We solve this problem by introducing the concept of extended gene content; that is, the status of a gene family in a given genome could be absence, presence as single copy, or presence as duplicates, any of which can be used for genome distance estimation and phylogenetic inference. Computer simulation shows that the new tree-making method is efficient, consistent, and fairly robust. The example of 35 microbial complete genomes demonstrates that it is useful not only to study the universal tree of life but also to explore the evolutionary pattern of genomes.
Key Words: gene content; additive genome distance; phylogenetic inference; comparative genomics
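The three-state coding named in the abstract (absence, presence as single copy, presence as duplicates) can be illustrated with a minimal Python sketch. It only shows how copy numbers might be mapped to extended gene content states and how joint state patterns between two genomes could be tallied. It does not implement the additive genome distance estimator of the article, whose formulas are not given in the abstract. The function names, state labels, and copy numbers below are hypothetical and chosen for illustration.

```python
from collections import Counter

# Hypothetical labels for the three extended gene content states.
ABSENT, SINGLE, DUPLICATE = 0, 1, 2


def extended_state(copy_number):
    """Map a gene-family copy number to an extended gene content state."""
    if copy_number < 0:
        raise ValueError("copy number must be non-negative")
    if copy_number == 0:
        return ABSENT
    if copy_number == 1:
        return SINGLE
    return DUPLICATE  # two or more copies


def encode_genome(copy_numbers):
    """Encode a list of per-family copy numbers as extended states."""
    return [extended_state(n) for n in copy_numbers]


def pattern_counts(genome_a, genome_b):
    """Tally joint (state in A, state in B) patterns across gene families."""
    if len(genome_a) != len(genome_b):
        raise ValueError("genomes must cover the same gene families")
    return Counter(zip(encode_genome(genome_a), encode_genome(genome_b)))


# Made-up copy numbers for five gene families in two genomes.
genome_a = [0, 1, 3, 1, 2]
genome_b = [1, 1, 0, 2, 2]
counts = pattern_counts(genome_a, genome_b)
```

Under these assumptions, `counts` holds the number of gene families showing each pair of states, which is the kind of summary a distance estimator based on extended gene content would start from. Ordinary gene content data would collapse `SINGLE` and `DUPLICATE` into a single "present" state.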