MBE Advance Access originally published online on June 26, 2006
Molecular Biology and Evolution 2006 23(9):1648-1651; doi:10.1093/molbev/msl046
Letter
Evidence that the Root of the Tree of Life Is Not within the Archaea

* Department of Molecular, Cellular, and Developmental Biology, University of California, Los Angeles;
UCLA Astrobiology Institute, University of California, Los Angeles;
Molecular Biology Institute, University of California, Los Angeles; and
Department of Human Genetics, University of California, Los Angeles
E-mail: lake{at}mbi.ucla.edu.
| Abstract |
|---|
The Archaea occupy uncommon and extreme habitats around the world. They manufacture unusual compounds, utilize novel metabolic pathways, and contain many unique genes. Many suspect, due to their novel properties, that the root of the tree of life may be within the Archaea, although there is little direct evidence for this root. Here, using gene insertions and deletions found within protein synthesis factors present in all prokaryotes and eukaryotes, we present statistically significant evidence that the root of life is outside the Archaea.
Key Words: tree of life; root; indels; Archaea; cenancestor
The Archaea are a group of prokaryotic organisms found in unusual and extreme habitats around the world (Wiegel and Adams 1998). Frequently they inhabit environments characterized by an abundance of geochemically generated energy-rich compounds thought to be likely substrates for the origin of life (Russell and Martin 2004; Kelly et al. 2005) and may have been present for 3.4 billion years (Ueno et al. 2006).
Many suspect that the root of the tree of life, or cenancestor (Fitch and Upper 1987), may be within the Archaea. Some (Pace 2006) support the traditional root (Gogarten et al. 1989; Iwabe et al. 1989; Brown and Doolittle 1997), others think that it is elsewhere (Lopez et al. 1999; Cavalier-Smith 2002), and still others suggest that no unique root exists (Kurland et al. 2006). Here, using gene insertions and deletions (indels) found within ubiquitous protein synthesis factors, we present evidence that the root of life is outside the Archaea.
Indel rooting (Rivera and Lake 1992; Baldauf and Palmer 1993; Gupta 1998) is illustrated in figure 1 using homologous, aligned regions from 2 ubiquitous genes, labeled "Ortholog 1" and "Ortholog 2." The sequence alignment at the top center of the figure contains a deletion, marked by shading, whereas the bottom two-thirds of Ortholog 1 and all of Ortholog 2, marked "common," lack the deletion. We refer to Ortholog 1 as the informative group, or ingroup, and Ortholog 2 as the outgroup because the former contains the phylogenetically informative indel and the latter serves as the outgroup. The part of the tree that corresponds to the deletion is shown by shading on the reference trees at the left and right of the figure and includes 2 terminal taxa and their common ancestor. Most parsimoniously, the root of the tree of life cannot be located within any shaded region. To appreciate this, compare Root 1, shown by arrows on the left side of the tree, with Root 2, shown by arrows on the right side of the tree. Any root that does not connect to the shaded portion of the tree requires only a single change (one deletion), whereas any root that connects within the shaded area requires 1 deletion and 1 reinsertion or 2 insertions, depending on the root location. Thus, the analyses of indels in orthologous/paralogous gene pairs can be used to exclude the root from regions of the tree.
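The parsimony argument above can be made concrete with a small sketch. It is not the authors' software or their statistical method; it only counts indel changes with standard Fitch parsimony on a hypothetical four-taxon ingroup tree (leaves A to D, internal nodes X and Y, all names invented here). Taxa A and B are assumed to carry the deletion (state 1), C and D to lack it (state 0), and the outgroup paralog is collapsed to a single leaf that lacks it. The tree is assumed to be strictly binary.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]

# Hypothetical unrooted ingroup tree: leaves A-D, internal nodes X and Y.
EDGES: List[Edge] = [("A", "X"), ("B", "X"), ("X", "Y"), ("C", "Y"), ("D", "Y")]
# 1 = deletion present, 0 = deletion absent (illustrative values only).
LEAF_STATE: Dict[str, int] = {"A": 1, "B": 1, "C": 0, "D": 0}
# All outgroup sequences lack the deletion, so they are collapsed to one leaf.
OUTGROUP_STATE = 0


def adjacency(edges: List[Edge]) -> Dict[str, List[str]]:
    """Build an adjacency list from the undirected edge list."""
    adj: Dict[str, List[str]] = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def merge(a: Set[int], b: Set[int]) -> Tuple[Set[int], int]:
    """Fitch step: intersect if possible, otherwise take the union and count 1 change."""
    common = a & b
    return (common, 0) if common else (a | b, 1)


def fitch(node: str, parent: str, adj: Dict[str, List[str]]) -> Tuple[Set[int], int]:
    """Return (state set, change count) for the subtree at `node` pointing away from `parent`."""
    children = [n for n in adj[node] if n != parent]
    if not children:
        return {LEAF_STATE[node]}, 0
    (s1, k1), (s2, k2) = [fitch(c, node, adj) for c in children]  # binary tree assumed
    merged, extra = merge(s1, s2)
    return merged, k1 + k2 + extra


def changes_for_root(edge: Edge, adj: Dict[str, List[str]]) -> int:
    """Minimum indel changes when the ingroup is rooted on `edge` and joined to the outgroup."""
    u, v = edge
    su, ku = fitch(u, v, adj)
    sv, kv = fitch(v, u, adj)
    ingroup_root, extra = merge(su, sv)
    _, outgroup_extra = merge(ingroup_root, {OUTGROUP_STATE})
    return ku + kv + extra + outgroup_extra


def changes_by_root(edges: List[Edge] = EDGES) -> Dict[Edge, int]:
    """Score every candidate root position (one per edge)."""
    adj = adjacency(edges)
    return {edge: changes_for_root(edge, adj) for edge in edges}


if __name__ == "__main__":
    for edge, cost in changes_by_root().items():
        print(f"root on {edge[0]}-{edge[1]}: {cost} change(s)")
```

In this toy tree, roots placed on the terminal branches of A or B (inside the clade carrying the deletion) cost 2 changes, whereas every other root position costs 1, which mirrors the reasoning used to exclude the root from the shaded region.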
Gene indel analyses have practical advantages. Because indels can persist over long timescales without changing length or position, as evidenced by the fact that numerous amino acid replacements occur within indels, they are potentially less affected by artifacts of phylogenetic reconstruction. As in sequence analyses, the evolution of indels is shaped by both clonal and coalescent (gene transfer) processes (Doolittle 1999
Protein synthesis factors EF-Tu, EF-G, and IF-2 are well suited for indel-rooting studies. They are ubiquitous and are unambiguously produced by ancient gene duplications, and their inserts change slowly (Baldauf et al. 1996; Hashimoto and Hasegawa 1996; Gupta 1998). In eukaryotes and archaebacteria, EF-Tu is named EF-1alpha and EF-G is named EF-2. To avoid confusion, the names EF-Tu and EF-G will be used here. All 3 factors shuttle transfer RNAs on the ribosome during protein synthesis using guanosine triphosphate (GTP) as an energy source, and the indels analyzed below are from the GTP-binding domains.
Two protein synthesis indel sets exclude the root from the combined clade of the Archaea and the Eukaryotes, but other protein synthesis indels have been previously analyzed (Rivera and Lake 1992; Baldauf and Palmer 1993). The root-informative indel summarized in table 1 is contained in the VNK/NPD region corresponding to amino acid positions 141–152 in the Escherichia coli EF-G outgroup sequence. All archaebacterial and eukaryotic ingroup IF-2 sequences contain a single insert, 1 amino acid long, in this region, and this insert is absent from all other IF-2 sequences. The insert is also absent from all outgroup, EF-G, sequences. The lack of length variation within this region and the absence of the indel in all eubacteria argue against these indels being transferred between the archaeal and bacterial domains. This indel excludes the root from the archaeal–eukaryotic clade, as shown in figure 2a.
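The distribution described above (insert present in every archaeal and eukaryotic ingroup sequence, absent everywhere else) is the kind of pattern that can be checked mechanically. The following is a minimal sketch under stated assumptions: the `Seq` record, the `insert_is_clade_specific` helper, and the toy rows are all hypothetical and are not taken from the paper's alignments or Supplementary Material.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class Seq:
    """One aligned sequence, reduced to the fields needed for the check (hypothetical layout)."""
    domain: str       # "Archaea", "Eukaryota", or "Bacteria"
    gene: str         # ingroup gene (here "IF-2") or outgroup gene (here "EF-G")
    has_insert: bool  # True if the insert is present at the site of interest


def insert_is_clade_specific(seqs: Iterable[Seq], ingroup_gene: str,
                             clade_domains: Set[str]) -> bool:
    """True if the insert occurs in every ingroup sequence from `clade_domains`
    and in no other sequence (other ingroup domains or any outgroup sequence)."""
    for s in seqs:
        expected = s.gene == ingroup_gene and s.domain in clade_domains
        if s.has_insert != expected:
            return False
    return True


# Toy records invented for illustration; real input would come from an alignment.
TOY = [
    Seq("Archaea", "IF-2", True),
    Seq("Eukaryota", "IF-2", True),
    Seq("Bacteria", "IF-2", False),
    Seq("Archaea", "EF-G", False),
    Seq("Eukaryota", "EF-G", False),
    Seq("Bacteria", "EF-G", False),
]

if __name__ == "__main__":
    print(insert_is_clade_specific(TOY, "IF-2", {"Archaea", "Eukaryota"}))
```

A single bacterial ingroup sequence carrying the insert, or a single archaeal one lacking it, would make the check fail, which is why the lack of length variation in this region matters for the argument.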
A second root-informative insert occurs within the RGIT/DTPGH region of EF-G. This region, corresponding to amino acids 59–85 in the EF-Tu E. coli outgroup sequence, contains a 4-amino acid insert that excludes the root from the Archaea and the Eukaryotes. It also contains indels, unrelated to the 4-amino acid insert, at a second site within some bacterial sequences. A single 4-amino acid-long insert, corresponding to sequence MTHE and variants, is found in all ingroup archaeal and eukaryotic sequences, as shown on the top left side of table 2. In addition, all eukaryotic EF-G sequences contain variants of the sequence with an additional insert, indicated by a "
." Additional eubacterial insertions are present in the EF-G sequence at a second site, approximately 4 amino acids downstream of the MTHE insert. These are present in some alphaproteobacteria, in beta- and gammaproteobacteria, and in chlorobi and are absent in firmicutes, cyanobacteria, actinobacteria, spirochetes, and other bacterial groups. Eubacterial inserts present in all members of a phylum are shown as solid ellipses in figure 2b or as a dashed ellipse when they have a patchy distribution. The insert present in beta- and gammaproteobacteria may reflect a common ancestry but is conservatively assumed to be due to 2 independent inserts. All outgroup EF-Tu sequences are of constant length. This indel also excludes the root from the Archaea and from the archaeal–eukaryotic clade, as shown in figure 2b.
Finally, we address some shortcomings of indel analyses. Notably, indel analyses have been hindered by the lack of statistical tests. Here we present a new method (for details see Supplementary Material online) and demonstrate that the 2 indels analyzed here exclude the root from the archaebacterial–eukaryotic clade at a statistically significant level (2%). Also, the analysis of the archaeal RGIT/DTPGH indel is complicated by the nearness of a second eubacterial indel. However, this indel does not affect our conclusion. Because the archaeal–eukaryotic insert is unambiguously absent from all outgroup clades and from the firmicute, actinobacterial, and cyanobacterial ingroup clades, this insert must exclude the root from the archaeal–eukaryotic clade, even if some proteobacterial inserts were to be orthologous to the archaebacterial one. Finally, although some indels are chaotically distributed within the eukaryotic enolases (Harper and Keeling 2004
Our findings indicate that the root of the tree of life is excluded from the Archaea. The ubiquitous presence of the ribosome and its components gives one reasonable confidence that the results obtained here are representative of the evolution of life. Consistent with this, literature searches provide little support for a root within the Archaea, as opposed to many reports that support the traditional root between the Archaea and the Bacteria (Gogarten et al. 1989; Iwabe et al. 1989), but see Philippe and Forterre (1999). For example, a recent phylogenetic analysis of ancient ortholog/paralog sets (Zhaxybayeva and Gogarten 2004) found only a single gene set that placed the root within the Archaea, whereas 9 sets supported the traditional root between the Archaea and the Bacteria and 7 sets supported a root in the Bacteria.
Given how little is known about the location of the root, excluding it from even a part of the tree of life provides important information that may constrain scenarios for life's early evolution. Few ubiquitous genes containing useful indels exist, and the protein synthesis genes contain some of the best. If new analytical methods can be found to increase the number of useful indels and make their interpretation more reliable, then perhaps some day the root may even be mapped to a unique region of the tree of life.
| Supplementary Material |
|---|
Alignments of individual sequences are provided in Table 1, and detailed alignments of 1,129 EF-G and EF-Tu sequences (from the National Center for Biotechnology Information database, 23 February 2006) are provided in Table 2 of the Supplemental Alignments section, which is available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
| Acknowledgements |
|---|
This work was supported by grants from the National Institutes of Health University of California, Los Angeles (NIH UCLA) Cell and Molecular Training Program to J.A.S., from the NIH UCLA Genomic Analysis Training Program to C.W.H., and from the National Science Foundation (NSF) UCLA Bioinformatics Training Program to R.G.S. and by research grants from the NSF Systematics Program, the National Aeronautics and Space Administration UCLA Astrobiology Institute, and the Department of Energy to J.A.L.
| Footnotes |
|---|
Martin Embley, Associate Editor
| References |
|---|
Baldauf S, Palmer J. 1993. Animals and fungi are each other's closest relatives: congruent evidence from multiple proteins. Proc Natl Acad Sci USA 90:11558–62.
Baldauf SL, Palmer JD, Doolittle WF. 1996. The root of the universal tree and the origin of eukaryotes based on elongation factor phylogeny. Proc Natl Acad Sci USA 93:7749–54.
Brown JR, Doolittle WF. 1997. Archaea and the prokaryote-to-eukaryote transition. Microbiol Mol Biol Rev 61:456–502.
Cavalier-Smith T. 2002. The neomuran origin of archaebacteria, the negibacterial root of the universal tree and bacterial megaclassification. Int J Syst Evol Microbiol 52:7–76.
Doolittle WF. 1999. Phylogenetic classification and the universal tree. Science 284:2124–8.
Fitch WM, Upper K. 1987. The phylogeny of tRNA sequences provides evidence for ambiguity reduction in the origin of the genetic code. Cold Spring Harbor Symp Quant Biol 52:759–67.
Gogarten JP, Kibak H, Dittrich P, et al. (13 co-authors). 1989. Evolution of the vacuolar H+-ATPase: implications for the origin of eukaryotes. Proc Natl Acad Sci USA 86:6661–5.
Gupta RS. 1998. Protein phylogenies and signature sequences: a reappraisal of evolutionary relationships among archaebacteria, eubacteria, and eukaryotes. Microbiol Mol Biol Rev 62:1435–91.
Harper JT, Keeling PJ. 2004. Lateral gene transfer and the complex distribution of insertions in eukaryotic enolase. Gene 340:227–35.
Hashimoto T, Hasegawa M. 1996. Origin and early evolution of eukaryotes inferred from the amino acid sequences of translation elongation factors 1alpha/Tu and 2/G. Adv Biophys 32:73–120.
Inagaki Y, Susko E, Roger AJ. 2006. Recombination between elongation factor 1alpha genes from distantly related archaeal lineages. Proc Natl Acad Sci USA 103:4528–33.
Iwabe N, Kuma K, Hasegawa M, Osawa S, Miyata T. 1989. Evolutionary relationship of archaebacteria, eubacteria, and eukaryotes inferred from phylogenetic trees of duplicated genes. Proc Natl Acad Sci USA 86:9355–9.
Kelly D, Karson J, Fruh-Green G, et al. (25 co-authors). 2005. A serpentinite-hosted ecosystem: the Lost City hydrothermal field. Science 307:1428–34.
Kurland CG, Collins LJ, Penny D. 2006. Genomics and the irreducible nature of eukaryote cells. Science 312:1011–4.
Lake JA, Moore JE, Simonson AB, Rivera MC. 2005. Fulfilling Darwin's dream. In: Sapp J, editor. Microbial evolution: concepts and controversies. New York: Oxford University Press. p 184–206.
Lopez P, Forterre P, Philippe H. 1999. The root of the tree of life in the light of the covarion model. J Mol Evol 49:496–508.
Pace NR. 2006. Time for a change. Nature 441:289.
Pearson WR. 1996. Effective protein sequence comparison. Methods Enzymol 266:227–58.
Philippe H, Forterre P. 1999. The rooting of the universal tree of life is not reliable. J Mol Evol 49:509–23.
Rivera MC, Lake JA. 1992. Evidence that eukaryotes and eocyte prokaryotes are immediate relatives. Science 257:74–6.
Russell M, Martin W. 2004. The rocky roots of the acetyl-CoA pathway. Trends Biochem Sci 29:358–63.
Ueno Y, Yamada K, Yoshida N, Maruyama S, Isozaki Y. 2006. Evidence from fluid inclusions for microbial methanogenesis in the early Archaean era. Nature 440:516–9.
Wiegel J, Adams M. 1998. Thermophiles: the keys to molecular evolution and the origin of life? London: Taylor & Francis Inc.
Zhaxybayeva O, Gogarten JP. 2004. Cladogenesis, coalescence and the evolution of the three domains of life. Trends Genet 20:182–7.


Figure 2 (caption fragment). Taxon labels include the proteobacterial subdivisions (among them Pβ), the Chlorobi (Ch), the Cyanobacteria (Cy), the Spirochetes (Sp), the Firmicutes (F), and the Actinobacteria (A). Regions excluded by indels are circled by ellipses and shaded; see text.

