

MBE Advance Access originally published online on December 20, 2006
Molecular Biology and Evolution 2007 24(3):743-756; doi:10.1093/molbev/msl202

© 2006 The Authors
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.


Research Articles

Origin and Evolution of the Mitochondrial Aminoacyl-tRNA Synthetases

Björn Brindefalk, Johan Viklund, Daniel Larsson, Mikael Thollesson and Siv G. E. Andersson

Department of Molecular Evolution, Evolutionary Biology Center, Uppsala University, Uppsala, Sweden

E-mail: siv.andersson{at}ebc.uu.se.


    Abstract
 
Many theories favor a fusion of 2 prokaryotic genomes for the origin of the Eukaryotes, but there are disagreements on the origin, timing, and cellular structures of the cells involved. Equally controversial is the source of the nuclear genes for mitochondrial proteins, although the α-proteobacterial contribution to the mitochondrial genome is well established. Phylogenetic inferences show that the nuclearly encoded mitochondrial aminoacyl-tRNA synthetases (aaRSs) occupy a position in the tree that is not close to any of the currently sequenced α-proteobacterial genomes, despite cohesive and remarkably well-resolved α-proteobacterial clades in 12 of the 20 trees. Two or more α-proteobacterial clusters were observed in 8 cases, indicative of differential loss of paralogous genes or horizontal gene transfer. Replacement and retargeting events within the nuclear genomes of the Eukaryotes were indicated in 10 trees, 4 of which also show split α-proteobacterial groups. A majority of the mitochondrial aaRSs originate from within the bacterial domain, but none specifically from the α-Proteobacteria. For some aaRSs, the endosymbiotic origin may have been erased by ongoing gene replacements on the bacterial as well as the eukaryotic side. For others that accurately resolve the α-proteobacterial divergence patterns, the lack of affiliation with mitochondria is more surprising. We hypothesize that the ancestral eukaryotic gene pool hosted primordial "bacterial-like" genes, to which a limited set of α-proteobacterial genes, mostly coding for components of the respiratory chain complexes, were added and selectively maintained.

Key Words: mitochondria • phylogeny • aminoacyl-tRNA synthetase


    Introduction
 
The evolutionary origin of the eukaryotic genome is debated, particularly the extent to which it is the product of chimerism, genome fusion, or endosymbiotic events (Embley and Martin 2006; Kurland et al. 2006). A hypothesis based on a recent analysis of gene content data is that the eukaryotic genome is the result of a fusion of 2 prokaryotic genomes (Rivera and Lake 2004). Likewise, various endosymbiotic models for the origin of mitochondria and chloroplasts from α-Proteobacteria and Cyanobacteria imply transfers of bacterial genes into the nuclear genome of the eukaryotic host (reviewed in Gray et al. 1999; Dyall et al. 2004; Kurland et al. 2006). Yet, the question of whether the eubacterial partner in the fusion event corresponds to the endosymbiont that contributed the mitochondrial genome (Martin and Koonin 2006) is left unresolved (Martin and Embley 2004; Bapteste and Walsh 2005). Also debated is whether the host was a nucleus-less archaebacterium (Martin 2005) or a highly compartmentalized eukaryotic cell (Mans et al. 2003).

Endosymbiotic models for the origin of mitochondria posit that there was a massive transfer of genes from the bacterial endosymbiont to the nuclear genome of the host. Thus, the expectation is that mitochondrial and nuclear genomes of the Eukaryotes contain genes of α-proteobacterial ancestry, as verified by case studies of mitochondrial proteins involved in aerobic respiration (Gray et al. 1999; Hrdy et al. 2004; Fitzpatrick et al. 2006). However, broader phylogenomic studies show that less than 20% of the proteins examined in the mitochondrial proteomes can be traced back with confidence to the α-Proteobacteria (Gabaldon and Huynen 2003; Karlberg and Andersson 2003). More than 50% of the proteins in the mitochondrial proteomes have no bacterial homologs, and for the remaining circa 30% that have bacterial homologs, a confirmation of the α-proteobacterial descent is lacking (Karlberg and Andersson 2003). The problems in identifying the source of the many bacterial-like genes in the eukaryotic genome have been attributed to rampant horizontal gene transfer, differential gene deletion, extensive duplication, and loss of the phylogenetic signal at deep evolutionary divergences (Creevey et al. 2004; Martin and Embley 2004; Bapteste and Walsh 2005). Indeed, only a small set of essential and broadly distributed core proteins may retain enough of the phylogenetic signal to support inferences of very deep evolutionary relationships.

The focus of this study is on the evolutionary origin of the genes for the aminoacyl-tRNA synthetases (aaRS). Because of their ubiquity, conservation, specificity, and defined interactions in protein synthesis, the aaRS represent important keys to resolve early cellular evolution (Diaz-Lazcoz et al. 1998; Wolf et al. 1999; Woese et al. 2000; Brown 2001, 2003). Although the canonical patterns have been partially eroded by duplication, divergence, and horizontal gene transfers (Brown et al. 2003), traces of ancestral relationships are still evident in many aaRS trees (Woese et al. 2000). A particular advantage with the aaRSs for the purpose of this study is that there are normally 2 nuclear genes of different origins that are targeted to the cytoplasm and the mitochondrion (Kurland and Andersson 2000). This makes it easy to identify and exclude cases of intranuclear gene duplication and replacement events, disguised as duplicate genes of the same origin or as a single gene of either bacterial or eukaryotic descent (Kurland and Andersson 2000).

Our analysis of the aaRS shows that the phylogenetic signal is retained in a majority of cases, as indicated by well-resolved α-proteobacterial species divergences. Yet, none of the mitochondrial aaRS cluster within the α-proteobacterial clade. This contrasts with the divergence pattern observed for other proteins in the mitochondrial energy production system, which cluster with the α-Proteobacteria. The implication of these conflicting signals is discussed in the context of suggested fusion models for the origin of the nuclear genomes of the Eukaryotes.


    Methods
 
Sequence Data
Complete genome sequence data as well as protein sequences of the aaRS were directly retrieved from the National Center for Biotechnology Information database (http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi). In the very few cases where no previous annotation was available, the aaRS genes were identified using the BlastP algorithm (Altschul et al. 1997) with the sequence from the most closely related species used as a query and the best hit in the genome presumed to be the homolog. The data sets used in the analysis consisted of a total of 70 species, selected from completely sequenced genomes (as of 1 October 2005). All published α-proteobacterial genome sequences, a representative selection of other eubacterial groups, and a number of Archaea and Eukaryotes were included for the phylogenetic analysis of the mitochondrial aaRS. In some cases the number of species was lower when no apparent homolog could be found.

Sequence Alignment
Protein sequences were aligned using ClustalW 1.81 (Thompson et al. 1994), and gaps were manually edited with the aid of the Seaview alignment editor (Galtier et al. 1996). The program SOAP, version 1.1 (Löytynoja and Milinkovitch 2001) was used to check for regions of ambiguous alignment by realignments using a wide range of different parameter settings. Nucleotide sequences were aligned using the DiAlign software, version 2.2.1 (Morgenstern 2004), with similarity calculated at the peptide level.

Model Selection
Appropriate protein models were selected for each of the aaRS using the software MODELGENERATOR, version 0.82 (Keane et al. 2006). In all cases the proposed model was WAG (Whelan and Goldman 2001).

Phylogenetic Inference
Neighbor-Joining (NJ) trees were constructed using the amino acid sequences after global and pairwise gap removal, using Phylo_win, version 2.0 (Galtier et al. 1996) on Poisson distances. Maximum likelihood (ML) analyses were done with PHYML, version 2.4.4 (Guindon and Gascuel 2003) using the most appropriate model of protein evolution. Bootstrap support values were derived from 500 replicates for the NJ analyses and 100 replicates for the ML analyses, with the exception of the ML LeuRS tree, for which 500 bootstrap replicates were done. The above steps were performed for all the 18 aaRS, unless otherwise stated.
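As a rough illustration of the distance step, the sketch below computes a Poisson-corrected distance between two aligned protein sequences after pairwise gap removal. It assumes the standard Poisson correction d = -ln(1 - p), where p is the proportion of differing sites; the function name `poisson_distance` and the gap character are illustrative choices, and the code is not taken from Phylo_win.

```python
import math

GAP = "-"  # assumed gap symbol in the alignment


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance d = -ln(1 - p) with pairwise gap removal."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Pairwise gap removal: keep only columns where neither sequence has a gap.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != GAP and b != GAP]
    if not pairs:
        raise ValueError("no ungapped columns shared by the two sequences")
    # p is the observed proportion of differing amino acid sites.
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 1.0:
        return math.inf  # the correction is undefined when every site differs
    return -math.log(1.0 - p)
```

Global gap removal would instead drop every column that contains a gap in any sequence before the pairwise comparison is made.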

Bayesian analyses were done on the PheRS and LeuRS alignments using protein models with MrBayes (MPI version; Huelsenbeck and Ronquist 2001). We applied a fixed rate (empirical) mixture model with a gamma distributed rate variation; the overwhelmingly dominating contribution came from the WAG model. Two independent runs, each with 4 differently heated Markov chains, were run for 10^6 generations, and the first 10^5 generations were discarded as burn-in; all free parameters were examined for convergence. LogDet distances (Lockhart et al. 1994) were calculated using LDDist version 1.4 (Thollesson 2004) to account for differences in amino acid bias. NJ trees as well as Neighbor Net networks to visualize conflicting phylogenetic signals (Bryant and Moulton 2004) were constructed with SplitsTree, version 4 (Huson 1998). NJ trees were also constructed using the distance method of Galtier and Gouy (1995) on nucleotides from first and second codon positions.

Phylogenetic Hypothesis Testing
To test the hypothesis that the best phylogeny in which the mitochondrial sequences form a clade with the α-Proteobacteria is not significantly worse than the unconstrained best hypothesis, we used the SOWH test (Swofford et al. 1996) as described by Goldman and coworkers (as posPfud; Goldman et al. 2000) on the LeuRS and PheRS data sets. The test statistic is the (double) difference in log likelihood between the unconstrained and constrained optimal trees, and the null distribution is calculated using parametric bootstrapping. TreeFinder, version of October 2005 (Jobb et al. 2004), was used to find the best unconstrained and constrained trees (the latter forcing mitochondrial and α-proteobacterial sequences to form a clade), and simulated data sets (100 replicates) were created with Seq-Gen, version 1.3.2 (Rambaut and Grassly 1997), on the best constrained hypothesis. The substitution model and parameters used were those found by the Bayesian analysis (WAG, gamma distributed heterogeneity). For the simulated data sets, amino acid sites corresponding to gaps in the real aligned data were replaced with gaps. Recombination tests were performed using Topali V2, version 2.09 (Milne et al. 2004).
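The logic of the test can be sketched in a few lines. The example below assumes that log likelihoods of the unconstrained and constrained trees are already available for the real data and for each simulated data set; the function names are hypothetical, and the add-one correction in the P value is a common conservative convention assumed here, not something stated in the text.

```python
def sowh_statistic(lnl_unconstrained: float, lnl_constrained: float) -> float:
    """Double difference in log likelihood between the two optimal trees."""
    return 2.0 * (lnl_unconstrained - lnl_constrained)


def sowh_p_value(observed: float, simulated: list[float]) -> float:
    """Share of parametric-bootstrap statistics at least as large as observed.

    Each value in `simulated` is the same statistic computed on one data set
    simulated under the constrained (null) tree. The +1 terms are an assumed
    conservative correction that keeps the estimate away from exactly zero.
    """
    exceed = sum(1 for s in simulated if s >= observed)
    return (exceed + 1) / (len(simulated) + 1)
```

With 100 replicates, none of which reaches the observed statistic, this estimator gives 1/101, just under 0.01.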

Relative Substitution Frequency Estimates
To assess if LeuRS and/or PheRS sequences show an elevated substitution rate in Eukaryotes compared with the α-Proteobacteria, we did a separate Bayesian analysis of the 2 groups. The data were composed of 6 selected amino acid sequences that in Eukaryotes represent mitochondrial and nuclear genes, as well as mitochondrial and cytoplasmic proteins, in addition to PheRS and LeuRS. The data were partitioned into 8 separate character partitions, each with its own rate heterogeneity parameter, but with the same topology. Each partition, however, had its own rate multiplier allowing for different relative substitution rates. These relative rates from the analyses of the eukaryote and α-proteobacterial groups were then compared.

Phylogenomic Analyses
The sequences from the 630 putative protomitochondrial proteins were extracted from the supplementary data in Gabaldon and Huynen (2003) and blasted against all prokaryotic genomes (as of 1 October 2005, http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi) using BlastP (Altschul et al. 1997). Best hits (E < 10^-3) were extracted and aligned using ClustalW 1.81 (Thompson et al. 1994). NJ trees were generated for our updated set of homologous proteins and compared with the published set of homologous proteins (Gabaldon and Huynen 2003). The constructed trees were placed in a database that also included the original trees (Gabaldon and Huynen 2003). A viewer was written in-house to enable visual inspection and comparison of the 2 sets of trees.
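The best-hit extraction could look roughly like the sketch below. It assumes BLAST results saved in the standard 12-column tabular format (query, subject, identity, length, mismatches, gap opens, query start and end, subject start and end, E value, bit score) and picks, for each query, the highest-scoring hit with E < 10^-3. The function name and the choice of bit score as the tie-breaker are assumptions; the text does not say how the hits were actually parsed or whether one hit per genome was kept.

```python
def best_hits(tabular_text: str, max_evalue: float = 1e-3) -> dict:
    """Return {query: (subject, evalue, bitscore)} for the best hit per query."""
    best = {}
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue >= max_evalue:
            continue  # keep only hits with E below the threshold
        # Keep the hit with the highest bit score seen so far for this query.
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best
```

To keep the best hit in every genome rather than overall, the dictionary key could be extended to a (query, genome) pair.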


    Results
 
We were interested in examining the α-proteobacterial contribution to the eukaryotic genome by phylogenetic inference of the mitochondrial aaRS. To this end we inferred tree topologies for each of the 20 aaRS extracted from more than 20 α-proteobacterial and 10 eukaryotic genomes among a total of more than 70 genomes using NJ and ML methods. We sorted the aaRS trees into 3 broad groups based on the cohesion of the α-Proteobacteria and the number of nuclear genes for the mitochondrial and cytoplasmic aaRS. The rationale was that the proteins most likely to disclose the origin of the mitochondrial aaRS are those that resolve the α-proteobacterial divergence pattern and for which 2 different nuclear genes are present in the eukaryotic genome, 1 for the mitochondrial and 1 for the cytoplasmic enzyme.

Cohesion of the α-Proteobacteria
We first tested the retention of the phylogenetic signal for divergences corresponding to the deepest node in the α-proteobacterial clade (fig. 1). The cohesion of the α-proteobacterial subdivision was supported in 12 of the 20 examined aaRS trees, 11 with bootstrap support values above 85% in the NJ and/or the ML analysis (supplementary fig. 1, Supplementary Material online). A species tree was previously inferred for the α-proteobacterial species for which complete genome data are available (Boussau et al. 2004; Fitzpatrick et al. 2005). The species topology suggests that members of the Rhizobiales (including human and animal pathogens such as Bartonella and Brucella spp. and plant-associated species such as Sinorhizobium spp.) cluster distinctly from obligate intracellular bacteria of the order Rickettsiales (Rickettsia, Wolbachia, Anaplasma, and Ehrlichia) (Boussau et al. 2004; Fitzpatrick et al. 2005). This divergence pattern within the α-Proteobacteria was observed in all of the 12 cases, typically with bootstrap support values above 90% (supplementary fig. 2, Supplementary Material online). Thus, in two-thirds of the cases, the tree topologies seem not to be distorted by horizontal gene transfer events that involve the α-Proteobacteria.


 
FIG. 1.— The cohesion and divergence patterns of the α-proteobacterial genomes are supported by 12 of the aaRSs. (A) The topology, bootstrap support values, and genes included in the α-proteobacterial "species tree" were taken from Boussau et al. (2004). (B) The species tree topology is broadly supported by 12 aaRSs according to a ML analysis of the same set of 13 α-proteobacterial species used for construction of the species tree (supplementary fig. 2, Supplementary Material online). Bootstrap support values in boxes are based on ML analyses applied to the PheRS, LeuRS, AspRS, ProRS, TyrRS, TrpRS, AlaRS, AsnRS, CysRS, GlyRS, SerRS, ThrRS, ValRS, MetRS, ArgRS, HisRS, IleRS, and GlnRS alignments in that order, to be read from the left to the right and from the top to the bottom. Bootstrap support values shown in the box at node A were taken from supplementary figure 1 (Supplementary Material online) and those inside the boxes at other nodes were taken from supplementary figure 2 (Supplementary Material online). Bootstrap support values not inside boxes were taken from ML analyses of the PheRS and LeuRS alignments (supplementary fig. 2, Supplementary Material online). Only bootstrap support values above 50% are shown; "_" refers to nodes that are not resolved. The animals and plants indicate the eukaryotic hosts for modern α-proteobacterial species.

 
A schematic representation of the 20 aaRS trees is shown in figure 2 (for the original trees, see supplementary fig. 1, Supplementary Material online). In 6 of the 12 trees that supported a monophyletic grouping of the α-Proteobacteria, 2 eukaryotic genes of different origins were identified (PheRS, LeuRS, ProRS, AspRS, TrpRS, and TyrRS). These represent our best candidates for tracing the ancestral origin of the mitochondrial aaRS. Losses and putative replacements of the nuclear genes were observed in the other 6 of these 12 trees (AlaRS, GlyRS, CysRS, ThrRS, AsnRS, and SerRS). The remaining 8 trees revealed 2 or more divergent α-proteobacterial clusters (HisRS, ValRS, IleRS, GluRS, ArgRS, MetRS, GlnRS, and LysRS), indicative of lineage-specific loss of paralogous genes and/or horizontal gene transfers across the α-proteobacterial borders. Below, we discuss the placement of the mitochondrial lineage in each of the 20 tree topologies.


 
FIG. 2.— Schematic illustration of the relative placement of α-proteobacterial and mitochondrial taxa in each of the 20 aaRS trees. Letters and colors of the triangles refer to broad species categories: Red, Alfa = α-Proteobacteria; white, Bac = Bacteria; yellow, Mito = mitochondria; blue, Cyto = eukaryote cytoplasmic; black, Arc = Archaea; green, Plas = plastid and/or plant mitochondria; and mixed blue and yellow, Euk = eukaryotic cytoplasmic and mitochondrial; shaded colors indicate mixed groups and a single line in a triangle indicates a single species of a different category. Sizes of the triangles are proportional to the number of species in each group. Numbers refer to bootstrap support values (>75%) for inferences based on ML and NJ methods in that order. The rightmost values in the PheRS and LeuRS trees were taken from a Bayesian analysis.

 
Phylogenetic Inference of PheRS, LeuRS, AspRS, ProRS, TyrRS, and TrpRS
To begin, we examined the topology of the PheRS, LeuRS, AspRS, ProRS, TyrRS, and TrpRS trees, each of which resolved the α-proteobacterial divergence pattern and was represented by 2 nuclear genes for the mitochondrial and cytoplasmic aaRS, respectively. Phylogenetic hypotheses were constructed with the aid of ML, NJ (supplementary fig. 1, Supplementary Material online), and Neighbor Net methods (supplementary fig. 3, Supplementary Material online). Additionally, we applied Bayesian analyses to the PheRS and LeuRS alignments (fig. 3). All trees were consistent with the 3-domain hypothesis in that the cytoplasmic and archaeal aaRS formed a group to the exclusion of the bacterial and mitochondrial proteins (Wolf et al. 1999; Woese et al. 2000). However, irrespective of the method used and the aaRS analyzed, the results consistently placed the node of the mitochondrial divergence within the bacterial domain, but distinct from the α-proteobacterial clade (fig. 2).


 
FIG. 3.— The tree topologies obtained from alignments of (A) PheRS and (B) LeuRS in a representative set of species from Bacteria, Archaea, and Eukaryotes. Numbers refer to bootstrap support values (>75%) with ML, NJ, and Bayesian methods, respectively. Colors refer to broad species categories (red, α-Proteobacteria; yellow, mitochondria; blue, Eukaryotes; green, plant mitochondria/chloroplast). The PheRS tree was rooted between the divergence of Bacteria and Eukaryotes–Archaea. As outgroups for the LeuRS tree, we used the ValRS and IleRS. A complete set of trees for all of the 20 aaRSs is shown in supplementary figure 1 (Supplementary Material online).

 
The PheRS tree (fig. 3A; supplementary fig. 1, Supplementary Material online) showed similar divergence patterns within the mitochondrial and cytoplasmic clusters, with each clade being supported by bootstrap support values of 100% in the ML analysis. The mitochondrial aaRS represented a deeply diverging clade within the bacterial domain, whereas the cytoplasmic aaRS were most similar to their homologs in the Archaea. The bacterial PheRS consist of 2 subunits, α and β, and are encoded by the pheS and pheT genes that are normally situated in the same operon (Brown 2001). The mitochondrial PheRS is a fusion product of the N-terminal part of the α-subunit and the C-terminal part of the β-subunit (Roy et al. 2005). We concatenated the 2 subunits in species where the protein was split. However, to ensure that the PheRS topology was not obscured by different signals from the 2 PheRS subunits, we also inferred phylogenetic trees separately for the α- and β-chains. Each subunit individually yielded the same topology as that obtained for the combined sequences (data not shown).

The LeuRS tree (fig. 3B; supplementary fig. 1, Supplementary Material online) showed a similar separation of a broad bacterial group (that includes mitochondria) from an archaeal-cytosolic group, except that the LeuRS from Halobacterium sp. was of the bacterial type (Dohm et al. 2006). As in the analysis of PheRS, identical divergence patterns were observed for the 2 sets of nuclear-encoded proteins, with each of the cytoplasmic and mitochondrial clusters being supported by high bootstrap support values. The mitochondrial lineage clustered within the bacterial domain, but showed no particular affiliation with either the α-Proteobacteria or any other bacterial group included in the analysis.

The ProRS tree (fig. 2; supplementary fig. 1, Supplementary Material online) showed some exceptions to the universal pattern as also noted previously (Woese et al. 2000), such as the placement of the plastids and some bacterial species within the eukaryotic domain. Nevertheless, similar divergence patterns were observed for the 2 sets of nuclear-encoded proteins, with the mitochondrial aaRS representing a deeply diverging branch in the bacterial domain. The mitochondrial AspRS lineage formed a cluster with 100% bootstrap support and was embedded in the bacterial domain (fig. 2; supplementary fig. 1, Supplementary Material online), but again was not affiliated with any particular bacterial group. The plant proteins clustered with the Cyanobacteria, consistent with their endosymbiotic origin. An interesting detail in this tree is that Chlorobium tepidum showed a close relationship with the α-Proteobacteria. Also surprising was that Bacillus anthracis and Deinococcus radiodurans clustered within the archaeal domain.

Five bacterial subtypes were previously suggested for TrpRS, with D. radiodurans containing 2 variants (Woese et al. 2000). As in the previous tree topologies, the mitochondrial enzymes were embedded within the bacterial domain, whereas the cytoplasmic aaRS clustered with those of the Archaea (fig. 2; supplementary fig. 1, Supplementary Material online). The plant organelle aaRS clustered with the Cyanobacteria, suggesting that they were acquired via the chloroplast endosymbiont. Two or more bacterial subtypes were also suggested for TyrRS (Woese et al. 2000). In the TyrRS tree presented here, the mitochondrial aaRS from animals and fungi represent a distinct clade in the bacterial domain, distantly related to a large group of bacteria including Chlamydia spp., Escherichia coli, and Bacillus subtilis. The cytoplasmic TyrRS cluster with the Archaea as expected, but there is an interesting split between the animal, fungal, and Encephalitozoon aaRSs in one clade and the plant cytoplasmic aaRS together with Trypanosoma cruzi and Giardia intestinalis in another clade (fig. 2; supplementary fig. 1, Supplementary Material online).

Split α-Proteobacterial Tree Topologies
Deviations from the inferred α-proteobacterial species tree were observed in the remaining 8 aaRS trees in that not all α-proteobacterial species formed a monophyletic clade (fig. 2; supplementary fig. 1, Supplementary Material online). Rather, the α-proteobacterial genes for HisRS, ValRS, IleRS, GluRS, GlnRS, ArgRS, MetRS, and LysRS were split into 2 or more clusters, each of which was supported by bootstrap support values higher than 90%. The divergence pattern within each cluster was typically congruent with the expected species divergence pattern, suggesting differential loss of ancient paralogs or rare cases of gene exchanges across the domain borders that involved the α-Proteobacteria. The mitochondrial aaRS were often but not always of bacterial origin, but in either case showed no affiliation with any of the partial α-proteobacterial clades.

For example, 2 highly divergent α-proteobacterial clades were observed in the HisRS tree, one of which encompasses the Rickettsiales and additional bacterial species, whereas the other clade contains the Rhizobiales, other Bacteria, the Eukaryotes, and the Archaea. Recently, we demonstrated perfect coevolution of the HisRS sequence and the tRNAHis identity elements for the 2 α-proteobacterial clades (Ardell and Andersson 2006).

The IleRS and ValRS trees also suggested a split of the α-Proteobacteria into 2 clades, but unlike the HisRS tree, members of the Rickettsiales clustered in the same group as the Archaea. It has previously been shown that 2 bacterial genes, ileS1 and ileS2, are maintained in the Pseudomonas fluorescens, Bacillus cereus, and B. anthracis genomes. The latter gene, ileS2, belongs to the Archaea-cytoplasmic clade and confers resistance to naturally produced antibiotic compounds (Brown et al. 2003). Here, we show that members of the Rickettsiales cluster with the ileS2 group, whereas the mitochondrial IleRS clusters only with Bacteria of the ileS1 type, including members of the Rhizobiales.

Likewise, a second gene for MetRS has been identified in Streptococcus pneumoniae that confers resistance to antibiotic compounds (Brown et al. 2003). Homologs of this second gene were also observed in gram-positive bacteria, and it was speculated that antibiotic resistance has served as the selection agent for horizontal gene transfer of MetRS genes (Brown et al. 2003). Considering this, it is perhaps surprising that the α-proteobacterial divergences are nevertheless well resolved in our MetRS tree, with the exception of the fresh and salt water isolates Caulobacter crescentus and Silicibacter pomeroyi that cluster with the cytoplasmic and archaeal homologs (fig. 2; supplementary fig. 1, Supplementary Material online). The MetRS tree topology is consistent with an early diverging mitochondrial clade within the Bacteria, distinct from the rest of the α-Proteobacteria.

GluRS conforms to the classical 3-domain pattern in that the archaeal and cytoplasmic aaRS were well separated from the bacterial group (Woese 2000). This is the only aaRS that is encoded by 2 or more paralogous genes in some of the α-proteobacterial species (fig. 2; supplementary fig. 1, Supplementary Material online). The mitochondrial aaRS are clearly of bacterial origin, but not affiliated with any of the 3 partial α-proteobacterial clades. Finally, we show that a majority of the α-proteobacterial species contain LysRS of class I, whereas the single eukaryotic gene for LysRS (which is unlikely to represent the ancestral mitochondrial gene because it clusters with the Archaea) belongs to class II, as do the LysRS of the plant-associated α-Proteobacteria.

Nuclear Replacements of Mitochondrial and Cytoplasmic aaRS Genes
In a total of 8 trees, we identified only a single eukaryotic aaRS gene in animals and fungi, suggesting loss and replacement of either the ancestral mitochondrial or cytoplasmic gene. Four of these aaRS (AlaRS, GlyRS, CysRS, and ThrRS) revealed a monophyletic grouping of the α-Proteobacteria, whereas the other 4 (HisRS, ValRS, GlnRS, and LysRS) showed 2 or more divergent α-proteobacterial clades (fig. 2; supplementary fig. 1, Supplementary Material online). The clustering of the single eukaryotic protein with archaeal proteins in the AlaRS, GlyRS, and the LysRS trees might indicate that the ancestral cytoplasmic gene has been retained. Vice versa, the clustering of the eukaryotic lineage with bacterial species in the CysRS, ThrRS, ValRS, and GlnRS trees signals retention of the ancestral mitochondrial aaRS, which now has a dual function in both the cytoplasm and the mitochondrion. None of the single eukaryotic aaRS clustered with any of the α-proteobacterial clades.

Although we identified 2 nuclear genes for the cytoplasmic and mitochondrial enzymes in the AsnRS and SerRS trees, neither displayed the characteristic mitochondrial-bacterial and cytoplasmic-archaeal patterns (fig. 2; supplementary fig. 1, Supplementary Material online). The AsnRS tree suggests several different subtypes, with the mitochondrial and the cytoplasmic enzymes belonging to the same overall broad group, consistent with ancestral gene duplication and replacement events. We also found a peculiar pattern for SerRS; the mitochondrial aaRS clustered with the Archaea, whereas the cytoplasmic enzyme grouped with bacteria other than the α-Proteobacteria. Taken together, our analysis suggests that the evolution of these mitochondrial aaRS has been accompanied by loss, duplication, and intranuclear gene replacement events. As in all other trees, none clusters with the α-Proteobacteria.

Testing for Systematic Errors
We took great precautions to avoid systematic errors (e.g., long-branch attraction) by applying phylogenetic methods designed to minimize such problems. For example, we assessed the effects of removing fast-evolving sites (as designated by the Shannon–Wiener index) when calculating LogDet distances on the alignments and subjected them to Neighbor Net analyses (supplementary fig. 3, Supplementary Material online), as exemplified with PheRS and LeuRS (fig. 4). The mitochondrial grouping essentially remained stable until the phylogenetic signal was lost due to the removal of too many sites. The only exception was the TrpRS tree in which the mitochondrial clade approached that of the α-Proteobacteria following the gradual removal of variable sites. We also tested the observed ML estimate for the position of the mitochondrial clade in the PheRS and LeuRS trees against a forced position within the α-proteobacterial group with a likelihood ratio test using the SOWH procedure. The test confirmed in both cases that a position of mitochondria within the α-proteobacterial group had a significantly lower likelihood than the ML estimate (P < 0.01). Recombination tests revealed no instances of recombination events that could explain the separate clusterings of mitochondria and α-Proteobacteria (data not shown).
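The site-removal step can be illustrated with a small sketch. It assumes that each alignment column is scored with the Shannon–Wiener index H = -Σ p_i ln p_i over the residue frequencies in that column, that gaps are ignored when scoring, and that a fixed fraction of the highest-scoring columns is dropped. The function names and the treatment of gaps are assumptions made for illustration, not details given in the text.

```python
import math
from collections import Counter


def shannon_index(column: str) -> float:
    """Shannon-Wiener index of one alignment column, ignoring gaps."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    n = len(residues)
    return -sum((k / n) * math.log(k / n) for k in Counter(residues).values())


def drop_most_variable_sites(alignment: list[str], fraction: float) -> list[str]:
    """Remove the given fraction of columns with the highest index values."""
    n_sites = len(alignment[0])
    columns = ["".join(seq[i] for seq in alignment) for i in range(n_sites)]
    scores = [shannon_index(col) for col in columns]
    n_drop = int(n_sites * fraction)
    # Rank columns from most to least variable and mark the top ones for removal.
    ranked = sorted(range(n_sites), key=lambda i: scores[i], reverse=True)
    dropped = set(ranked[:n_drop])
    keep = [i for i in range(n_sites) if i not in dropped]
    return ["".join(seq[i] for i in keep) for seq in alignment]
```

Calling `drop_most_variable_sites(alignment, 0.30)` would correspond to excluding 30% of the sites; distances would then be recomputed on the reduced alignment.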


FIG. 4.— Neighbor net analyses of (A, C) PheRS and (B, D) LeuRS based on LogDet distances. The effects of removing the fastest evolving sites (here ranked according to the Shannon–Wiener index) from the analyses (C, D) are illustrated with 30% of the sites excluded. Taxonomic groups are indicated by letters: red = {alpha}-Proteobacteria, yellow = mitochondria, and blue = eukaryote. Abbreviations of species names are spelled out in supplementary table 2, Supplementary Material online.

 
Finally, to examine whether the mitochondrial LeuRS and/or PheRS showed an elevated substitution rate relative to the {alpha}-proteobacterial aaRS, we did a separate Bayesian analysis of the 2 groups (fig. 5). For comparison, we also examined 2 mitochondrially encoded proteins (COX-1 and COB) as well as 4 nuclear-encoded mitochondrial components of the pyruvate dehydrogenase complex (PdhA, PdhB, and PdhC) and the NADH dehydrogenase complex (NADH dehydrogenase subunit F). These were selected because they support a clustering of mitochondria and {alpha}-Proteobacteria (supplementary fig. 4, Supplementary Material online). In most instances, the amino acid substitution frequencies were slightly higher for the mitochondrial group, but this effect was seen irrespective of the tree topology. Thus, the relative rate enhancement for genes that supported a clustering of mitochondria and {alpha}-Proteobacteria was no different from that for genes that failed to do so, suggesting that rate acceleration alone cannot explain the lack of affiliation of the mitochondrial aaRS with the {alpha}-Proteobacteria.


FIG. 5.— Relative substitution rates in the {alpha}-proteobacterial and the eukaryotic clades. The relative rates were obtained from a Bayesian analysis of each group where the 8 different partitions (=genes) were allowed to have different relative rates. The rate is normalized so the mean relative rate of PdhA, PdhC, and PdhD equals 1 within each group. The 8 partitions analyzed were PheRS, LeuRS, cytochrome b (COB), cytochrome oxidase subunit 1 (COX-1), NADH dehydrogenase subunit F (NuoF), and subunits of the pyruvate dehydrogenase complex: E1 component alpha subunit (PdhA), dihydrolipoamide acyltransferase E2 component (PdhC), and dihydrolipoamide dehydrogenase E3 component (PdhD). The PheRS and LeuRS trees are shown in supplementary figure 1 (Supplementary Material online). The COX-1, COB, NuoF, PdhA, PdhC, and PdhD trees are shown in supplementary figure 4 (Supplementary Material online).
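
The normalization described in the legend can be sketched as follows. This is only an illustration of the arithmetic, assuming the per-partition rates for one group are available as a dictionary; the function name and the numbers in the usage example are hypothetical and are not values from the study.

```python
def normalize_rates(rates, reference_genes=("PdhA", "PdhC", "PdhD")):
    """Rescale per-gene relative rates so that the mean rate of the
    reference genes equals 1 within the group."""
    ref_mean = sum(rates[g] for g in reference_genes) / len(reference_genes)
    return {gene: rate / ref_mean for gene, rate in rates.items()}


# Hypothetical raw rates for one group (made-up numbers for illustration).
example_rates = {"PdhA": 1.0, "PdhC": 2.0, "PdhD": 3.0, "LeuRS": 3.0}
normalized = normalize_rates(example_rates)
```

Because each group is rescaled against the same reference genes, a rate above 1 for a given partition indicates that it evolves faster than the pyruvate dehydrogenase reference set within that group, which makes the two groups comparable.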

 
The Protomitochondrial Proteome Revisited
A previous application of phylogenetic methods to a set of more than 70,000 bacterial and eukaryotic proteins identified 630 nuclear eukaryotic genes as ancestrally derived from the {alpha}-proteobacterial endosymbiont (Gabaldon and Huynen 2003). That estimate was based on an automatic survey at the whole-genome level for clusters that contain mitochondria and proteobacterial species. We reexamined the 630 suggested genes in the protomitochondrion with a 4-fold larger {alpha}-proteobacterial species set using more stringent criteria. To avoid spurious relationships across small sets of taxa, we included only genes that were broadly present in Bacteria and Eukaryotes, defined as gene presence in at least 10 out of 25 {alpha}-proteobacterial species, in 4 out of 9 Eukaryotes, and in 15 additional bacterial species. Using this cutoff level for inclusion, we reevaluated 120 genes in the protomitochondrial data set. About 40 trees supported a clustering of {alpha}-Proteobacteria with mitochondria, the most significant of which were observed for proteins involved in the pyruvate dehydrogenase and respiratory chain complexes (see supplementary fig. 4, Supplementary Material online for a few examples). None of the aaRS trees showed similar support for the clustering of mitochondria and {alpha}-Proteobacteria.
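
The inclusion cutoff stated above can be expressed as a small filter. This is a sketch under the assumption that, for each gene family, the number of species carrying the gene has already been counted in each of the three categories; the function name, the family names, and the counts are hypothetical.

```python
def broadly_present(alpha_count, eukaryote_count, other_bacteria_count,
                    min_alpha=10, min_eukaryotes=4, min_other_bacteria=15):
    """Apply the cutoff from the text: presence in at least 10 of the 25
    alpha-proteobacterial species, 4 of the 9 eukaryotes, and 15
    additional bacterial species."""
    return (alpha_count >= min_alpha
            and eukaryote_count >= min_eukaryotes
            and other_bacteria_count >= min_other_bacteria)


# Hypothetical counts: (alpha-proteobacteria, eukaryotes, other bacteria).
families = {"famA": (18, 6, 40), "famB": (3, 7, 50), "famC": (12, 2, 20)}
kept = [name for name, counts in families.items() if broadly_present(*counts)]
```

Only families that pass all three thresholds are retained, which is what reduces the candidate set before trees are inferred.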


    Discussion
The early evolution of the eukaryotic cell and the origin of the many bacterial-like genes in the eukaryotic genome remain an enigma that has proven hard to resolve. These genes may have been acquired en masse from the endosymbionts that gave rise to mitochondria (reviewed in Gray et al. 1999; Dyall et al. 2004; Embley and Martin 2006), obtained via horizontal transfer of individual bacterial genes (Lester et al. 2005), and/or derived from a consortium of bacterial endosymbionts. Unlike most previous attempts to discern the origin of the eukaryotic genome, such as most recently in the "Ring of Life" hypothesis (Rivera and Lake 2004), our aim was to place the origin of the bacterial-like genes for aminoacylation processes "in relation to" the previously demonstrated {alpha}-proteobacterial origin for a subset of genes coding for components of the mitochondrial respiratory chain system (Gray et al. 1999; Hrdy et al. 2004; Fitzpatrick et al. 2006).

There is no reason to expect a priori that every single mitochondrial gene of a truly {alpha}-proteobacterial origin should branch with the {alpha}-Proteobacteria in a phylogenetic analysis. This is because of the inefficiency of single-gene sequences to recover true evolutionary relationships, which places constraints on our ability to identify mitochondrial protein ancestors in retrospect. Such concerns should be taken seriously, as demonstrated by Esser et al. (2004), who examined all genes in the mitochondrial genomes by rigorous phylogenetic methods and found that although all are likely to share a common ancestor and truly be of {alpha}-proteobacterial origin, not every gene supported a clustering with the {alpha}-Proteobacteria. Indeed, many genes have been proven unsuitable for phylogenetic studies at deep divergences due to duplications, horizontal gene transfers, and rapid sequence evolution (Martin 1999; Creevey et al. 2004; Martin and Embley 2004; Bapteste and Walsh 2005).

The novelty of our approach is that we have tested for the retention of the phylogenetic signal at the level of the individual gene by examining the congruence of the aaRS tree with the {alpha}-proteobacterial species tree (fig. 1). This approach is valid under the condition that an {alpha}-proteobacterial species tree exists and can be inferred with confidence. Indeed, the same {alpha}-proteobacterial species tree topology was inferred in 2 studies independently using different approaches. In one study, the topology was inferred from a concatenated alignment of 20–40 genes with conserved gene order structures in the Rhizobiales (Boussau et al. 2004). Another analysis utilized a super-tree approach, in which a combined set of 418 single-gene families served as the input data (Fitzpatrick et al. 2006). Incompatibilities were estimated for less than 20% of the information genes and 50% of the operational genes, which set the upper limits for horizontal gene transfer, gene paralogy, or systematic biases in the inference methods (Fitzpatrick et al. 2006). We conclude that a meaningful {alpha}-proteobacterial species relationship can be inferred with high confidence (Boussau et al. 2004; Fitzpatrick et al. 2006).

A concordance of the aaRS trees with the {alpha}-proteobacterial species tree was observed in 12 cases, suggesting retention of the phylogenetic signal at the deepest level of the {alpha}-proteobacterial clade. The presence of 2 different genes for the mitochondrial and cytoplasmic aaRS in 6 of these trees makes it unlikely that gene replacements on either side have obscured the underlying phylogenetic signals. In these cases, the gene genealogy seems not to have been massively eroded by horizontal gene transfer events or rapid sequence evolution. Yet, no evidence for a clustering of mitochondrial and {alpha}-proteobacterial aaRS was found.

In the remaining 8 trees, we observed a split of the {alpha}-Proteobacteria species into 2 unrelated groups, indicative of horizontal gene transfers and/or differential loss of ancestral paralogs. Among the 12 that supported a monophyletic {alpha}-proteobacterial clade, potential gene replacement events within the nuclear genome of the Eukaryotes were observed in 6 cases. Such retargeting events within the eukaryotic genome place constraints on our ability to discern the bacterial origin of these mitochondrial aaRS. The HisRS and ValRS trees were particularly complex, signaling gene transfers on the {alpha}-proteobacterial as well as the eukaryotic side.

We are aware that even aaRS that support the cohesion of the {alpha}-Proteobacteria and that are encoded by 2 different nuclear genes may fail to reveal the true origin of the mitochondrial proteins because of methodological problems. The position of the mitochondrial clade in the aaRS trees could potentially be an artifact of an evolutionary rate acceleration in the mitochondrial lineages relative to the {alpha}-proteobacterial branches, as observed for some of the most rapidly evolving genes in intracellular symbionts and parasites (Canback et al. 2004; Thomarat et al. 2004). However, given the many tests performed in this study, the tree topologies seem robust. In particular, mitochondrial proteins that fail to support the {alpha}-proteobacterial origin do not evolve more rapidly than those that support such a relationship. Because the nuclear genes coding for components of the mitochondrial respiratory processes accurately disclosed an affiliation with the {alpha}-Proteobacteria, we are convinced that our methods would be fully capable of identifying an {alpha}-proteobacterial origin also for the aaRS, had it existed.

The different phylogenetic placements of the mitochondrial aaRS and the respiratory chain proteins are difficult to reconcile with hypotheses that try to explain the origin of the Eukaryotes by the symbioses of 2 partners of well-defined groups, such as Archaea with {alpha}-Proteobacteria, Thermoplasma with spirochetes, methanogens with {delta}-Proteobacteria, Sulfolobus with Clostridium, or Pyrococcus with {gamma}-Proteobacteria (for a review of suggested partners, see Embley and Martin 2006). In these hypotheses, the underlying assumption is that Bacteria and Archaea evolved as 2 identifiable lineages long before the Eukaryotes emerged and that the partners involved in the fusion process had already at this stage separated from their most closely related sister groups. The explicit suggestion is that the gene set of the Eukaryotes is the sum of what was present in the 2 partners plus some extra genes added later by horizontal gene transfer. If so, we expect the majority of nuclear genes for mitochondrial proteins to show an affiliation that is consistent with their natural history, for example, with the {alpha}-Proteobacteria.

However, despite the essentiality of aaRS in protein synthesis, we failed to recover the anticipated {alpha}-proteobacterial source of these genes. These results are consistent with previous phylogenetic studies (Brown 2001, 2003) that have also noted the lack of evolutionary concordance with the classical endosymbiotic theory. In a previous study of the PheRS {alpha}-subunit that included 2 {alpha}-proteobacterial species (Brown 2001) as well as in our study that included a larger and more representative set of {alpha}-proteobacterial species (fig. 3A), the mitochondrial lineage was positioned near the base of the bacterial tree rather than with the {alpha}-Proteobacteria. Like the aaRS, eukaryotic genes for glycolysis are bacterial like but also do not cluster specifically with the {alpha}-Proteobacteria (Canback et al. 2002). The strong support for an affiliation between mitochondria and {alpha}-Proteobacteria for some genes and the lack thereof for others suggest the acquisition of bacterial-like genes into the ancestral eukaryotic genome from various different sources.

One possibility is that the {alpha}-proteobacterial ancestor itself contained a naturally diverse collection of genes (Martin and Koonin 2006) or arose long before the emergence of the {alpha}-proteobacterial species analyzed in this study. Alternatively, some of the aaRS may be remnants of endosymbiotic attempts that failed (Doolittle 1998; Brown 2003) prior to the successful establishment of an endosymbiotic relationship with the {alpha}-Proteobacteria. A replacement of primary endosymbiont gene functions by those of secondary endosymbionts has been observed in the case of aphid endosymbionts (Koga et al. 2003; Perez-Brocal et al. 2006). If replacements of bacterial gene functions have occurred in endosymbiotic relationships that are only a few hundred million years old, it is perhaps not unreasonable to think that remnants of premitochondrial endosymbionts could explain some of the anomalies of the mitochondrial proteins that have evolved over a billion-year time period. If this holds true, the implication is that a protoeukaryotic cell might have existed already prior to the acquisition of the mitochondrion.

Another possibility is that only a limited set of {alpha}-proteobacterial genes were transferred into the mitochondrial genome and processed by endosymbionts or bacterial genes acquired via other evolutionary routes. The recent identification of intramitochondrial {alpha}-proteobacterial symbionts in ovarian cells of ticks provides conditions under which {alpha}-proteobacterial genes may be transferred into the mitochondrial genomes even in modern times (Beninati et al. 2004; Sassera et al. 2006). In light of this, it is interesting to note that the sex-ratio distorter Wolbachia pipientis has been shown to have transferred some of its genes into the nuclear genome of its arthropod hosts (Kondo et al. 2002). The transfer of {alpha}-proteobacterial genes for components of the respiratory chain complexes may hide acquisitions of other genes into the eukaryotic genome that are more difficult to trace evolutionarily.

In this context, it is noteworthy that other mitochondrial information processing enzymes, such as RNA polymerase, DNA polymerase, and replicative helicases, are not of {alpha}-proteobacterial origin but rather are similar to homologs of T-odd bacteriophages (Filée et al. 2003; Filée and Forterre 2005; Shutt and Gray 2005, 2006; Forterre 2006). Most of these genes are nucleus encoded, although T phage–like RNA polymerase genes have been identified in mitochondrial genomes and plasmids. Cryptic prophages of the T-odd type have also been discovered in several proteobacterial genomes, including those of {alpha}-Proteobacteria (Filée and Forterre 2005). Given the recent identification of aaRS genes in giant Mimivirus (Abergel et al. 2005), a viral origin or a viral transmission route might be considered. However, the mimiviral sequences are highly diverged and cluster specifically with neither the mitochondrial aaRS nor bacterial or cytoplasmic aaRS (data not shown).

The emerging evolutionary scenario is increasingly complex with mitochondrial contributions from bacterial, eukaryotic, and viral partners. Ongoing gene loss and replacements may go a long way to explain why there are so few {alpha}-proteobacterial genes with homologs in eukaryotic genomes (Boussau et al. 2004). Thus, some "noise" is to be anticipated; however, it is remarkable that "none" of the aaRS trees supports an affiliation between mitochondria and {alpha}-Proteobacteria, not even those that otherwise show a strong retention of the phylogenetic signal over the time span considered. We can think of several possible explanations, among which we favor the simplest, namely, that the mitochondrial aaRS have been acquired from sources other than the {alpha}-Proteobacteria.


    Supplementary Material
Supplementary figures 1–4 and table 2 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).


    Acknowledgements
This research was supported by grants from the Swedish Research Council, the Göran Gustafsson Foundation, the Swedish Foundation for Strategic Research, and the Wallenberg Foundation.


    Footnotes
 
Martin Embley, Associate Editor


    References

    Abergel C, Chenivesse S, Byrne D, Suhre K, Arondel V, Claverie JM. (2005) Mimivirus TyrRS: preliminary structural and functional characterization of the first amino-acyl tRNA synthetase found in a virus. Acta Crystallogr Sect F Struct Biol Cryst Commun 61:212–215.

    Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402.

    Ardell DG and Andersson SGE. (2006) TFAM detects co-evolution of tRNA identity rules with lateral transfer of histidyl-tRNA synthetase. Nucleic Acids Res 34:893–904.

    Bapteste E and Walsh TM. (2005) Does the ‘Ring of Life’ ring true? Trends Microbiol 13:256–261.

    Beninati T, Lo N, Sacchi L, Genchi L, Noda H, Bandi C. (2004) A novel alpha-Proteobacterium resides in the mitochondria of ovarian cells of the tick Ixodes ricinus. Appl Environ Microbiol 70:2596–2602.

    Boussau B, Karlberg EO, Frank AC, Legault B-A, Andersson SGE. (2004) Computational inference of scenarios for alpha-proteobacterial genome evolution. Proc Natl Acad Sci USA 101:9722–9727.

    Brown JR. (2001) Genomic and phylogenetic perspectives on the evolution of prokaryotes. Syst Biol 50:497–512.

    Brown JR. (2003) Ancient horizontal gene transfer. Nat Rev Genet 4:121–132.

    Brown JR, Gentry D, Becker AJ, Ingraham K, Holmes DJ, Stanhope MJ. (2003) Horizontal transfer of drug-resistant aminoacyl-transfer-RNA synthetases of anthrax and gram-positive pathogens. EMBO Rep 4:692–698.

    Bryant D and Moulton V. (2004) Neighbor-net: an agglomerative method for the construction of phylogenetic networks. Mol Biol Evol 21:255–265.

    Canback B, Andersson SGE, Kurland CG. (2002) The global phylogeny of glycolytic enzymes. Proc Natl Acad Sci USA 99:6097–6102.

    Canback B, Tamas I, Andersson SGE. (2004) A phylogenomic study of endosymbiotic bacteria. Mol Biol Evol 21:1110–1122.

    Creevey CJ, Fitzpatrick DA, Philip GK, Kinsella RJ, O'Connell MJ, Pentony MM, Travers SA, Wilkinson M, McInerney JO. (2004) Does a tree-like phylogeny only exist at the tips in the prokaryotes? Proc R Soc Lond B Biol Sci 271:2551–2558.

    Diaz-Lazcoz Y, Aude J-C, Nitschké P, Chiapello H, Landes-Devauchelle C, Risler L. (1998) Evolution of genes, evolution of species: the case of aminoacyl-tRNA synthetases. Mol Biol Evol 15:1548–1561.

    Dohm JC, Vingron M, Staub E. (2006) Horizontal gene transfer in aminoacyl-tRNA synthetases including leucine-specific subtypes. J Mol Evol 63:437–447.

    Doolittle WF. (1998) You are what you eat: a gene transfer ratchet that could account for bacterial genes in eukaryotic nuclear genomes. Trends Genet 14:307–311.

    Dyall SD, Brown MT, Johnson PJ. (2004) Ancient invasions: from endosymbionts to organelles. Science 304:253–257.

    Embley TM and Martin W. (2006) Eukaryotic evolution, changes and challenges. Nature 440:623–630.

    Esser C, Ahmadinejad N, Wiegand C. (15 co-authors). (2004) A genome phylogeny for mitochondria among alpha-proteobacteria and a predominantly eubacterial ancestry of yeast nuclear genes. Mol Biol Evol 21:1643–1660.

    Filée J and Forterre P. (2005) Viral proteins functioning in organelles: a cryptic origin? Trends Microbiol 13:510–513.

    Filée J, Forterre P, Laurent J. (2003) The role played by viruses in the evolution of their hosts: a view based on informational protein phylogenies. Res Microbiol 154:237–243.

    Fitzpatrick DA, Creevey CJ, McInerney JO. (2006) Genome phylogenies indicate a meaningful alpha-proteobacterial phylogeny and support a grouping of the mitochondria with the Rickettsiales. Mol Biol Evol 23:74–85.

    Forterre P. (2006) Three RNA cells for ribosomal lineages and three DNA viruses to replicate their genomes: a hypothesis for the origin of cellular domain. Proc Natl Acad Sci USA 103:3669–3674.

    Gabaldon T and Huynen MA. (2003) Reconstruction of the proto-mitochondrial metabolism. Science 301:609.

    Galtier N and Gouy M. (1995) Inferring phylogenies from DNA sequences of unequal base compositions. Proc Natl Acad Sci USA 92:11317–11321.

    Galtier N, Gouy M, Gautier C. (1996) SEAVIEW and PHYLO_WIN: two graphic tools for sequence alignment and molecular phylogeny. Comput Appl Biosci 12:543–548.

    Goldman N, Anderson JP, Rodrigo AG. (2000) Likelihood-based tests of topologies in phylogenetics. Syst Biol 49:652–670.

    Gray MW, Burger G, Lang BF. (1999) Mitochondrial evolution. Science 283:1476–1481.

    Guindon S and Gascuel O. (2003) A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 52:696–704.

    Hrdy I, Hirt RP, Dolezal P, Bardonova L, Foster PG, Tachezy J, Embley TM. (2004) Trichomonas hydrogenosomes contain the NADH dehydrogenase module of mitochondrial complex I. Nature 432:618–622.

    Huelsenbeck JP and Ronquist F. (2001) MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17:754–755.

    Huson DH. (1998) SplitsTree: analyzing and visualizing evolutionary data. Bioinformatics 14:68–73.

    Jobb G, von Haeseler A, Strimmer K. (2004) TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics. BMC Evol Biol 4:18.

    Karlberg EO and Andersson SGE. (2003) Mitochondrial gene history and mRNA localization: is there a correlation? Nat Rev Genet 4:391–397.

    Keane TM, Creevey CJ, Pentony MM, Naughton TJ, McInerney JO. (2006) Assessment of methods for amino acid matrix selection and their use on empirical data shows that ad hoc assumptions for choice of matrix are not justified. BMC Evol Biol 6:29.

    Koga R, Tsuchida T, Fukatsu T. (2003) Changing partners in an obligate symbiosis: a facultative endosymbiont can compensate for loss of the essential endosymbiont Buchnera in an aphid. Proc R Soc Lond B 270:2543–2550.

    Kondo N, Nikoh N, Ijichi N, Shimada M, Fukatsu T. (2002) Genome fragment of Wolbachia endosymbiont transferred to X chromosome of host insect. Proc Natl Acad Sci USA 99:14280–14285.

    Kurland CG and Andersson SGE. (2000) Origin and evolution of the mitochondrial proteome. Microbiol Mol Biol Rev 64:786–820.

    Kurland CG, Collins LJ, Penny D. (2006) Genomics and the irre