MBE Advance Access originally published online on September 26, 2008
Molecular Biology and Evolution 2008 25(12):2689-2698; doi:10.1093/molbev/msn213
Research Articles
The Apicomplexan Whole-Genome Phylogeny: An Analysis of Incongruence among Gene Trees
* Department of Genetics, University of Georgia
Center for Tropical and Emerging Global Diseases, University of Georgia
Institute of Bioinformatics, University of Georgia
E-mail: chkuo{at}email.arizona.edu.
| Abstract |
|---|
The protistan phylum Apicomplexa contains many important pathogens and is the subject of intense genome sequencing efforts. Based upon the genome sequences from seven apicomplexan species and a ciliate outgroup, we identified 268 single-copy genes suitable for phylogenetic inference. Both concatenation and consensus approaches inferred the same species tree topology. This topology is consistent with most prior conceptions of apicomplexan evolution based upon ultrastructural and developmental characters, that is, the piroplasm genera Theileria and Babesia form the sister group to the Plasmodium species, the coccidian genera Eimeria and Toxoplasma are monophyletic and are the sister group to the Plasmodium species and piroplasm genera, and Cryptosporidium forms the sister group to the above mentioned with the ciliate Tetrahymena as the outgroup. The level of incongruence among gene trees appears to be high at first glance; only 19% of the genes support the species tree, and a total of 48 different gene-tree topologies are observed. Detailed investigations suggest that the low signal-to-noise ratio in many genes may be the main source of incongruence. The probability of being consistent with the species tree increases as a function of the minimum bootstrap support observed at tree nodes for a given gene tree. Moreover, gene sequences that generate high bootstrap support are robust to the changes in alignment parameters or phylogenetic method used. However, caution should be taken in that some genes can infer a "wrong" tree with strong support because of paralogy, model violations, or other causes. The importance of examining multiple, unlinked genes that possess a strong phylogenetic signal cannot be overstated.
Key Words: Apicomplexa; genome scale phylogeny; bootstrap; long-branch attraction; taxon sampling
| Introduction |
|---|
The protistan phylum Apicomplexa contains many important pathogens (Levine 1988
The use of genome sequences for phylogenetic inference has only recently become possible. The large number of characters derived from genomic data allows robust inference of organismal phylogeny (Delsuc et al. 2005; Philippe, Delsuc, et al. 2005; Rokas 2006), even when the level of incomplete lineage sorting is high (Pollard et al. 2006). Initially, it was thought that use of genomic data would bring an end to the incongruence commonly observed in multigene molecular phylogenetic inference (Gee 2003; Rokas et al. 2003). However, further investigations suggest that the results from genome-scale phylogenetic inference should be interpreted with caution (Soltis et al. 2004; Jeffroy et al. 2006; Nishihara et al. 2007). Although genomic data can effectively suppress stochastic noise in shorter molecular sequences, the large amount of data can actually strengthen systematic biases when present (Phillips et al. 2004; Rodriguez-Ezpeleta et al. 2007).
Previous studies that examined factors such as poor taxon sampling (Soltis et al. 2004; Philippe, Lartillot, and Brinkmann 2005), inappropriate choices of phylogenetic method (Phillips et al. 2004; Jeffroy et al. 2006), nucleotide or amino acid composition bias and deviation from compositional equilibrium (Phillips et al. 2004; Collins et al. 2005), and variation of evolutionary rates among or within sites (Dopazo H and Dopazo J 2005; Nishihara et al. 2007; Rodriguez-Ezpeleta et al. 2007), all found that systematic biases can lead to incorrect trees with strong support. Several approaches that can detect and remove systematic biases in genome-scale phylogenetic inference have been proposed, including modification of taxon sampling (Rodriguez-Ezpeleta et al. 2007), examination of model violations (Rodriguez-Ezpeleta et al. 2007), recoding of molecular sequences (Phillips et al. 2004; Rodriguez-Ezpeleta et al. 2007), removal of the fast-evolving sites (Nishihara et al. 2007; Rodriguez-Ezpeleta et al. 2007), and utilizing rare genomic changes (Delsuc et al. 2005). Among the approaches that have been developed to address the systematic biases in genome-scale analyses, examination of incongruence among individual genes is directly relevant to the design and interpretation of multigene analyses that are fundamental in molecular phylogenetics (Huelsenbeck et al. 1996; Taylor and Piel 2004; Jeffroy et al. 2006). Unfortunately, investigations of incongruence among gene trees at the genome-scale have been limited to a few selected groups such as gamma-Proteobacteria (Lerat et al. 2003), yeast (Taylor and Piel 2004; Gatesy and Baker 2005; Jeffroy et al. 2006), and Drosophila (Pollard et al. 2006) due to the limitation of data availability.
In this study, we present the first genome-scale phylogenetic analysis in the phylum Apicomplexa. Because of the ancient origin of this phylum, estimated at approximately 700–900 Myr (Douzery et al. 2004), we perform our genome-scale phylogenetic inference at the protein level. The robust inference of the organismal phylogeny based on genomic data provides a solid foundation for comparative studies that improve our knowledge of apicomplexan evolution. In addition to facilitating the planning of future phylogenetic studies that involve other closely related pathogens, our systematic investigation of incongruence among gene trees can improve our understanding of multigene phylogenetic inference in general.
| Materials and Methods |
|---|
Data Sources and Ortholog Identification
Our data set contains seven apicomplexan species that have fully annotated genome sequence available, including Babesia bovis (Brayton et al. 2007
Orthologous genes were identified using OrthoMCL (Li et al. 2003
Phylogenetic Inference
The program ClustalW (Thompson et al. 1994) (version 1.83) was used for multiple sequence alignment. The "tossgaps" option was enabled to ignore gaps when constructing the guide tree, and all other parameters were set to the default values unless specifically stated otherwise. The alignments produced by ClustalW were filtered by GBLOCKS (Castresana 2000) (version 0.91b) using default settings to remove regions that contain gaps or are highly divergent. The resulting amino acid alignment for each gene (provided in supplementary data file 1, Supplementary Material online) was used in the main phylogenetic analysis as described below; a codon-based nucleotide alignment for each gene was generated by PAL2NAL (Suyama et al. 2006) and is provided in supplementary data file 2 (Supplementary Material online).
Three phylogenetic methods, including maximum likelihood (ML), maximum parsimony (MP), and Neighbor-Joining (NJ), were used to infer the gene tree for each individual gene. ML inferences were performed using PHYML (Guindon and Gascuel 2003). The proportion of invariant sites and the gamma-distribution parameter with eight substitution categories were estimated from the data set. The substitution model was set to JTT (Jones et al. 1992), and we enabled the optimization options for tree topology, branch lengths, and rate parameters. MP trees were constructed using PROTPARS in the PHYLIP package (Felsenstein 1989) (version 3.65) with 100 randomizations of input order. When more than one equally parsimonious tree was found for a given gene, the strict consensus tree of all equally parsimonious trees was used as the MP tree of this gene. NJ trees were constructed using NEIGHBOR in the PHYLIP package with species input order randomization enabled. The distance matrices were calculated by Tree-Puzzle (Schmidt et al. 2002) (version 5.2). The parameters used in Tree-Puzzle were set to the JTT substitution model, the mixed model of rate heterogeneity with one invariant and eight gamma rate categories, and the exact and slow parameter estimation. The level of bootstrap support for each gene was inferred by 100 resamplings of the alignment using SEQBOOT in the PHYLIP package followed by ML inference.
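The resampling step of the bootstrap can be illustrated with a minimal Python sketch of column resampling with replacement, the operation applied to an alignment before each replicate is passed to tree inference. This is not the SEQBOOT implementation; the function names, the dictionary representation of an alignment, and the fixed random seed are assumptions made for illustration only.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    rng: a random.Random instance, so that replicates are reproducible.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_sites = lengths.pop()
    # Draw n_sites column indices with replacement; the same column may
    # appear several times and some columns may not appear at all.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        taxon: "".join(seq[i] for i in columns)
        for taxon, seq in alignment.items()
    }


def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Return a list of n_replicates resampled alignments."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]
```

Each replicate has the same number of sites as the input, and every column keeps its residues together across taxa, so only the weighting of sites changes between replicates.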
To investigate the sensitivity of a gene to the multiple sequence alignment parameter, we varied the gap opening penalty by 2-fold in both directions (i.e., increased the default cost from 10 to 20 or decreased it to 5) and inferred the gene tree under each setting. Individual genes are classified into three categories including robust, intermediate, and sensitive based on the ML gene-tree topologies from the three gap opening penalties examined. A gene is classified as robust if all three settings generated the same topology, intermediate if two out of the three settings generated the same topology, or sensitive if each setting generated a different topology.
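The three-way classification described above can be sketched in Python as follows. The sketch assumes that each ML topology has already been reduced to a canonical, hashable form (for example, a sorted string or a frozenset of bipartitions), so that identical topologies compare equal; the function name and category labels are chosen for illustration.

```python
def classify_alignment_sensitivity(topologies):
    """Classify a gene from its ML topologies under three gap opening penalties.

    topologies: sequence of exactly three canonical, hashable topology
    identifiers (an assumption of this sketch), one per penalty setting.
    """
    if len(topologies) != 3:
        raise ValueError("expected exactly three topologies")
    n_distinct = len(set(topologies))
    # 1 distinct topology: all settings agree; 2: two of three agree;
    # 3: every setting gives a different topology.
    return {1: "robust", 2: "intermediate", 3: "sensitive"}[n_distinct]
```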
To investigate the effect of the substitution model used on the resulting gene-tree topology, we performed ML inference for each gene using two additional substitution models, including LG (Le and Gascuel 2008) and WAG (Whelan and Goldman 2001). The resulting gene trees are compared with the topology obtained using the JTT model (Jones et al. 1992).
Inference of the Species Tree
The species tree was inferred using two different approaches. The first approach was based on the consensus of individual gene trees. The consensus tree was inferred by the CONSENSE program in the PHYLIP package using extended majority rule. Gene trees inferred by different phylogenetic methods (i.e., ML, MP, and NJ) were analyzed separately. The second approach was based on the concatenated alignment of all individual genes following the phylogenetic inference procedures as described above.
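The idea behind consensus support, that is, the fraction of gene trees containing a given grouping, can be sketched in Python as below. Note that this sketch applies a simple majority rule (more than 50% of the trees), not the extended majority rule used by CONSENSE, which additionally adds compatible groupings below that threshold. It assumes each gene tree has already been reduced to a set of splits, each stored as a frozenset of taxon names on a consistently chosen side (for example, the side that excludes the outgroup); all names are hypothetical.

```python
from collections import Counter


def split_support(gene_tree_splits):
    """Return the fraction of gene trees that contain each split.

    gene_tree_splits: list with one collection of splits per gene tree;
    each split is a frozenset of taxon names (an assumption of this sketch).
    """
    counts = Counter()
    for splits in gene_tree_splits:
        # set() guards against a split being listed twice for one tree.
        counts.update(set(splits))
    n_trees = len(gene_tree_splits)
    return {split: count / n_trees for split, count in counts.items()}


def majority_rule_splits(gene_tree_splits, threshold=0.5):
    """Keep only the splits found in more than `threshold` of the trees."""
    support = split_support(gene_tree_splits)
    return {split: frac for split, frac in support.items() if frac > threshold}
```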
Characterization of Gene Trees
The topology distance between each gene tree and the species tree was calculated based on the symmetric difference (Robinson and Foulds 1981) as implemented in TREEDIST in the PHYLIP package. For genes that inferred a topology that is different from the species tree, we performed the approximately unbiased (AU) test (Shimodaira 2002) and the Shimodaira–Hasegawa (SH) test (Shimodaira and Hasegawa 1999) using the CONSEL package (Shimodaira and Hasegawa 2001) to test if the species tree topology is significantly rejected by a gene.
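The symmetric difference between two topologies counts the nontrivial bipartitions that occur in one tree but not in the other. The following Python sketch illustrates the calculation; it is not the TREEDIST implementation. Trees are assumed to be given as nested tuples of taxon-name strings, and the function names are hypothetical.

```python
def leaf_set(tree):
    """Return the set of taxon names in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def nontrivial_splits(tree):
    """Return the nontrivial bipartitions of a tree, ignoring the root.

    Each bipartition is stored as the side that does not contain a fixed
    reference taxon, so the same split is recorded identically however
    the tree happens to be rooted.
    """
    all_taxa = leaf_set(tree)
    reference = min(all_taxa)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = all_taxa - clade if reference in clade else clade
        # Skip trivial splits that separate fewer than two taxa.
        if 2 <= len(side) <= len(all_taxa) - 2:
            splits.add(side)
        return clade

    visit(tree)
    return splits


def symmetric_difference(tree_a, tree_b):
    """Number of bipartitions present in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must contain the same taxa")
    return len(nontrivial_splits(tree_a) ^ nontrivial_splits(tree_b))
```

A distance of zero means that the two trees have the same unrooted topology; branch lengths play no role in this measure.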
Taxon Removal Tests
To evaluate the potential influence of long-branch attraction (LBA), we removed either of the two taxa that have a long terminal branch (i.e., the outgroup T. thermophila and the ingroup C. parvum) and repeated the phylogenetic inference for each gene. Our procedure is conceptually similar to the taxon jackknife method (Siddall 1995) but contains one important distinction. The traditional taxon jackknife method removes a taxon after multiple sequence alignment and prior to tree reconstruction. However, the taxon being removed still affects the alignment and thus can influence the resulting tree. We chose to perform the taxon removal prior to multiple sequence alignment to eliminate any effect on the phylogenetic inference from the taxon being removed.
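The order of operations described above, removing the taxon first and aligning afterwards, can be sketched in Python as below. The function name and the dictionary representation of the sequences are assumptions for illustration; the alignment and tree inference steps themselves are not shown.

```python
def remove_taxon_before_alignment(sequences, taxon):
    """Return the sequences to be aligned, without the given taxon.

    sequences: dict mapping taxon name -> protein sequence. Any gap
    characters are stripped so that no trace of an earlier alignment,
    which the removed taxon may have shaped, is carried over.
    """
    if taxon not in sequences:
        raise KeyError(f"taxon not found: {taxon}")
    return {
        name: seq.replace("-", "")
        for name, seq in sequences.items()
        if name != taxon
    }
```

The returned sequences would then be realigned from scratch, in contrast to a conventional taxon jackknife, which would delete the row from an existing alignment.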
| Results and Discussion |
|---|
Ortholog Identification
From the seven apicomplexans and the one ciliate examined, we identified 268 single-copy genes that are shared by all eight species. These genes represent less than 10% of the annotated genes from the smallest genome (table 1), indicating that these organisms are highly divergent in their gene content. The long evolutionary distance between ciliates and apicomplexans only partially explains this observation. When the outgroup is not considered, the seven apicomplexans share 508 orthologous genes (of which 433 are single copy in all species). One of our previous studies that examined a different set of apicomplexan species produced similar results and suggested that 28–45% of the genes in an apicomplexan genome are genus-specific (Kuo and Kissinger 2008
For the purpose of phylogenetic analysis, we focus on the 268 single-copy genes shared by all eight species. Many of these genes are responsible for basic cellular processes (e.g., DNA replication, transcription, translation, etc.), as noted in our previous study (Kuo and Kissinger 2008). The sequence identity and annotation information of these genes are provided in supplementary table S1 (Supplementary Material online).
The Apicomplexan Species Tree
The species tree was inferred using two different approaches. The first approach calculated the consensus tree among the 268 individual gene trees, and the second approach utilized a concatenated alignment of 71,830 amino acid sites. Both approaches resulted in the same species tree topology (fig. 1) by all three phylogenetic methods used. Groupings of three species pairs, including P. falciparum and P. vivax, B. bovis and T. annulata, and E. tenella and T. gondii, are supported by 87% or more of the genes based on ML consensus. In contrast, the two short internal branches are supported by less than 50% of the genes. Nevertheless, all internal branches received 100% ML bootstrap support based on the analysis of the concatenated alignment.
This tree topology is consistent with most of our prior understanding of apicomplexan evolution based on morphology and development (Perkins et al. 2000
The Distribution of Gene Trees
Examination of individual genes revealed a seemingly high degree of incongruence among gene trees. Of the 268 gene trees examined, we observed a total of 48 topologies based on ML analysis (fig. 2). The most frequently observed topology (fig. 3A) is consistent with the putative species tree and is supported by 19% of the genes. Each of the next three most frequent topologies (fig. 3B–D) is supported by approximately 7–10% of the genes and differs in the placement of C. parvum. Two additional topologies (fig. 3E and F) are supported by 6% of the genes and exhibit alternative placements of the Plasmodium lineage. The observation that only a relatively small number of topologies are found may be attributed to our limited taxon sampling of eight species. For example, in an analysis of 106 genes from 14 yeast species, Jeffroy et al. (2006) found that each of the genes analyzed supports a distinct topology.
Despite the seemingly high level of incongruence among gene trees, only 16 genes significantly reject the putative species tree topology in the AU test (Shimodaira 2002
The finding of a high level of topological incongruence among gene trees that lack statistical significance has been reported in previous genome-scale phylogenetic studies. Lerat et al. (2003) examined 205 single-copy genes shared by 13 gamma-Proteobacteria species and found only two significantly rejected the putative species tree in the SH test. In both cases, the discordance between the gene tree and the putative species tree can be explained by a single lateral gene transfer (LGT) event. Similarly, examinations of the 106 single-copy genes shared by a group of Saccharomyces spp. showed that the majority of bipartition conflicts among genes have low bootstrap support (Taylor and Piel 2004; Jeffroy et al. 2006).
One possible hypothesis to explain the rare occurrences of a gene significantly rejecting the species tree is that single-copy genes are unlikely to be involved in LGT events (Daubin et al. 2002, 2003). Under this hypothesis, these genes have been confined in the organismal phylogeny throughout their evolutionary history, so the gene-tree topology is unlikely to be radically different from the species tree. By focusing on a small subset of genes that are highly conserved across all apicomplexan lineages examined, our methodology for orthologous gene selection may have effectively excluded genes that experienced LGT since the ciliate–apicomplexan divergence. Although LGT does not appear to influence our phylogenetic inference as presented here, caution should be taken in future studies because several previous studies suggest that LGT is an important evolutionary force in apicomplexans (Huang, Mullapudi, Lancto, et al. 2004; Huang, Mullapudi, Sicheritz-Ponten, and Kissinger 2004; Striepen et al. 2004; Nagamune and Sibley 2006) and other protists (Gogarten 2003; Richards et al. 2003; Andersson 2005).
Evaluation of Phylogenetic Signal by Bootstrap Support
To test if the observed topological incongruence among gene trees can be explained by a low resolving power for certain clades in some genes, we used the minimum bootstrap value observed in a gene tree to identify genes that possess strong phylogenetic signals. The results indicate that the percentage of genes that support the putative species tree increases as a function of the bootstrap cutoff used (table 2). In the most extreme example, when only the genes with a minimum bootstrap value of 90% at any node are examined, all five genes that meet this cutoff support the putative species tree topology. Even when the selection stringency is relaxed to a 70% bootstrap support, a cutoff that is commonly used in phylogenetic inference (Hillis and Bull 1993), 47% of these genes are consistent with the putative species tree and the two short internal branches received at least 60% of the consensus support. Curiously, we did not find any significant correlation between bootstrap support and alignment length, average pairwise protein distance, or other attributes of genes (supplementary table S1, Supplementary Material online).
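The gene selection by minimum bootstrap support can be sketched in Python as follows. The sketch assumes that each gene is summarized by the list of bootstrap values at its internal nodes (at least one per gene) and by a flag indicating whether its ML topology matches the putative species tree; the function name and this data layout are hypothetical and do not reproduce the values in table 2.

```python
def consistency_by_cutoff(genes, cutoffs):
    """Summarize agreement with the species tree at each bootstrap cutoff.

    genes: iterable of (node_supports, matches_species_tree) pairs, where
    node_supports is a non-empty list of bootstrap percentages.
    Returns a dict: cutoff -> (number of genes retained, fraction of the
    retained genes that match the species tree, or None if none remain).
    """
    genes = list(genes)
    summary = {}
    for cutoff in cutoffs:
        # A gene is retained only if its weakest node meets the cutoff.
        selected = [
            matches for supports, matches in genes if min(supports) >= cutoff
        ]
        fraction = sum(selected) / len(selected) if selected else None
        summary[cutoff] = (len(selected), fraction)
    return summary
```

Using the minimum rather than the mean support means that a single poorly resolved node is enough to exclude a gene, which matches the selection criterion described above.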
In addition to being consistent with the putative species tree, genes with strong bootstrap support are often insensitive to changes in alignment parameter (table 3), substitution model (table 4), or the phylogenetic method used (table 5). In these tests, we are interested in investigating if a gene could infer the same gene-tree topology across a range of settings used in the phylogenetic inference process; the agreement between the gene-tree topology and the putative species tree is not considered. At 70% minimum bootstrap cutoff, we found that 90% of these genes are robust to a 4-fold change in the gap opening penalty (table 3), 93% of the genes are insensitive to the choice of substitution model (table 4), and 57% of the genes behave consistently across different phylogenetic methods (table 5). Although the use of methodological concordance as a criterion for selecting genes for phylogenetic inference was criticized (Grant and Kluge 2003
Removal of the Long Branches
In addition to the low signal-to-noise ratio in some genes, another possible source of incongruence among gene trees is the LBA problem that resulted from our nonideal taxon sampling. Several observations support this hypothesis. First, when a gene behaved inconsistently across different phylogenetic methods, ML and NJ often result in an identical gene-tree topology that is different from MP (table 5). In addition, the outgroup T. thermophila and the ingroup C. parvum both have a long evolutionary distance to the other taxa (fig. 1). The lack of additional species that can be used to break up the long branch leading to the Cryptosporidium lineage may be responsible for its unstable phylogenetic placement, as evidenced by the fact that three of the most frequently observed gene-tree topologies involve alternative placement of C. parvum (fig. 3B–D). Although the genome sequence of C. hominis is available, adding this species is not particularly helpful. The genomes of these two Cryptosporidium spp. exhibit only 3–5% divergence at the nucleotide level (Xu et al. 2004
The issue of nonideal taxon sampling reflects a limitation that is often faced by genome-scale phylogenetic inferences (Soltis et al. 2004). To circumvent this limitation, we utilized two other commonly suggested approaches to address the LBA problem (Bergsten 2005). First, all sites that contain gaps or are highly divergent were removed from the alignment prior to phylogenetic inference by GBLOCKS (see Materials and Methods). Second, we removed either the outgroup T. thermophila or the ingroup C. parvum prior to sequence alignment and repeated the phylogenetic inference.
When the outgroup is removed from the data set, we observed a large increase in the consensus support for the Plasmodium–Babesia–Theileria clade (table 6). Two alternative bipartitions, as shown in panels E and F of figure 3, received substantially weaker consensus supports regardless of the minimum bootstrap cutoff used. Removal of the ingroup C. parvum resulted in a reduction of the number of observed gene-tree topologies (table 6), but the consensus support for the Plasmodium–Babesia–Theileria clade is relatively low compared with the removal of T. thermophila.
| Conclusion |
|---|
The recent availability of genome sequences allowed us to infer an organismal phylogeny that includes several important apicomplexan pathogens with high confidence. This robust species tree provides a solid foundation for future comparative studies that can improve our understanding of apicomplexan evolution and parasite biology. Although the level of incongruence among gene trees appears to be high at first glance, further investigation indicates that most of the observed conflict does not have strong statistical support. Interestingly, the minimum bootstrap support observed in a gene tree appears to be a useful predictor of phylogenetic performance. Genes that produce strong bootstrap support for all internal branches are more likely to be consistent with the species tree and robust to changes in the alignment parameter or the phylogenetic method used. Nevertheless, examination of multiple unlinked genes with strong phylogenetic signals is important for accurate phylogenetic inference because any single gene can have a different evolutionary history from the organismal phylogeny. Our systematic investigation provides a list of phylogenetically informative genes in the phylum Apicomplexa. These genes are good candidates for future sequencing efforts that aim at improving taxon sampling in this group of important pathogens.
| Supplementary Material |
|---|
Supplementary data files 1 and 2 and table S1 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
| Acknowledgements |
|---|
C.-H.K. was supported by a National Institutes of Health (NIH) Training Grant (GM07103), the Kirby and Jan Alton Graduate Fellowship, and a Dissertation Completion Assistantship at the University of Georgia. Funding for this work was provided by NIH R01 AI068908 to J.C.K. P. Brunk, F. Chen, J. Felsenstein, M. Heiges, A. Oliveira, E. Robinson, and H. Wang provided valuable assistance on the use of computer hardware and software. We thank the J. Craig Venter Institute for providing prepublication access to the genome sequence data of P. vivax and T. gondii. The associate editor, Dr Hervé Philippe, and three anonymous reviewers provided constructive comments that greatly improved this manuscript.
| Footnotes |
|---|
1 Present address: Department of Ecology and Evolutionary Biology, University of Arizona
Hervé Philippe, Associate Editor
| References |
|---|
Abrahamsen MS, Templeton TJ, Enomoto S, et al, (20 co-authors). Complete genome sequence of the apicomplexan, Cryptosporidium parvum. Science (2004) 304:441–445.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol (1990) 215:403–410.
Andersson JO. Lateral gene transfer in eukaryotes. Cell Mol Life Sci (2005) 62:1182–1197.
Bahl A, Brunk B, Crabtree J, et al, (18 co-authors). PlasmoDB: the Plasmodium genome resource. A database integrating experimental and computational data. Nucleic Acids Res (2003) 31:212–215.
Bergsten J. A review of long-branch attraction. Cladistics (2005) 21:163–193.
Brayton KA, Lau AOT, Herndon DR, et al, (28 co-authors). Genome sequence of Babesia bovis and comparative analysis of apicomplexan hemoprotozoa. PLoS Pathog (2007) 3:e148.
Carlton J. Genome sequencing and comparative genomics of tropical disease pathogens. Cell Microbiol (2003) 5:861–873.
Carreno RA, Martin DS, Barta JR. Cryptosporidium is more closely related to the gregarines than to coccidia as shown by phylogenetic analysis of apicomplexan parasites inferred using small-subunit ribosomal RNA gene sequences. Parasitol Res (1999) 85:899–904.
Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol (2000) 17:540–552.
Collins TM, Fedrigo O, Naylor GJP. Choosing the best genes for the job: the case for stationary genes in genome-scale phylogenetics. Syst Biol (2005) 54:493–500.
Daubin V, Gouy M, Perriere G. A phylogenomic approach to bacterial phylogeny: evidence of a core of genes sharing a common history. Genome Res (2002) 12:1080–1090.
Daubin V, Moran NA, Ochman H. Phylogenetics and the cohesion of bacterial genomes. Science (2003) 301:829–832.
Delsuc F, Brinkmann H, Philippe H. Phylogenomics and the reconstruction of the tree of life. Nat Rev Genet (2005) 6:361–375.
Dopazo H, Dopazo J. Genome-scale evidence of the nematode-arthropod clade. Genome Biol (2005) 6:R41.
Douzery EJP, Snell EA, Bapteste E, Delsuc F, Philippe H. The timing of eukaryotic evolution: does a relaxed molecular clock reconcile proteins and fossils? Proc Natl Acad Sci USA (2004) 101:15386–15391.
Eisen JA, Coyne RS, Wu M, et al, (53 co-authors). Macronuclear genome sequence of the ciliate Tetrahymena thermophila, a model eukaryote. PLoS Biol (2006) 4:1620–1642.
Escalante A, Ayala F. Evolutionary origin of Plasmodium and other Apicomplexa based on rRNA genes. Proc Natl Acad Sci USA (1995) 92:5793–5797.
Felsenstein J. PHYLIP—phylogeny inference package (version 3.2). Cladistics (1989) 5:164–166.
Gajria B, Bahl A, Brestelli J, et al, (15 co-authors). ToxoDB: an integrated Toxoplasma gondii database resource. Nucleic Acids Res (2008) 36:D553–D556.
Gardner MJ, Bishop R, Shah T, et al, (44 co-authors). Genome sequence of Theileria parva, a bovine pathogen that transforms lymphocytes. Science (2005) 309:134–137.
Gardner MJ, Hall N, Fung E, et al, (45 co-authors). Genome sequence of the human malaria parasite Plasmodium falciparum. Nature (2002) 419:498–511.
Gatesy J, Baker RH. Hidden likelihood support in genomic data: can forty-five wrongs make a right? Syst Biol (2005) 54:483–492.
Gee H. Evolution: ending incongruence. Nature (2003) 425:782.
Gogarten JP. Gene transfer: gene swapping craze reaches eukaryotes. Curr Biol (2003) 13:R53–R54.
Grant T, Kluge AG. Data exploration in phylogenetic inference: scientific, heuristic, or neither. Cladistics (2003) 19:379–418.
Guindon S, Gascuel O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol (2003) 52:696–704.
Heiges M, Wang HM, Robinson E, et al, (13 co-authors). CryptoDB: a Cryptosporidium bioinformatics resource update. Nucleic Acids Res (2006) 34:D419–D422.
Hertz-Fowler C, Peacock CS, Wood V, et al, (14 co-authors). GeneDB: a resource for prokaryotic and eukaryotic organisms. Nucleic Acids Res (2004) 32:D339–D343.
Hillis DM, Bull JJ. An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Syst Biol (1993) 42:182–192.
Huang J, Mullapudi N, Lancto CA, Scott M, Abrahamsen MS, Kissinger JC. Cryptosporidium parvum: phylogenomic evidence supports past endosymbiosis, intracellular and horizontal gene transfer. Genome Biol (2004) 5:R88.
Huang JL, Mullapudi N, Sicheritz-Ponten T, Kissinger JC. A first glimpse into the pattern and scale of gene transfer in the Apicomplexa. Int J Parasitol (2004) 34:265–274.
Huelsenbeck JP, Bull JJ, Cunningham CW. Combining data in phylogenetic analysis. Trends Ecol Evol (1996) 11:152–158.
Hulsen T, Huynen MA, de Vlieg J, Groenen PMA. Benchmarking ortholog identification methods using functional genomics data. Genome Biol (2006) 7:R31.
Jeffroy O, Brinkmann H, Delsuc F, Philippe H. Phylogenomics: the beginning of incongruence? Trends Genet (2006) 22:225–231.
Jones DT, Taylor WR, Thornton JM. The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci (1992) 8:275–282.
Kuo C-H, Kissinger JC. Consistent and contrasting properties of lineage-specific genes in the apicomplexan parasites Plasmodium and Theileria. BMC Evol Biol (2008) 8:108.
Le SQ, Gascuel O. An improved general amino acid replacement matrix. Mol Biol Evol (2008) 25:1307–1320.
Leander BS, Harper JT, Keeling PJ. Molecular phylogeny and surface morphology of marine aseptate gregarines (Apicomplexa): Selenidium spp. and Lecudina spp. J Parasitol (2003) 89:1191–1205.
Lerat E, Daubin V, Moran NA. From gene trees to organismal phylogeny in prokaryotes: the case of the gamma-proteobacteria. PLoS Biol (2003) 1:101–109.
Levine ND. Taxonomy and review of the coccidian genus Cryptosporidium (Protozoa, Apicomplexa). J Protozool (1984) 31:94–98.
Levine ND. Progress in taxonomy of the Apicomplexan protozoa. J Eukaryot Microbiol (1988) 35:518–520.
Li L, Stoeckert CJ, Roos DS. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res (2003) 13:2178–2189.
Montoya JG, Liesenfeld O. Toxoplasmosis. Lancet (2004) 363:1965–1976.
Morrison DA, Ellis JT. Effects of nucleotide sequence alignment on phylogeny estimation: a case study of 18S rDNAs of Apicomplexa. Mol Biol Evol (1997) 14:428–441.
Nagamune K, Sibley LD. Comparative genomic and phylogenetic analyses of calcium ATPases and calcium-regulated proteins in the Apicomplexa. Mol Biol Evol (2006) 23:1613–1627.
Nishihara H, Okada N, Hasegawa M. Rooting the eutherian tree: the power and pitfalls of phylogenomics. Genome Biol (2007) 8:R199.
Pain A, Renauld H, Berriman M, et al, (50 co-authors). Genome of the host-cell transforming parasite Theileria annulata compared with T. parva. Science (2005) 309:131–133.
Perkins FO, Barta JR, Clopton RE, Peirce MA, Upton SJ. Apicomplexa. In: An illustrated guide to the protozoa—Lee J, Leedale G, Bradbury P, eds. (2000) Lawrence (KS): Society of Protozoologists. 190–369.
Philippe H, Delsuc F, Brinkmann H, Lartillot N. Phylogenomics. Annu Rev Ecol Evol Syst (2005) 36:541–562.
Philippe H, Lartillot N, Brinkmann H. Multigene analyses of bilaterian animals corroborate the monophyly of Ecdysozoa, Lophotrochozoa, and Protostomia. Mol Biol Evol (2005) 22:1246–1253.
Philippe H, Snell EA, Bapteste E, Lopez P, Holland PWH, Casane D. Phylogenomics of eukaryotes: impact of missing data on large alignments. Mol Biol Evol (2004) 21:1740–1752.
Phillips MJ, Delsuc FD, Penny D. Genome-scale phylogeny and the detection of systematic biases. Mol Biol Evol (2004) 21:1455–1458.
Pollard DA, Iyer VN, Moses AM, Eisen MB. Widespread discordance of gene trees with species tree in Drosophila: evidence for incomplete lineage sorting. PLoS Genet (2006) 2:1634–1647.
Richards TA, Hirt RP, Williams BAP, Embley TM. Horizontal gene transfer and the evolution of parasitic protozoa. Protist (2003) 154:17–32.
Robinson DF, Foulds LR. Comparison of phylogenetic trees. Math Biosci (1981) 53:131–147.
Rodriguez-Ezpeleta N, Brinkmann H, Roure B, Lartillot N, Lang BF, Philippe H. Detecting and overcoming systematic errors in genome-scale phylogenies. Syst Biol (2007) 56:389–399.
Rokas A. Genomics and the tree of life. Science (2006) 313:1897–1899.
Rokas A, Williams BL, King N, Carroll SB. Genome-scale approaches to resolving incongruence in molecular phylogenies. Nature (2003) 425:798–804.
Schmidt HA, Strimmer K, Vingron M, von Haeseler A. TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics (2002) 18:502–504.
Shimodaira H. An approximately unbiased test of phylogenetic tree selection. Syst Biol (2002) 51:492–508.
Shimodaira H, Hasegawa M. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol Biol Evol (1999) 16:1114–1116.
Shimodaira H, Hasegawa M. CONSEL: for assessing the confidence of phylogenetic tree selection. Bioinformatics (2001) 17:1246–1247.
Siddall ME. Another monophyly index: revisiting the jackknife. Cladistics (1995) 11:33–56.
Soltis DE, Albert VA, Savolainen V, et al, (11 co-authors). Genome-scale data, angiosperm relationships, and ending incongruence: a cautionary tale in phylogenetics. Trends Plant Sci (2004) 9:477–483.
Striepen B, Pruijssers AJP, Huang JL, Li C, Gubbels MJ, Umejiego NN, Hedstrom L, Kissinger JC. Gene transfer in the evolution of parasite nucleotide biosynthesis. Proc Natl Acad Sci USA (2004) 101:3154–3159.
Suyama M, Torrents D, Bork P. PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res (2006) 34:W609–W612.
Tarleton RL, Kissinger J. Parasite genomics: current status and future prospects. Curr Opin Immunol (2001) 13:395–402.
Taylor DJ, Piel WH. An assessment of accuracy, error, and conflict with support values from genome-scale phylogenetic data. Mol Biol Evol (2004) 21:1534–1537.
Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, positions-specific gap penalties and weight matrix choice. Nucleic Acids Res (1994) 22:4673–4680.
van Dongen S. Graph clustering by flow simulation (2000) University of Utrecht.
Whelan S, Goldman N. A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol (2001) 18:691–699.
WHO and UNICEF. World malaria report 2005 (2005) Geneva (Switzerland): World Health Organization.
Xu P, Widmer G, Wang Y, et al, (18 co-authors). The genome of Cryptosporidium hominis. Nature (2004) 431:1107–1112.
Zhu G, Keithly JS, Philippe H. What is the phylogenetic position of Cryptosporidium? Int J Syst Evol Microbiol (2000) 50:1673–1681.