MBE Advance Access originally published online on August 23, 2007
Molecular Biology and Evolution 2007 24(11):2400-2411; doi:10.1093/molbev/msm178
© The Author 2007. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org

Research Articles

The Problem of Rooting Rapid Radiations

Liat Shavit, David Penny, Michael D. Hendy and Barbara R. Holland

The Allan Wilson Centre for Molecular Ecology and Evolution, Massey University, Palmerston North, New Zealand

E-mail: l.shavit{at}massey.ac.nz.


    Abstract
There are many examples of groups (such as birds, bees, mammals, multicellular animals, and flowering plants) that have undergone a rapid radiation. In such cases, where there is a combination of short internal and long external branches, correctly estimating and rooting phylogenetic trees is known to be a difficult problem. In this simulation study, we tested the performance of different phylogenetic methods at estimating a tree that models a rapid radiation. We found that maximum likelihood, corrected and uncorrected neighbor-joining, and corrected and uncorrected parsimony all suffer from biases toward specific tree topologies. In addition, we found that using a single-taxon outgroup to root a tree frequently disrupts an otherwise correct ingroup phylogeny. Moreover, for uncorrected parsimony, we found cases where several individual trees (in which the outgroup was placed incorrectly) were selected more frequently than the correct tree. Even for parameter settings where the correct tree was selected most frequently when using extremely long sequences, for sequences of up to 60,000 nucleotides the incorrectly rooted trees were each selected more frequently than the correct tree. For all the cases tested here, tree estimation using a two-taxon outgroup was more accurate than when using a single-taxon outgroup. However, the ingroup was most accurately recovered when no outgroup was used.

Key Words: maximum parsimony • maximum likelihood • misleading zones • neighbor-joining • outgroup rooting • topological bias


    Introduction
The problem of tree reconstruction and rooting is known to be challenging, especially in cases of rapid radiations where there is a combination of short and long branches. In particular, long-branch attraction (Felsenstein 1978; Hendy and Penny 1989; Bergsten 2005) is known to make this problem difficult. Many examples involving birds (Harrison et al. 2004), bees (Lockhart and Cameron 2001), mammals (Lin et al. 2002), and early divergences of multicellular animals (Philip, Creevey, and McInerney 2005; Philippe, Lartillot, and Brinkmann 2005) imply that these features are not just of theoretical interest. A recent example that highlights this problem is the dispute about the rooting of the angiosperms (Soltis et al. 2004; Stefanovic, Rice, and Palmer 2004; Goremykin et al. 2005; Leebens-Mack et al. 2005). As pointed out by Lockhart and Penny (2005), the basic topology of the angiosperm radiation appears to be star-like (many short internal branches connecting large angiosperm lineages), whereas the outgroup taxa are relatively distant.

Simulation studies have proven to be useful in evaluating the strengths and weaknesses of phylogenetic methods in tree reconstruction. Previous simulation studies on bifurcating trees show that when internal branches are short relative to external branches, even a small misspecification of the substitution model may mislead phylogenetic inference (Poe and Swofford 1999; Ho and Jermiin 2004). Holland et al. (2003) conducted a simulation study of the performance of the unweighted pair group method with arithmetic mean (UPGMA), neighbor-joining (NJ), maximum parsimony (MP), and maximum likelihood (ML) methods, for a 5-taxon tree with a symmetric 4-taxon ingroup under a molecular clock. That study compared the accuracy of different phylogenetic methods for various sequence lengths, and explored the effectiveness of correcting NJ for multiple substitutions. Holland et al. (2003) also tested the effectiveness of using an outgroup to root a tree and demonstrated some of the problems in reconstructing and rooting trees. They discovered a misleading zone where the tree estimate is consistent (i.e., the probability of estimating the correct tree tends to 1 as the sequence length tends to infinity), but for a wide range of sequence lengths, 4 incorrect trees were each chosen up to twice as frequently as the correct tree. They also established that the inclusion of a distant outgroup, which should join into a short internal branch, frequently disrupted the ingroup tree. This effect of outgroup inclusion disrupting the ingroup was also found for both mammals and birds (Lin, Waddell, and Penny 2002; Slack et al. 2003). In their study, Holland et al. (2003) used only 5 taxa; as the number of taxa increases and the models become more complex, additional problems are expected.


    Materials and Methods
To extend the work of Holland et al. (2003), we focused on a symmetric 8-taxon ingroup tree with 5 short internal branches and a 1- or 2-taxon outgroup joining at the middle point of the innermost branch (fig. 1). This is a generalized version of a rapid radiation. A symmetric tree was chosen for its potential for analytical (exact) solutions.


FIG. 1.— The model trees used for simulations (8-taxon simulation: solid lines; 9-taxon simulation [1-taxon outgroup]: solid and dashed lines; 10-taxon simulation [2-taxon outgroup]: solid, dashed, and dotted lines). In the 9-taxon simulation, z was set to 0. In both the 9- and the 10-taxon trees, the outgroup attaches to the ingroup at the middle of the innermost edge of the 8-taxon tree.

 
For all simulations, unless otherwise stated, the following settings and procedures apply. Seq-Gen version 1.3.2 (Rambaut and Grassly 1997) was used to generate the sequences. Four-state data sets were generated on each of the trees using the Jukes–Cantor model (Jukes and Cantor 1969) of nucleotide substitution. We chose to use the Jukes–Cantor model, which is nested within more complex models for 4-state characters (Felsenstein 2004), to ensure the generality of our results. Substitutions at each site were independent and identically distributed with equal rates. Branch weights (lengths) were defined to be the expected number of substitutions per site on each branch. Each of the 3 tree-estimation algorithms (MP, ML, and NJ) was applied to every sample sequence using PAUP* version 4b10 (Swofford 2002). For MP and ML, heuristic searches were done with the HSEARCH command's default settings except for the option NBEST, which was set to 1 (so that, for each data set, only the single best tree discovered during the search was saved).
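As a rough illustration of the generating process just described (this is not Seq-Gen, and the tree encoding and function names are assumptions made for the sketch), the Jukes–Cantor model can be simulated by changing each site independently with probability 3/4 (1 - exp(-4d/3)) along a branch of length d, choosing uniformly among the 3 other bases:

```python
import math
import random

BASES = "ACGT"


def jc_change_probability(d):
    """Probability that a site differs after a branch of length d
    (expected substitutions per site) under the Jukes-Cantor model."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def evolve(seq, d, rng):
    """Evolve a sequence along one branch; sites are independent."""
    p = jc_change_probability(d)
    out = []
    for base in seq:
        if rng.random() < p:
            # Under Jukes-Cantor the 3 alternative bases are equally likely.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def simulate(node, parent_seq, rng, alignment):
    """Hypothetical tree encoding: a node is (content, branch_length), where
    content is a taxon name (str) for a leaf or a list of child nodes."""
    content, branch_length = node
    seq = evolve(parent_seq, branch_length, rng)
    if isinstance(content, str):
        alignment[content] = seq
    else:
        for child in content:
            simulate(child, seq, rng, alignment)
    return alignment


rng = random.Random(1)
root_seq = "".join(rng.choice(BASES) for _ in range(200))
# One cherry of the ingroup: internal branch x = 0.015, external branches y = 0.2.
cherry = ([("1", 0.2), ("2", 0.2)], 0.015)
alignment = simulate(cherry, root_seq, rng, {})
print(sorted(alignment), len(alignment["1"]))
```

The same recursive encoding extends to the full 8-, 9-, or 10-taxon model trees by nesting further child lists.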

When comparing NJ applied to corrected distances and MP, 2 parameters are being changed simultaneously: the tree building method and whether or not a correction for multiple substitutions is done (Steel, Hendy, and Penny 1993; Penny et al. 1996). However, it is possible to separate the effects of these 2 parameters to allow for a better comparison between the methods. Therefore, NJ was applied both with the Jukes–Cantor correction (Jukes and Cantor 1969) by setting the DSET option in PAUP* to JC and with no correction by setting the DSET option to P. In some cases, MP was performed with Jukes–Cantor correction in addition to its usual implementation (no correction). Although correcting MP for multiple changes is possible, it is not implemented in publicly available software. Therefore, correction for MP was implemented using our own code with distance Hadamard (Hendy and Penny 1993) applied to distances that were corrected by the Jukes–Cantor method (this code is available from l.shavit{at}massey.ac.nz). For more information about corrected MP (cMP) and the effect of the correction on parsimony's consistency, see Steel, Hendy, and Penny (1993) and Penny et al. (1996).
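The difference between uncorrected (p) and Jukes–Cantor-corrected distances can be sketched as follows. The code uses the standard Jukes–Cantor formula d = -3/4 ln(1 - 4p/3); the function names are illustrative and it is not taken from PAUP* or from the authors' code:

```python
import math


def p_distance(a, b):
    """Uncorrected distance: the proportion of sites at which a and b differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def jc_distance(p):
    """Jukes-Cantor corrected distance (expected substitutions per site).
    The correction is undefined once p reaches the saturation value 3/4."""
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


p = p_distance("ACGTACGTAC", "ACGTTCGTAA")
print(p, jc_distance(p))
```

Because the correction grows faster than p, long paths (for example between two distant taxa) are stretched more than short ones, which is why corrected and uncorrected distances can lead NJ to different trees.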

Sequences were generated on the model trees depicted in figure 1. Branch lengths varied according to parameters x, y, z, and w (fig. 1), where x (ranging from 0.005 to 0.025 in steps of 0.010) is the expected number of substitutions per site on each of the 5 internal branches, y (ranging from 0.1 to 0.3 in steps of 0.1) is the expected number of substitutions per site on each of the 8 external branches, z (ranging from 0 to 0.3 in steps of 0.05) is the expected number of substitutions per site on the edge connecting the outgroup taxa to the ingroup in the middle of the innermost edge, and w (ranging from 0 to 0.3 in steps of 0.05) is the expected number of substitutions per site on each outgroup branch. If z + w ≥ 1.5x + y, then there is a point on the tree such that the distances from that point to each of the leaves are all equal (we then say that "a molecular clock is maintained," though this is not true for all parameter combinations used here). One thousand data sets were generated of lengths l = (200, 400, 800, and 1,600) for each parameter combination of the model tree. The reconstructed unweighted trees (without edge lengths) were compared with the model (generating) tree.
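The molecular clock condition above can be checked for any parameter combination with a few lines of code. This is a minimal sketch that encodes only the inequality as stated in the text; the function names are hypothetical. The term 1.5x + y is read here as the path from the attachment point to an ingroup leaf: half of the innermost branch (0.5x), one further internal branch (x), and one external branch (y).

```python
def ingroup_depth(x, y):
    """Path length from the middle of the innermost edge to any ingroup leaf."""
    return 0.5 * x + x + y


def maintains_clock(x, y, z, w, tol=1e-12):
    """True when z + w >= 1.5x + y, the condition given in the text."""
    return z + w >= ingroup_depth(x, y) - tol


for params in [(0.015, 0.2, 0.0, 0.4), (0.015, 0.2, 0.1, 0.2), (0.025, 0.3, 0.0, 0.3)]:
    print(params, maintains_clock(*params))
```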


    Results
Eight-Taxon Simulation
Accuracy of the Methods
We first considered the ability of the methods to reconstruct the ingroup tree alone. Sequences were generated on the 8-taxon tree T8 = (((1,2),(3,4)),((5,6),(7,8))) (see fig. 1). Figure 2 shows the accuracy of the different methods in reconstructing T8 for different regions of the parameter space. The results of this simulation show that all 4 methods are consistent for all regions of the parameter space. As expected, all methods are less accurate when the internal edges are short and the external branches are long.


FIG. 2.— Accuracy of MP, uNJ, cNJ, and ML in reconstructing the 8-taxon tree. In each box, the percentage of correct trees out of the 1,000 trees constructed by each method is shown for each length of the internal edges x = (0.005, 0.015, 0.025). Each row corresponds to a different branch length y = (0.1, 0.2, 0.3) and each column corresponds to a different sequence length l = (200, 400, 800, 1,600).

 
An unexpected feature is that in this parameter space MP performed as well as, and usually better than, the other methods tested. This was surprising because the tree and parameters were chosen so that it would be difficult for MP to obtain the correct tree. However, it is known that some biases can favor the correct tree (Swofford et al. 2001; Sullivan and Joyce 2005). In the case of long external branches adjacent to short internal branches, the lengths of the short internal branches are overestimated, resulting in the recovery of the correct tree (Yang 1997; Siddall 1998; Bruno and Halpern 1999; Sullivan and Swofford 2001; Swofford et al. 2001; Ho and Jermiin 2004; Sullivan and Joyce 2005). In our results, the more difficult the parameter combinations were (shorter x and/or longer y), the bigger the improvement in accuracy of MP over the other methods. ML performed slightly better than uncorrected neighbor-joining (uNJ) and corrected neighbor-joining (cNJ). uNJ and cNJ found T8 with virtually the same frequencies, for each point in the parameter space.

Topological Bias
Two trees have the same unlabeled topology if one tree can be converted into the other (ignoring branch lengths) by a permutation of the labels (taxon names). A 2-fold symmetry is a point on any vertex or edge on the tree where precisely 2 of the subtrees are topologically identical. An example of a 2-fold symmetry is a cherry, which is defined as a single pair of leaves adjacent to a common node (McKenzie and Steel 2000). Note that a star tree with 3 or more taxa contains no cherries, as there are more than 2 taxa adjacent to the single internal node. We investigated the bias of phylogenetic methods toward estimating trees with a certain number of cherries. The 4 possible 8-taxon, unrooted, unlabeled, bifurcating tree topologies are shown in figure 3. Their frequencies were calculated using the formula given by Hendy, Little, and Penny (1984) (see also Penny, Hendy, and Steel 1991). Within the 4 possible unlabeled topologies of 8-taxon bifurcating trees, 1 topology comprises 4 cherries, 2 topologies have 3 cherries, and 1 topology has 2 cherries (fig. 3).
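To make the cherry definition concrete, the following sketch counts cherries in an unrooted tree written as nested tuples in the style of T8. The nested-tuple encoding and the function name are assumptions made for illustration; leaf labels are assumed to be unique. A degree-2 root introduced by the bracket notation is suppressed so that the tree is treated as unrooted, and a cherry is counted at every degree-3 node with exactly 2 leaf neighbors.

```python
from itertools import count


def count_cherries(tree):
    """Count cherries in an unrooted tree given as nested tuples of labels."""
    adjacency = {}
    ids = count()

    def add_edge(a, b):
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    def build(node):
        if not isinstance(node, tuple):
            return ("leaf", node)
        me = ("internal", next(ids))
        for child in node:
            add_edge(me, build(child))
        return me

    root = build(tree)
    # A bifurcating root is only an artifact of the notation: remove it and
    # join its 2 neighbors directly.
    if len(adjacency[root]) == 2:
        a, b = adjacency.pop(root)
        adjacency[a].discard(root)
        adjacency[b].discard(root)
        add_edge(a, b)

    cherries = 0
    for node, neighbors in adjacency.items():
        if node[0] != "internal":
            continue
        leaf_neighbors = sum(1 for n in neighbors if n[0] == "leaf")
        if len(neighbors) == 3 and leaf_neighbors == 2:
            cherries += 1
    return cherries


T8 = (((1, 2), (3, 4)), ((5, 6), (7, 8)))
caterpillar = (1, (2, (3, (4, (5, (6, (7, 8)))))))
star = (1, 2, 3, 4, 5, 6, 7, 8)
print(count_cherries(T8), count_cherries(caterpillar), count_cherries(star))
```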


FIG. 3.— The 4 unlabeled topologies of 8-taxon bifurcating trees. Topology 1 has 4 cherries, Topology 2 and 3 have 3 cherries, and Topology 4 has 2 cherries. The 2-fold centers of symmetry in each topology are indicated by arrows. The number, Nt, of different tip-labeled bifurcating trees having each topology, t, is given.

 
To test the hypothesis that parsimony methods are biased toward selecting the highly symmetric topology of T8, 10,395 alignments (the number of 8-taxon, unrooted, bifurcating trees) were generated on an 8-taxon star tree (by setting x = 0). The expected number y of substitutions per site on the 8 (external) branches was set to 0.2, and the length of the generated sequences was set to 1,000. Each of the 5 phylogenetic methods was applied to the set of alignments, and the number of trees of each of the 4 topologies (fig. 3) was recorded. The DCOLLAPSE and LCOLLAPSE options in PAUP* were both set to "Yes," thus allowing uNJ, cNJ, and ML to collapse branches with length smaller than 10^-8. For MP, the 2 COLLAPSE options MINBRLEN and MAXBRLEN were tested.
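The figure of 10,395 follows from the standard count of leaf-labeled, unrooted, bifurcating trees on n taxa, (2n - 5)!! = 1 × 3 × 5 × ... × (2n - 5). A short sketch (the function name is illustrative):

```python
def num_unrooted_bifurcating_trees(n):
    """Number of leaf-labeled, unrooted, bifurcating trees on n >= 3 taxa,
    computed as the double factorial (2n - 5)!!."""
    if n < 3:
        raise ValueError("need at least 3 taxa")
    total = 1
    for k in range(3, 2 * n - 4, 2):
        total *= k
    return total


total_8 = num_unrooted_bifurcating_trees(8)
print(total_8)
# Share of labeled trees with the 4-cherry and 2-cherry topologies,
# using the counts quoted in the text (315 and 5,040).
print(315 / total_8, 5040 / total_8)
```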

It is important to note, that because the star tree is a multifurcating tree with no internal branches, the correct number of trees having any of the 4 bifurcating topologies is 0. However, seeing that all methods selected many bifurcating trees, we compared the distribution of these with the distribution of all different 8-taxon, leaf-labeled, unrooted, bifurcating trees. The results are shown in figure 4. All methods were found to be biased toward fully resolved trees. Even when they were allowed to collapse 0-length branches, none of the methods ever recovered the star tree. Moreover, the biases demonstrated were not equivalent for all methods.


FIG. 4.— The percentage of trees, having each of the possible topologies for an 8-taxon unrooted tree, out of 10,395 trees constructed by each method for sequences generated on a star tree. The box on the left shows a classification of the 10,395 8-taxon bifurcating trees into the 4 possible topologies. On the right is the classification of the 10,395 trees, constructed by each method for sequences generated on a star tree, into the 4 possible unlabeled bifurcating tree topologies.

 
Strikingly, for MP, 80% of the estimated trees had 4 cherries, although only 3% (315 out of 10,395) of the 8-taxon bifurcating trees have such a topology. Furthermore, MP did not select any trees with 2 cherries or any multifurcating trees. In this example, we did not detect any differences in the results using either of the 2 collapsing options (MINBRLEN, MAXBRLEN). A less extreme bias was found for cMP, where 10% of the estimated trees had 4 cherries and 26% had 2 cherries. Both uNJ and cNJ had similar biases with only 17% of the estimated trees having 2 cherries, substantially less than the 48% (5,040 out of 10,395) of bifurcating trees having this topology. Thirteen percent of the trees constructed by uNJ and cNJ had 4 cherries, still well in excess of the 3% in the uniform distribution of the bifurcating trees.

Compared with the distribution of bifurcating trees, all methods selected more trees with topology 3 and fewer trees with topology 2 (both topologies have 3 cherries). For ML, some of the trees with topology 2 were collapsed into multifurcating trees. ML also found fewer trees with topology 4 (2 cherries) than there are in the uniform distribution.

For MP, cMP, uNJ, and cNJ, a general bias toward forming cherries was found. Although ML demonstrated less bias toward forming cherries, it did exhibit bias against collapsing edges that are adjacent to cherries. In more than 40% of the cases, ML selected multifurcating trees; however, the star tree was never selected. MP, uNJ, and cNJ estimated only bifurcating trees, even though the collapse options in PAUP* version 4b10 (Swofford 2002) were set to "Yes." The bias toward selecting bifurcating trees with cherries is particularly evident for MP, and this is almost certainly the explanation for why MP appears to perform so well in figure 2. When the sequence length was increased to 10,000, cMP, uNJ, and cNJ selected each topology with a similar frequency (to within 2%) to that found when the length of the generated sequences was set to 1,000. ML selected more trees with 2 cherries (topology 4) and fewer trees of topology 3 than were selected when the sequence length was set to 1,000 but selected the other topologies with similar frequencies (to within 1%) to those found with sequence length l = 1,000. MP selected only (i.e., 100%) trees with 4 cherries (topology 1). We also tested the effect of setting NBEST to "No," allowing the methods to select more than one tree for each data set (while weighting the trees for each data set, so that the total weight of each data set was 1). This did not have a significant effect on the results.

We have shown that, for the parameter space used, all the phylogenetic methods tested here were consistent in reconstructing the 8-taxon tree (T8). Nonetheless, we found that the phylogenetic methods tested, and particularly MP, are biased toward specific tree topologies.

Adding a Single-Taxon Outgroup
The next simulation tested the effect of adding a single-taxon outgroup to the 8-taxon tree. Sequences were generated on the 9-taxon tree T9 = ((((1,2),(3,4)),((5,6),(7,8))),9). The expected number of substitutions per site on the edge connecting the outgroup taxa to the ingroup, z, was set to 0.

Accuracy of the Methods
Given that the simulation study done by Holland et al. (2003) found that the addition of an outgroup can disrupt a correct ingroup, we compared the outcomes of applying the methods to the 9-taxon alignment and to an alignment of the 8 ingroup taxa alone. The results were classified into 6 categories according to the scheme shown in figure 5a, based on whether or not the 9-taxon tree (constructed from the 9-taxon alignment) was correct and whether or not the 8-taxon tree (constructed from the 8-taxon ingroup alignment) was correct. The percentage of trials resulting in each category is reported in figure 5b. As in the 8-taxon simulation, and as expected, all methods were found to be less accurate when the internal edges are short and the external branches are long (see supplementary material S1, Supplementary Material online).


FIG. 5.— Frequencies of different types of error in reconstructing the 9-taxon tree. (a) The different types of result combinations in reconstructing the 9- and 8-taxon tree from the 9- and 8-taxon alignments, respectively. Each terminal node description of a category gives the result of 8-taxon estimation, whereas internal node descriptions are results of 9-taxon estimation. The first category is where both the 8- and 9-taxon trees are correct. The second is where the addition of the outgroup corrects an incorrect 8-taxon tree. The third category is where both the 8-taxon tree and the ingroup tree within the 9-taxon tree were constructed correctly, but the outgroup was misplaced. The fourth category is where the addition of the outgroup corrects an incorrect ingroup tree, but the outgroup itself is placed incorrectly. More disturbing is the fifth category where the inclusion of the outgroup has confounded the correct 8-taxon ingroup tree. The last category is where both the 8-taxon tree and the ingroup within the 9-taxon tree are incorrect. (b) The results for sequence length l = 1,600 averaged over the length of the internal edges, x, and the length of the external branches, y.
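The 6 categories described in the caption can be expressed as a small decision function. This is a sketch of the classification logic only; the boolean inputs and the function name are assumptions, and determining whether a reconstructed tree is correct is left to whatever tree-comparison routine is in use.

```python
from collections import Counter


def classify(eight_correct, nine_ingroup_correct, outgroup_correct):
    """Return the category (1-6) of figure 5a for one simulated data set.

    eight_correct: the tree from the 8-taxon alignment is correct.
    nine_ingroup_correct: the ingroup inside the 9-taxon tree is correct.
    outgroup_correct: the outgroup joins the correct branch; this flag is
    ignored when the 9-taxon ingroup is wrong.
    """
    if nine_ingroup_correct and outgroup_correct:
        return 1 if eight_correct else 2   # 9-taxon tree fully correct
    if nine_ingroup_correct:
        return 3 if eight_correct else 4   # ingroup right, outgroup misplaced
    return 5 if eight_correct else 6       # ingroup wrong in the 9-taxon tree


# Tallying made-up outcomes into percentages per category.
outcomes = [(True, True, True), (True, True, False), (True, False, False),
            (False, False, False), (True, True, True)]
tally = Counter(classify(*o) for o in outcomes)
percentages = {cat: 100.0 * n / len(outcomes) for cat, n in sorted(tally.items())}
print(percentages)
```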

 
With the inclusion of an outgroup, the accuracy in reconstructing the correct ingroup tree was reduced compared with the 8-taxon case. As expected, the results show that the more distant the outgroup becomes, the more difficult it is to reconstruct the correct tree (fig. 5). For the tree and parameters used, ML was the most accurate of the methods tested. MP, which was very accurate in reconstructing the 8-taxon tree (see Topological Bias), was particularly affected by the inclusion of the outgroup. In fact, MP was the only method that became inconsistent (with parameters x = 0.005 or x = 0.015, y = 0.3, z = 0, and w = 0.4; see supplementary material S1, Supplementary Material online). When the molecular clock was maintained, uNJ performed better than cNJ, but when the molecular clock was violated, cNJ was more accurate (see table 1).


Table 1 Accuracy of cNJ and uNJ in Reconstructing the 9-Taxon Tree with Respect to the Molecular Clock Assumption

 
Most interesting are cases in which the 8-taxon ingroup tree was correct but adding the outgroup disrupted the ingroup (these are ~13% of all cases). In most of those cases, the distorted ingroup results from the outgroup attaching to the ingroup at one of the long external branches, 2 branches away from the correct short internal branch. Examples where the addition of an outgroup distorts an ingroup tree were previously reported for birds (Slack et al. 2003) and for mammals (Lin, Waddell, and Penny 2002). The converse situation, where an incorrect ingroup tree was constructed (on an 8-taxon alignment) but the correct 9-taxon tree was found, occurred in less than 1.5% of the cases (fig. 5).

Misleading Zone
For MP, the simulations on the 9-taxon tree with the parameters x = 0.015, y = 0.2, z = 0, and w = 0.4 were extended to include sequence lengths of l = (200, 400, 800, ..., 204,800). Trees were classified into 4 categories: I) the single correct tree, II) the 4 trees in which the ingroup phylogeny is correct but the outgroup (taxon 9) is incorrectly joined to one of the internal branches, III) the 8 trees in which the ingroup phylogeny is correct but the outgroup (taxon 9) is incorrectly joined to one of the external branches, and IV) the remaining 135,122 trees. The results for sequence lengths l = 200–102,400 are shown in figure 6.
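The sizes of the 4 categories and the doubling series of sequence lengths can be reproduced with simple arithmetic. The sketch below assumes the standard facts that an unrooted bifurcating tree on n taxa has n external and n - 3 internal branches and that there are (2n - 5)!! such labeled trees; the variable names are illustrative.

```python
def num_unrooted_bifurcating_trees(n):
    """(2n - 5)!!, the number of labeled unrooted bifurcating trees."""
    total = 1
    for k in range(3, 2 * n - 4, 2):
        total *= k
    return total


ingroup_taxa = 8
external_branches = ingroup_taxa        # 8 places on external branches
internal_branches = ingroup_taxa - 3    # 5 internal branches in the ingroup

# With the ingroup topology fixed, taxon 9 can attach to any of 13 branches.
category_sizes = {
    "I": 1,                             # the innermost branch (correct tree)
    "II": internal_branches - 1,        # the 4 other internal branches
    "III": external_branches,           # the 8 external branches
}
category_sizes["IV"] = num_unrooted_bifurcating_trees(9) - sum(category_sizes.values())
print(category_sizes)

# Sequence lengths 200, 400, 800, ..., 204,800 (each double the previous).
sequence_lengths = [200 * 2 ** k for k in range(11)]
print(sequence_lengths)
```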


FIG. 6.— The misleading zone for MP. In this simulation, the 9-taxon tree T9 = ((((1,2),(3,4)),((5,6),(7,8))),9), with parameters x = 0.015, y = 0.2, z = 0, and w = 0.4, was used. The frequency with which the correct tree and each of the competing trees were chosen is shown. The competing trees are the 4 (1 branch away) trees in which the ingroup phylogeny is correct but the outgroup, taxon 9, is incorrectly joined to one of the internal branches, for example ((((1,2),9),(3,4)),((5,6),(7,8))), and the 8 (2 branches away) trees in which the ingroup phylogeny is correct but the outgroup, taxon 9, is incorrectly joined to one of the external branches, for example ((((1,9),2),(3,4)),((5,6),(7,8))). All other 135,122 trees in which the ingroup is wrong are collectively referred to as "other topologies." For each category, the results are averaged over the number of trees in the category. The misleading zone extends to a sequence length of approximately 60,000 nucleotides. Only then does the correct tree get selected more frequently than each of the 8 competing trees that are 2 branches away from the correct tree.

 
Within its consistency zone, the probability of MP selecting the correct tree goes to 1 as the sequence length increases. However, following Holland et al. (2003), we have identified a misleading zone within, but close to the boundary of, the consistency zone of MP. This is a specific region of the parameter space in which MP is consistent, but for finite sequence lengths, it is possible for each of several individual incorrect trees to be selected more frequently than the correct tree. For example, the 9-taxon tree with parameters x = 0.015, y = 0.2, z = 0, and w = 0.4 is inside the misleading zone of MP. For l = 1,600, each of 8 incorrect trees is selected with much greater frequency than the correct tree. For l = 200 (using 10,000 data sets), we found a ratio of ~1:3 between the correct tree and each of 8 incorrect trees where the outgroup attaches to one of the external branches. Sequences of ~60,000 nucleotides are required before the correct tree is chosen more frequently than any other tree. With a sequence length of 102,400, the correct tree is still only recovered ~15% of the time. With a sequence length of 204,800 (not shown), the correct tree is recovered ~28% of the time. Extrapolating from these data, we expect that a sequence length of at least 400,000 characters would be needed for MP to have a 50% success rate in finding the correct tree. It is important to note that correcting for multiple substitutions significantly reduces the size of MP's misleading zone for this combination of parameters. In fact, for cMP (as for uNJ, cNJ, and ML), a sequence length as short as 400 is already enough for the correct tree to be chosen most frequently (data not shown). For short sequence lengths (l = 200), all methods often select an incorrect tree and some incorrect trees are each selected with greater frequency than that of the correct tree. But because the number of times each tree is selected is very small, it is difficult to check whether this is statistically significant. However, as in the 5-taxon study of Holland et al. (2003), there does appear to be a small misleading zone for all the methods studied here.

Breaking Symmetry
To evaluate the effect of breaking the symmetry of the ingroup tree, we changed the 9-taxon tree so that one external edge of the ingroup is longer than the others (a higher rate of evolution) and consequently the symmetry of the ingroup is broken. The results (fig. 7) show that the longer this ingroup branch is, the more frequently the outgroup joins it, reducing the accuracy of all methods in reconstructing the 9-taxon tree. The long external edge seems to have little effect on the accuracy of ML and cNJ, whereas a strong negative effect on both MP and uNJ was observed. The longer the selected external edge, the further we are from maintaining a molecular clock and the more pronounced the advantage of the corrected methods (ML and cNJ) over the uncorrected methods (uNJ and MP) becomes.


FIG. 7.— The effect of symmetry breaking. The results are shown for parameters y = 0.1 and w = 0.3. In each box, the percentage of correct trees reconstructed for each sequence length and each of the 4 lengths e = (0, 0.05, 0.1, 0.2) that were added to one external branch is shown. Each row corresponds to a different length of internal edges x = (0.005, 0.015, 0.025) and each column corresponds to a different tree-estimation method.

 
We have demonstrated that with the use of a single-taxon outgroup and a rapid radiation, it is difficult to correctly infer the position of the root, even when the ingroup tree is correct. This is particularly noticeable when the substitution rate of one ingroup taxon is higher than the others. Of particular concern is the observation that introducing an outgroup can interfere with the accuracy of the ingroup tree.

Two-Taxon Outgroup
Accuracy of the Methods
This simulation was used to evaluate the effect of including a second outgroup taxon on the accuracy of the different methods in reconstructing the tree. Sequences were generated on the 10-taxon tree T10 (fig. 1) with 2, 1, or 0 outgroup taxa removed to acquire the 8-, 9-, and 10-taxon data sets, respectively. The phylogenetic methods were applied to the same data sets, and their ability to reconstruct the correct 8-, 9-, and 10-taxon trees was compared.

In table 2, the number of times in which the tree was reconstructed correctly is reported for each of the methods and for the 4 different sequence lengths used. In every single case, correct trees were reconstructed more frequently for the 10-taxon data than for the 9-taxon data. However, the frequency with which trees were correctly estimated for the 8-taxon data is higher than for both the 9- and 10-taxon data sets. This is true for all 4 methods with each of the sequence lengths.


Table 2 Accuracy of the Final Tree

 
This increase in reliability, when going from 9 to 10 taxa, runs counter to our intuition that the greater the number of taxa (and so the greater the number of internal edges that need to be estimated), the more difficult it is to reconstruct the correct tree. A possible explanation is that the more balanced topology of the 10-taxon tree makes it easier for the methods to reconstruct it. The correct ingroup is reconstructed most frequently for the 8-taxon (ingroup alone) data sets (table 3) and more frequently for the 10-taxon data set than for the 9-taxon data sets. Thus, the inclusion of a single-taxon outgroup disrupts the correctly constructed ingroup more frequently than does the inclusion of the 2 related outgroup taxa.


Table 3 Ingroup Tree Accuracy

 
Placement of a Second Outgroup Taxon
Biologists often face the problem of choosing good outgroup taxa for tree reconstruction. In this simulation, we tested how the placement of the second outgroup taxon affects the accuracy of the methods in reconstructing the ingroup, that is, the 8-taxon tree. The ability of the methods to reconstruct the ingroup for different values of z (the expected number of substitutions on the edge connecting the outgroups' common ancestor to the ingroup) and w (the expected number of substitutions on the edge of each outgroup taxon) was compared. In addition, the outcomes of these runs were compared with the corresponding results for 9 and 8 taxon (all phylogenetic methods used were applied to the same data sets). The results are shown in figure 8, where the accuracy of the methods in reconstructing the ingroup tree using the 8, 9, and 10 taxon (unconstrained) is presented. The results are categorized into 8 categories: "rrr," ingroup correct in all (8, 9, and 10 taxon); "rrw," ingroup wrong in the 10 taxon but correct in the 8 and 9 taxon; "rwr," ingroup correct in the 8 and 10 taxon but wrong in the 9 taxon; "rww," ingroup wrong in both the 9 and 10 taxon but correct in the 8 taxon; "wrr," ingroup correct in the 9 and 10 taxon but wrong in the 8 taxon; "wrw," ingroup wrong in the 8 and 10 taxon but correct in the 9 taxon; "wwr," ingroup correct in the 10 taxon but wrong in the 8 and 9 taxon; "www," ingroup wrong in all (8, 9, and 10 taxon).
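The 3-letter codes can be generated mechanically from the 3 outcomes for each data set, in the order 8-, 9-, and 10-taxon. A minimal sketch, with hypothetical function and variable names and made-up outcomes used only to show the tally:

```python
from collections import Counter


def ingroup_code(correct_8, correct_9, correct_10):
    """Encode ingroup correctness for the 8-, 9-, and 10-taxon analyses
    as 'r' (right) or 'w' (wrong), in that order."""
    return "".join("r" if ok else "w" for ok in (correct_8, correct_9, correct_10))


# Illustrative outcomes for 4 data sets (not results from the study).
outcomes = [(True, True, True), (True, False, True),
            (True, True, True), (False, False, False)]
tally = Counter(ingroup_code(*o) for o in outcomes)
percentages = {code: 100.0 * n / len(outcomes) for code, n in tally.items()}
print(percentages)
```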


FIG. 8.— Classification of the 10-taxon results into the 8 possible combinations (see text). In this simulation, a 10-taxon tree with parameters x = 0.015, y = 0.2, and z + w = 0.3 was used, z varied from 0 to 0.3 in steps of 0.05. For each method, the percentage of trees out of 1,000 trees constructed, in each class described in the text, is shown for each length of the edge connecting the outgroup taxa to ingroup tree. The results are shown for sequence length l = 1,600.

 
As expected, ML constructed the correct ingroup (for 8, 9, and 10 taxon) more frequently than did the other methods. Although uNJ performed slightly better than cNJ, both constructed the correct trees with similar frequencies (the parameters used obey the molecular clock assumption). MP reconstructed the ingroup correctly for all in only about 55% of the cases; however, it had the lowest percentage of "www" (wrong in all). Moreover, when the common ancestor of the 2 outgroup taxa was close to the ingroup (z = 0.05), MP reconstructed the ingroup tree correctly for the 8- and 10-taxon data approximately 95% of the time. In addition, MP has the highest percentage of runs in which the ingroup was reconstructed correctly in the 8 and 10 taxon but was wrong in the 9-taxon data ("rwr") and the lowest percentage of runs in which the ingroup was wrong for the 8-taxon data but was right for the others ("wrr"). These results are as expected, taking into account the bias parsimony has toward forming cherries. Cases in which the methods construct the ingroup incorrectly from the 8- and 10-taxon data sets while reconstructing the correct ingroup from the 9-taxon data set are very rare (<2%).

Finally, we tested the accuracy of the phylogenetic methods in reconstructing the 10-taxon tree for different numbers (z) of substitutions per site on the edge connecting the common ancestor of the 2 outgroup taxa to the ingroup and the effect of constraining the 2 outgroup taxa to be together. The results are shown in figure 9. The closer the common ancestor of the 2 outgroup taxa was to the ingroup (the further the 2 outgroup taxa are from each other), the more accurate the methods were in reconstructing the 10-taxon generating tree. However, it appears advantageous for z to be larger than 0 (such that there is a split separating the outgroup taxa from the ingroup). This trend is very obvious for MP, where the accuracy dropped very rapidly as the common ancestor of the 2 outgroup taxa became further from the ingroup. This trend is also noticeable for uNJ and cNJ where a more moderate change in accuracy was observed. For ML, although only a very slight drop in accuracy was found, the general trend still applies. We also found that for z = 0 constraining the 2 outgroup taxa to come together had a positive effect on the accuracy of all the methods, both in reconstructing the ingroup tree and in placing the outgroup taxa in the correct position. When z > 0, for long sequences, constraining the 2 outgroup taxa to come together did not affect the accuracy with which the methods reconstructed the ingroup tree and placed the outgroup taxa (fig. 9a). However, for short sequences and small values of z, a slight improvement was recorded (fig. 9b).


FIG. 9.— Accuracy in constructing the 10-taxon tree for different number (z) of substitutions per site on the edge connecting the common ancestor of the 2 outgroup taxa to the ingroup, with and without constraining the 2 outgroup taxa to be together. A 10-taxon tree with parameters x = 0.015, y = 0.2, and z + w = 0.3 was used, z varied from 0 to 0.3 in steps of 0.05. (a) The results for sequence length l = (200, 400); scale = (0, 25). (b) The results for sequence length l = (800, 1,600); scale = (0, 100). All methods are more accurate when the 2 outgroup taxa used are separated from the ingroup with a common nonzero branch and when the common ancestor of the 2 outgroup taxa is close to the ingroup.

 

    Discussion
In this simulation study, we have identified problems that are likely to affect the ability of phylogenetic methods to reconstruct tree topologies corresponding to rapid radiations (where there is a combination of short internal and long external branches). Rapid radiations are often star-like, and it is therefore important to identify possible biases in reconstructing a star tree. We established that MP, cMP, uNJ, and cNJ are all biased toward forming cherries (see fig. 4). This effect is most pronounced for MP, for which trees having 4 cherries were chosen far more often than any other topology even though the generating tree had no cherries. ML seems to be biased in a different way; it appears to collapse edges that are not adjacent to cherries. All methods are biased toward a high number of internal edges, as none of the methods was successful in recovering the star tree, even when collapsing was allowed. This effect is similar to the Bayesian "star paradox," where sequences that have evolved on a star tree can give branches with posterior probability close to 1. Steel and Matsen (2007) showed that for Bayesian analysis, this effect is not expected to automatically vanish given long enough sequences. Topological biases, such as the bias toward forming cherries found here, may work either against or in favor of the methods in reconstructing trees (depending on the true topology of the tree).
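To make the notion of a cherry concrete, the sketch below counts cherries in an unrooted tree given as an edge list. It is an illustration rather than the procedure used in the study, and it rests on one stated assumption: a cherry is counted as an internal vertex of degree 3 with exactly 2 leaf neighbours, so that a star tree (one central vertex of high degree) has no cherries, matching the description above. The vertex names are hypothetical.

```python
from collections import defaultdict


def count_cherries(edges):
    """Count cherries in an unrooted tree given as an iterable of (u, v) edges.

    Assumption: a cherry is an internal vertex of degree 3 that is adjacent
    to exactly 2 leaves.
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    leaves = {v for v, nb in neighbours.items() if len(nb) == 1}
    return sum(
        1
        for v, nb in neighbours.items()
        if v not in leaves and len(nb) == 3 and len(nb & leaves) == 2
    )


# Star tree on 8 taxa: a single internal vertex joined to every leaf.
star = [("centre", taxon) for taxon in "abcdefgh"]

# Fully symmetric 8-taxon tree: ((a,b),(c,d)) joined to ((e,f),(g,h)).
symmetric = [
    ("ab", "a"), ("ab", "b"), ("cd", "c"), ("cd", "d"),
    ("ef", "e"), ("ef", "f"), ("gh", "g"), ("gh", "h"),
    ("ab", "abcd"), ("cd", "abcd"), ("ef", "efgh"), ("gh", "efgh"),
    ("abcd", "efgh"),
]

print(count_cherries(star), count_cherries(symmetric))
```

Under this definition the star tree has 0 cherries and the symmetric 8-taxon tree has 4, the maximum possible for 8 taxa.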

Our findings indicate that rooting a star-like tree (many short internal branches connecting long external branches), by joining distant outgroup taxa to a short internal edge, often prevents the correct construction of the ingroup tree (see table 3 and fig. 5). The effect is particularly strong when an outgroup taxon and an ingroup taxon share a higher substitution rate (fig. 7). In many of the cases tested, the outgroup was placed 2 branches away from the correct position. For our data, an important finding is that when a tree rooted by an outgroup is in disagreement with the unrooted ingroup tree, the unrooted ingroup tree is most often correct. For the cases tested here, we found that the use of 2 outgroup taxa is better than the use of a single outgroup taxon, both for the accuracy with which a tree is rooted and for maintaining the correct ingroup tree (see tables 2 and 3). However, ingroup tree reconstruction is more accurate when the methods are applied to the ingroup alone (see table 3 and fig. 8). We also found that using 2 outgroup taxa that are distant from each other is better than using 2 closely related outgroup taxa; this is especially true for MP. For the trees and parameters tested here, and for short sequence lengths, constraining the 2 outgroup taxa to come together is generally advantageous, especially when they are not closely related (see fig. 9b). However, for longer sequences, constraining the outgroup taxa to come together does not have an effect on the accuracy of the methods (see fig. 9a). In general, our results confirm that it is "best practice" to infer phylogeny both with and without an outgroup and then compare the results.

Correcting MP for multiple changes was found to be beneficial in the cases where the molecular clock assumption is valid, particularly in cases where MP is misleading or inconsistent. A possible explanation for this is that with the given tree topology under the molecular clock, MP suffers from long-branch attraction. With our parameters, cMP does not suffer from long-branch attraction and therefore estimates the correct tree more accurately. Nevertheless, under the set of parameters used here, when the molecular clock assumption is violated, MP does not suffer from long-branch attraction and is indeed biased toward the correct tree. Consequently, under our conditions when the molecular clock assumption is violated, MP is more accurate than cMP in reconstructing the correct tree. This effect is likely to be a characteristic of the highly symmetric model tree.

In the cases where the molecular clock assumption is valid, uNJ was found to be more accurate than cNJ in reconstructing both the 9-taxon tree as a whole and the relationships among the ingroup taxa (see table 1). However, when this assumption is violated, by breaking the symmetry of the tree, cNJ and ML were found to be more accurate than uNJ and MP. This effect under the molecular clock may be due to amplification of sampling error and/or because the standard correction has a bias toward overcorrecting. These results are consistent with those found in other simulation studies (Sourdis and Krimbas 1987; Saitou and Imanishi 1989; Holland et al. 2003), where corrections for multiple substitutions were found to be helpful only for recovering trees with unequal rates of change along branches. Nei and Kumar (2000) offered guidelines for constructing phylogenetic trees and our results support their argument that uncorrected distances give the correct tree more often than corrected distances when the rate of nucleotide substitution is nearly the same for all evolutionary lineages and there is no strong transition/transversion bias.
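The amplification of sampling error can be illustrated with a short calculation. The sketch below assumes that the correction in question is the Jukes-Cantor distance, d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites; it then uses the usual first-order (delta-method) approximation for the standard deviation of d. The function names and the example values (a path length of 0.4 and 1,600 sites) are illustrative choices, not results from the study.

```python
import math


def jc_distance(p: float) -> float:
    """Jukes-Cantor corrected distance from the observed proportion of differences p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc_expected_p(d: float) -> float:
    """Expected proportion of differing sites for a path of d substitutions per site."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def sd_uncorrected(p: float, n_sites: int) -> float:
    """Binomial standard deviation of the observed proportion p over n_sites."""
    return math.sqrt(p * (1.0 - p) / n_sites)


def sd_corrected(p: float, n_sites: int) -> float:
    """First-order standard deviation of the corrected distance.

    The derivative of jc_distance with respect to p is 1 / (1 - 4p/3),
    so the sampling error of p is inflated by that factor.
    """
    return sd_uncorrected(p, n_sites) / (1.0 - 4.0 * p / 3.0)


# Illustrative values only: a path of 0.4 substitutions per site, 1,600 sites.
p = jc_expected_p(0.4)
print(p, jc_distance(p), sd_uncorrected(p, 1600), sd_corrected(p, 1600))
```

The inflation factor 1 / (1 - 4p/3) grows without bound as p approaches 0.75, so the longer the path between two taxa, the noisier its corrected distance; this is one way to see why correction may not pay off when rates are equal across lineages.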

Our results support the observation of Holland et al. (2003) that methods can be consistent but misleading (even in the absence of model misspecification). We observed a misleading zone for MP where, although the frequency with which the correct tree is found tends to 1 as the sequence length l tends to infinity, for finite yet very long sequences, a number of incorrect trees are each chosen more frequently than the correct tree.

Holland et al. (2003) considered the boundary of the consistency zone for MP, that is, the part of the parameter space where a slight change in the edge lengths makes parsimony either consistent or inconsistent. For 5-taxon trees with 2-state data, they calculated that each of the 4 incorrect trees where the outgroup is drawn to an external edge is selected by MP twice as frequently as the correct tree. In the 5-taxon case, there are only 6 splits for which the number of substitutions needed is not the same in the 5 competing trees. All 6 splits have the same expected frequency on the boundary of MP inconsistency. Two of those splits support the correct tree and each of these has to independently compete with 2 of the other splits (see Holland et al. 2003). The calculation for 4-state data is more complex, but we suspect that the ratio between the correct tree and each of the frequently selected incorrect trees will be equivalent to that of the 2-state data. The calculation for the 9-taxon tree is more difficult, as there are many interdependent splits. Therefore, further mathematical work is required to calculate the ratio between the correct tree and each of the incorrect trees, where the outgroup is drawn to an external edge, and to evaluate the effect of the number of taxa on the frequency with which the correct tree (with the outgroup in its correct placement) is found.

Although this study specifically tested the effects on the reconstruction of an 8-taxon symmetric tree and a simple (biologically oversimplified) substitution model, the problems reported are expected to exist in larger trees and with more complex models (in which the Jukes–Cantor model is nested). Using a complex model of sequence evolution would not have ensured that any tree-estimation properties found were general. In our study, we used 4-state data, which are the natural biological language and are known to saturate slightly more slowly than 2-state data (Penny et al. 2001). It would be interesting to test the methods further using 20-state amino acid data. Bayesian phylogenetic analysis was suggested to be as robust to relative branch-length differences as ML (Mar, Harlow, and Ragan 2005); therefore, it would also be interesting to test Bayesian inference for the cases studied here.


    Supplementary Material
Supplementary material S1 is available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).


    Acknowledgements
We thank Klaus Schliep for R-code to generate graphs and statistical advice, and we thank Warwick Allen for computer support. We also thank the Marsden Fund and the Foundation for Research Science and Technology for funding. This study would not have been possible without the use of the Helix parallel computing facility (http://helix.massey.ac.nz).


    Footnotes
 
Arndt Von Haeseler, Associate Editor


    References

    Bergsten J. A review of long-branch attraction. Cladistics (2005) 21:163–193.

    Bruno WJ, Halpern AL. Topological bias and inconsistency of maximum likelihood using wrong models. Mol Biol Evol. (1999) 16:564–566.

    Felsenstein J. Cases in which parsimony or compatibility methods will be positively misleading. Syst Zool. (1978) 27:401–410.

    Felsenstein J. Inferring phylogenies (2004) Sunderland (MA): Sinauer Associates.

    Goremykin VV, Holland B, Hirsch-Ernst KI, Hellwig FH. Analysis of Acorus calamus chloroplast genome and its phylogenetic implications. Mol Biol Evol. (2005) 22:1813–1822.

    Harrison GL, McLenachan PA, Phillips MJ, Slack KE, Cooper A, Penny D. Four new avian mitochondrial genomes help get to basic evolutionary questions in the late Cretaceous. Mol Biol Evol. (2004) 21:974–983.

    Hendy MD, Little CHC, Penny D. Comparing trees with pendant vertices labelled. SIAM J Appl Math. (1984) 44:1054–1065.

    Hendy MD, Penny D. A framework for the quantitative study of evolutionary trees. Syst Zool. (1989) 38:297–309.

    Hendy MD, Penny D. Spectral-analysis of phylogenetic data. J Classif. (1993) 10:5–24.

    Ho SYW, Jermiin LS. Tracing the decay of the historical signal in biological sequence data. Syst Biol. (2004) 53:623–637.

    Holland BR, Penny D, Hendy MD. Outgroup misplacement and phylogenetic inaccuracy under a molecular clock: a simulation study. Syst Biol. (2003) 52:229–238.

    Jukes TH, Cantor CR. Evolution of protein molecules. In: Munro HN, ed. Mammalian protein metabolism. (1969) New York: Academic Press. 21–132.

    Leebens-Mack J, Raubeson LA, Cui LY, Kuehl JV, Fourcade MH, Chumley TW, Boore JL, Jansen RK, dePamphilis CW. Identifying the basal angiosperm node in chloroplast genome phylogenies: sampling one's way out of the Felsenstein zone. Mol Biol Evol. (2005) 22:1948–1963.

    Lin YH, McLenachan PA, Gore AR, Phillips MJ, Ota R, Hendy MD, Penny D. Four new mitochondrial genomes and the increased stability of evolutionary trees of mammals from improved taxon sampling. Mol Biol Evol. (2002) 19:2060–2070.

    Lin YH, Waddell PJ, Penny D. Pika and vole mitochondrial genomes increase support for both rodent monophyly and glires. Gene (2002) 294:119–129.

    Lockhart PJ, Cameron SA. Trees for bees. Trends Ecol Evol. (2001) 16:84–88.

    Lockhart PJ, Penny D. The place of Amborella within the radiation of angiosperms. Trends Plant Sci. (2005) 10:201–202.

    Mar JC, Harlow TJ, Ragan MA. Bayesian and maximum likelihood phylogenetic analyses of protein sequence data under relative branch-length differences and model violation. BMC Evol Biol. (2005) 5:8.

    McKenzie A, Steel M. Distributions of cherries for two models of trees. Math Biosci. (2000) 164:81–92.

    Nei M, Kumar S. Molecular evolution and phylogenetics (2000) New York: Oxford University Press.

    Penny D, Hendy MD, Lockhart PJ, Steel MA. Corrected parsimony, minimum evolution, and Hadamard conjugations. Syst Biol. (1996) 45:596–606.

    Penny D, Hendy MD, Steel MA. Testing the theory of descent. In: Miyamoto MM, Cracraft J, eds. Phylogenetic analysis of DNA sequences. (1991) New York: Oxford University Press. 155–183.

    Penny D, McComish BJ, Charleston MA, Hendy MD. Mathematical elegance with biochemical realism: the covarion model of molecular evolution. J Mol Evol. (2001) 53:711–723.

    Philip GK, Creevey CJ, McInerney JO. The Opisthokonta and the Ecdysozoa may not be clades: stronger support for the grouping of plant and animal than for animal and fungi and stronger support for the Coelomata than Ecdysozoa. Mol Biol Evol. (2005) 22:1175–1184.

    Philippe H, Lartillot N, Brinkmann H. Multigene analyses of bilaterian animals corroborate the monophyly of Ecdysozoa, Lophotrochozoa, and Protostomia. Mol Biol Evol. (2005) 22:1246–1253.

    Poe S, Swofford DL. Taxon sampling revisited. Nature (1999) 398:299–300.

    Rambaut A, Grassly NC. Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees. Comput Appl Biosci. (1997) 13:235–238.

    Saitou N, Imanishi T. Relative efficiencies of the Fitch-Margoliash, maximum-parsimony, maximum-likelihood, minimum-evolution, and neighbor-joining methods of phylogenetic tree construction in obtaining the correct tree. Mol Biol Evol. (1989) 6:514–525.

    Siddall ME. Success of parsimony in the four-taxon case: long-branch repulsion by likelihood in the Farris zone. Cladistics (1998) 14:209–220.

    Slack KE, Janke A, Penny D, Arnason U. Two new avian mitochondrial genomes (penguin and goose) and a summary of bird and reptile mitogenomic features. Gene (2003) 302:43–52.

    Soltis DE, Albert VA, Savolainen V. (11 co-authors). Genome-scale data, angiosperm relationships, and ‘ending incongruence’: a cautionary tale in phylogenetics. Trends Plant Sci. (2004) 9:477–483.

    Sourdis J, Krimbas C. Accuracy of phylogenetic trees estimated from DNA-sequence data. Mol Biol Evol. (1987) 4:159–166.

    Steel M, Matsen FA. The Bayesian ‘star paradox’ persists for long finite sequences. Mol Biol Evol. (2007) 24:1075–1079.

    Steel MA, Hendy MD, Penny D. Parsimony can be consistent. Syst Biol. (1993) 42:581–587.

    Stefanovic S, Rice DW, Palmer JD. Long branch attraction, taxon sampling, and the earliest angiosperms: Amborella or monocots? BMC Evol Biol. (2004) 4:35.

    Sullivan J, Joyce P. Model selection in phylogenetics. Annu Rev Ecol Evol Syst. (2005) 36:445–466.

    Sullivan J, Swofford DL. Should we use model-based methods for phylogenetic inference when we know that assumptions about among-site rate variation and nucleotide substitution pattern are violated? Syst Biol. (2001) 50:723–729.

    Swofford DL. PAUP*: phylogenetic analysis using parsimony (*and other methods). Version 4 (2002) Sunderland (MA): Sinauer Associates.

    Swofford DL, Waddell PJ, Huelsenbeck JP, Foster PG, Lewis PO, Rogers JS. Bias in phylogenetic estimation and its relevance to the choice between parsimony and likelihood methods. Syst Biol. (2001) 50:525–539.

    Yang ZH. How often do wrong models produce better phylogenies? Mol Biol Evol. (1997) 14:105–108.

Accepted for publication August 10, 2007.

