MBE Advance Access originally published online on May 9, 2007
Molecular Biology and Evolution 2007 24(8):1722-1730; doi:10.1093/molbev/msm094
Research Articles |
Rates of Genome Evolution and Branching Order from Whole Genome Analysis

* John Curtin School of Medical Research, The Australian National University, Canberra, Australia
Walter and Eliza Hall Institute, Parkville, Victoria, Australia
E-mail: Gavin.Huttley{at}anu.edu.au.
| Abstract |
|---|
Accurate estimation of any phylogeny is important as a framework for evolutionary analysis of form and function at all levels of organization from sequence to whole organism. Using alignments of nonrepetitive components of opossum, human, mouse, rat, and dog genomes we evaluated two alternative tree topologies for eutherian evolution. We show with very high confidence that there is a basal split between rodents (as represented by the mouse and rat) and a branch joining primates (as represented by humans) and carnivores (as represented by dogs), consistent with some but not the most widely accepted mammalian phylogenies. The result was robust to substitution model choice, with equivalent inference returned from a spectrum of models ranging from a general time reversible model to a model that treated nucleotides as either purines or pyrimidines, and variants of these that incorporated rate heterogeneity among sites. By determining this particular branching order we are able to show that the rate of molecular evolution is almost identical in rodent and carnivore lineages and that sequences evolve ~11%–14% faster in these lineages than in the primate lineage. In addition, by applying the chicken as outgroup, the analyses suggested that the rate of evolution in all eutherian lineages is ~30% slower than in the opossum lineage. This pattern of relative rates is inconsistent with the hypothesis that generation time is an important determinant of substitution rates and, by implication, mutation rates. Possible factors causing rate differences between the lineages include differences in DNA repair and replication enzymology, and shifts in nucleotide pools. Our analysis demonstrates the importance of using multiple sequences from across the genome to estimate phylogeny and relative evolutionary rate in order to reduce the influence of distorting local effects evident even in relatively long sequences.
Key Words: mutation; substitution rate; male-biased evolution; eutherian phylogeny
| Introduction |
|---|
The high level of interest in the biology of eutherian mammals is reflected in the number of mammalian genome projects completed or underway. Much of the value of sequencing multiple genomes comes from the comparisons that can be made among them. Comparative analysis can be used to investigate important biological processes from rates of nucleotide sequence evolution to the direction of morphological change, and to understand relationships among these different processes. However, because species arise through a series of hierarchical branching events during evolution they do not have independent evolutionary histories so that when they are analyzed comparatively their phylogenetic relationships must be taken into account. Accurate estimation of phylogeny is thus essential for most comparative evolutionary analyses (Harvey and Pagel 1991).
Despite the importance of reliable phylogenies for the interpretation of mammalian evolution, consensus has not been reached as to some basal divergences. Early efforts at estimating the phylogeny used comparison of the anatomy of fossil and extant taxa. Neither was a good source of phylogenetic information. Data were interpreted as indicating a radiation rather than a branching phylogeny. Very few eutherian fossils were known from the Cretaceous and many of the orders appear at approximately the same time in the Paleocene, suggesting an adaptive radiation following the extinction of the dinosaurs (Benton 1999). Sequence-based phylogenetic analysis does not support this view, indicating instead that the ordinal branches were clearly separated in time. Most authors now interpret the sequence data as indicating that much of the ordinal divergence occurred much earlier than previously thought, during the Cretaceous (Graur 1993; Easteal 1999; Penny et al. 1999; Arnason et al. 2000; Waddell, Kishino, and Ota 2001; Bininda-Emonds et al. 2007). The discovery of an increasing diversity of eutherian fossils from the Cretaceous supports this view (Hu et al. 2005).
One phylogenetic issue important in the assessment of evolutionary rates is the relative divergence order of Primates, Rodentia, Carnivora, and Artiodactyla. Easteal (1985, 1988, 1990) first suggested that among these orders, there is a basal split between rodents and a branch including carnivores and primates (rodent-first theory), and that an early rodent divergence required a revision of the prevailing theory of the evolutionary rate of nucleotide sequences. Previous analysis, assuming simultaneous divergences of the different orders, suggested that the rate of nucleotide sequence evolution was much faster in rodents than primates (Wu and Li 1985). This interpretation led to the theory that the neutral rate of nucleotide sequence evolution, and hence the rate of mutation, was affected by generation time, since rodents have much shorter generation times than primates. The rate difference was not apparent when an early rodent divergence was assumed (Easteal 1992; Kumar and Subramanian 2002). This branching order appeared to be resolved by a more extensive analysis that refuted the rodent-first theory, indicating, with high confidence, that carnivores diverged first (Murphy et al. 2001). This topology is now assumed in many comparative genomic analyses. The work has also stimulated considerable debate, with numerous authors questioning the magnitude of support for this topology (Misawa and Nei 2003; Kullberg et al. 2006). Analysis of different mixtures of the nuclear and mitochondrial genes used by Murphy et al. (2001) showed the vulnerability of support for this topology to the choice of genes used in the analysis, and to differences in the amount of data obtained from each of the sampled lineages (Misawa and Nei 2003). Another critical factor affecting phylogenetic estimates, which was evident at 3rd codon positions of the Murphy et al. data (Kullberg et al. 2006), is systematic variation in the composition (as distinct from the sequence) of nucleotides among compared sequences (Ho and Jermiin 2004; Phillips et al. 2004). Using amino acid sequences (i.e., avoiding synonymous 3rd codon variation) does not entirely overcome the problem as amino acids also vary in composition among lineages (Knight, Freeland, and Landweber 2001).
In support of the carnivore-first branching order have been publications that use retrotransposon insertion events as phylogenetic markers (Kriegs et al. 2006; Nishihara, Hasegawa, and Okada 2006). The use of these multi-gene families for phylogenetic reconstruction is a relatively recent development, yet known aspects of transposable element biology suggest that homoplasies probably affect these molecular markers, casting doubt on results obtained from their application. First, evidence supports ectopic exchange, in the form of nonhomologous recombination and/or gene conversion between element family members, as a major evolutionary force affecting transposable elements (Langley et al. 1988; Huttley, MacRae, and Clegg 1995; Abrusan and Krambeck 2006). Establishing unambiguously the orthology of observed elements is therefore difficult. Second, element insertions are not random with respect to sequence composition and/or chromatin status as shown by the existence of insertion hotspots (Cantrell et al. 2001). Third, lineage sorting of element insertions, which can result in character distributions incongruent with the true species relationships, has been reported (Nishihara, Hasegawa, and Okada 2006). Finally, transposable elements are increasingly being found to have important roles in genome evolution (Bejerano et al. 2006; Kamal, Xie, and Lander 2006) and their status as neutral genetic markers therefore cannot be assumed.
Recent studies strongly indicate that adding more genes is a better way of improving phylogenetic estimates than adding more taxa (see for example, Rosenberg and Kumar 2001; Rokas and Carroll 2005). Furthermore, the enormous volume of data available from analysis of whole genomes coupled with the statistical consistency of maximum likelihood (Chang 1996) suggests that selection of genome blocks and use of appropriate sequence-evolution models should provide very robust phylogenetic estimates. Here we investigate the relatively simple phylogenetic question of the branching order of Primates, Rodentia, and Carnivora, based on comparison of multiple 10kb blocks of nonrepetitive sequences from the human, rat, mouse, and dog genomes, using the marsupial opossum genome (Mikkelsen et al. 2007) as an outgroup reference. We focus on comparing only the rodent-first and carnivore-first topologies. As no previous studies have found substantial support for the third possible, primate-first, topology, we do not consider it here. Using the resulting best-supported tree topology we revisit the question of relative rates of sequence evolution in the lineages of these different eutherian orders.
Another approach to investigating factors that affect mutation and substitution rate is to compare evolutionary rates of nucleotide sequences on sex chromosomes and autosomes, since these spend different amounts of time in male and female lineages, which in turn have different properties. The most obvious of these, and the one most discussed, is generation time; in mammals there are more cell generations in males. Y-chromosomes spend all their time in male lineages and X-chromosomes spend two-thirds of their time in female lineages and one-third in male lineages, whereas autosomes spend an equal amount of time in both lineages. These differences lead to clear predictions about the relative evolutionary rates of nucleotide sequences on these chromosomes, if generation time has an important influence on mutation rate. However, rate differences between chromosomes need to be interpreted with caution as sex-biased evolution may result from other factors, such as methylation pattern, that differ between male and female lineages (Easteal, Collet, and Betty 1995; Huttley et al. 2000).
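To make the predictions about relative rates concrete, the following minimal sketch computes the expected X-to-autosome and Y-to-autosome rate ratios from the fraction of time each chromosome type spends in the male and female germlines. It assumes that substitution rate is proportional to mutation rate and that the germlines differ only in a male-to-female mutation rate ratio; the name `alpha` and the function name are introduced here for illustration only.

```python
def relative_rates(alpha):
    """Expected X/A and Y/A substitution-rate ratios.

    alpha is the (assumed) male-to-female mutation rate ratio.
    Time in each germline: X is 2/3 female and 1/3 male, Y is all male,
    autosomes are 1/2 female and 1/2 male.
    """
    mu_f, mu_m = 1.0, float(alpha)      # female rate set to 1 (arbitrary units)
    autosome = (mu_f + mu_m) / 2
    x_chrom = (2 * mu_f + mu_m) / 3
    y_chrom = mu_m
    return x_chrom / autosome, y_chrom / autosome


if __name__ == "__main__":
    for a in (1, 3, 6):
        xa, ya = relative_rates(a)
        print(f"alpha={a}: X/A={xa:.3f}, Y/A={ya:.3f}")
```

With `alpha = 1` both ratios equal 1, so any departure of X/A below 1 or Y/A above 1 is what a male-biased mutation process would predict under these assumptions.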
The marsupial genome sequence reveals a 46.85 Megabase region translocated from an autosomal region to the X chromosome in the mammalian common ancestor, before the divergence of eutherian mammals (Mikkelsen et al. 2007). We use a comparative analysis of this large region and X to autosome ratios to investigate if an accurate estimate of the extent of sex-biased evolution in mammals can be made.
| Materials and Methods |
|---|
We use Mlagan (Brudno et al. 2003
The result was a set of alignments that contained either strictly Autosomal-linked blocks (the A set); X-linked blocks (the X set) or, Eutherian X-linked with Marsupial and Bird Autosomal-linked blocks (the A:X set). The alignments contain a mixture of coding and noncoding sequence. Only alignments longer than 10 kb were retained.
Our focus is on substitutions resulting from single nucleotide mutations and the probabilistic evolutionary models we use are restricted to this kind of variation. We therefore eliminated from the alignments all columns with sequence differences likely to derive from other kinds of mutation such as slipped-strand-mispairing. Runs of 4 or more mono- to tetra-nucleotide repeats were replaced by Ns and columns containing gaps or consisting only of Ns were removed.
We split all alignments into 10 kb blocks and, given the ample volume of data available, discarded remaining segments < 10 kb. Splitting the data into identically sized multiple blocks in this way allowed us to assess the influence of genomic region on evolutionary rate and phylogenetic estimation. Base composition varies across mammalian genomes because of differences in mutational and/or DNA repair processes, and the resulting compositional bias is likely to affect rates of substitution (Wolfe, Sharp, and Li 1989) and phylogenetic estimation (Ho and Jermiin 2004; Phillips et al. 2004).
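The column filtering and block splitting described in the two preceding paragraphs can be sketched as follows. This is an illustration under stated assumptions, not the pipeline actually used: the regular expression reads "runs of 4 or more mono- to tetra-nucleotide repeats" as a unit of 1 to 4 nucleotides repeated at least four times in a row, masking is applied to each aligned row separately, and all function names are hypothetical.

```python
import re

# Assumed reading of a simple repeat: a 1-4 nt unit occurring >= 4 times in a row.
SIMPLE_REPEAT = re.compile(r"([ACGT]{1,4})\1{3,}")


def mask_repeats(seq):
    """Replace each simple-repeat run with Ns of the same length."""
    return SIMPLE_REPEAT.sub(lambda m: "N" * len(m.group(0)), seq)


def filter_columns(rows):
    """Drop alignment columns that contain a gap or consist only of Ns."""
    kept = [col for col in zip(*rows)
            if "-" not in col and set(col) != {"N"}]
    if not kept:
        return ["" for _ in rows]
    return ["".join(chars) for chars in zip(*kept)]


def split_blocks(rows, block_len=10_000):
    """Cut an alignment into full-length blocks; the short remainder is discarded."""
    n_full = len(rows[0]) // block_len
    return [[row[i * block_len:(i + 1) * block_len] for row in rows]
            for i in range(n_full)]


if __name__ == "__main__":
    raw = ["ACGTAAAAAC-GTCA", "ACGTAAAAACGGTCA"]   # toy two-sequence alignment
    masked = [mask_repeats(s) for s in raw]
    filtered = filter_columns(masked)
    print(filtered)
    print(split_blocks(filtered, block_len=4))
```

The toy example uses a block length of 4 so that the discarded remainder is visible; the analysis described above corresponds to the default of 10,000 columns.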
We assessed two alternative tree topologies: the carnivore-first tree (Carnivores, (Rodents, Primates)) (Murphy et al. 2001), which places carnivores as the outgroup to rodents and primates; and the rodent-first tree (Rodents, (Carnivores, Primates)) (Easteal 1988) which places rodents as the outgroup (fig. 1A and 1B, respectively). Because these lineages have diverged in their base composition (GC%) and the maximum-likelihood reconstruction methods employed assume compositional homogeneity, we sought to minimize the influence of this violation in branch length estimation in two ways: (1) we recoded the sequences into a 2-state alphabet of purines (R) and pyrimidines (Y) (Phillips et al. 2004); (2) any of these R/Y recoded alignment blocks in which a pair of sequences showed a nominally significant departure (P < 0.05) from compositional homogeneity in a χ² goodness-of-fit test was excluded.
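As an illustration of the recoding and the compositional filter, the sketch below recodes nucleotides to R/Y and applies a pairwise chi-square test with one degree of freedom to the R and Y counts of two sequences. The exact form of the test used in the analysis is not given in the text, so the 2 x 2 contingency formulation and all function names here are assumptions.

```python
import math

RY = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def recode_ry(seq):
    """Recode nucleotides as purine (R) or pyrimidine (Y); anything else becomes N."""
    return "".join(RY.get(base, "N") for base in seq.upper())


def homogeneity_p(ry1, ry2):
    """P-value of a 1-df chi-square test that two RY sequences share one composition."""
    table = [[s.count("R"), s.count("Y")] for s in (ry1, ry2)]
    total = sum(map(sum, table))
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    chi2 = 0.0
    for row in table:
        for j in range(2):
            expected = sum(row) * col_totals[j] / total
            if expected > 0:                      # skip empty categories
                chi2 += (row[j] - expected) ** 2 / expected
    # Upper tail of chi-square with 1 df equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def block_is_homogeneous(ry_rows, threshold=0.05):
    """Keep a block only if no pair of sequences departs at the nominal level."""
    return all(homogeneity_p(a, b) >= threshold
               for i, a in enumerate(ry_rows) for b in ry_rows[i + 1:])


if __name__ == "__main__":
    rows = [recode_ry(s) for s in ("ACGTACGT", "AAGTCCGT", "ACGGACTT")]
    print(rows, block_is_homogeneous(rows))
```

Treating every aligned site as an independent observation is a simplification; it is used here only to keep the example short.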
We specified two substitution models for modeling sequence evolution. Both models are continuous time Markov processes of the two states R and Y. Equilibrium probabilities of R and Y were estimated as the averaged frequency across the sequences from an alignment block. In one model we allowed site heterogeneity using a discretized gamma (Γ) distribution (Yang 1997
), and tree topology (rodent-first, carnivore-first), the likelihood was maximized using first a global (simulated annealing) optimizer and then a local (Powell) optimizer with default settings.
As we only consider two possible tree topologies, we used a sign test from the difference in log-likelihoods to discriminate between them. Specifically, for each alignment segment we subtracted the log-likelihood estimated for the rodent-first topology (lnLR) from that estimated for the carnivore-first (lnLC) topology: Δ = lnLC − lnLR. A positive Δ indicates that the alignment segment supports the carnivore-first tree, and a negative Δ indicates support for the rodent-first tree. A two-sided sign test was applied (using PyEvolve) to the resulting set of signs to determine, first, whether there was a significant difference in support for the two topologies and, second, which topology was supported.
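The sign test on the per-segment log-likelihood differences can be sketched with an exact two-sided binomial calculation. This is a stand-alone illustration rather than the PyEvolve routine; the function name and the toy values are hypothetical, and dropping exact ties is an assumption.

```python
from math import comb


def sign_test(deltas):
    """Exact two-sided sign test on per-segment log-likelihood differences.

    A positive difference counts toward the carnivore-first tree and a
    negative one toward the rodent-first tree; ties are dropped.
    Returns (n_positive, n_negative, p_value).
    """
    n_pos = sum(d > 0 for d in deltas)
    n_neg = sum(d < 0 for d in deltas)
    n = n_pos + n_neg
    k = min(n_pos, n_neg)
    # Probability of a split at least this uneven under a fair coin, both tails.
    tail = sum(comb(n, i) for i in range(k + 1))
    p_value = min(1.0, 2 * tail / 2 ** n)
    return n_pos, n_neg, p_value


if __name__ == "__main__":
    # Toy differences: 8 segments favour rodent-first, 2 favour carnivore-first.
    toy = [-3.1, -0.8, -2.2, 0.4, -1.7, -5.0, 1.1, -0.3, -2.9, -1.2]
    print(sign_test(toy))
```

Because the test uses only the sign of each difference, it makes no assumption about the distribution of the log-likelihood differences themselves.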
We tested for equivalence of evolutionary rate between lineages using the likelihood form of the relative rate test (Muse and Weir 1992). These tests were conducted using the RY model and four-taxon trees consisting of the chicken and opossum outgroups and two eutherian lineages. All possible combinations of eutherian lineages were considered. For assessment of evolutionary rates of opossum against a eutherian lineage, a three-taxon tree was employed with chicken the designated outgroup. Under the null hypothesis of equal rate, the branches of two lineages of interest were set to be equal, and the model parameters were estimated by maximizing the likelihood. For the alternative hypothesis, the branch lengths were allowed to differ. A likelihood ratio (LR) was then determined as LR = 2(lnLalt − lnLnull), with one degree of freedom (df), equal to the difference in numbers of free parameters between the alternative and null hypotheses.
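For a single alignment block, the likelihood ratio statistic and its P-value under one degree of freedom can be computed as in the minimal sketch below. The log-likelihood values are invented for illustration and the function name is hypothetical; obtaining the maximized log-likelihoods themselves requires the model fitting described above.

```python
import math


def relative_rate_lrt(lnl_null, lnl_alt):
    """Likelihood ratio test of equal rates for one block.

    lnl_null: maximized log-likelihood with the two branches constrained equal.
    lnl_alt:  maximized log-likelihood with the two branches free.
    Returns (LR, p_value) with LR = 2 (lnl_alt - lnl_null) and 1 df.
    """
    lr = max(2.0 * (lnl_alt - lnl_null), 0.0)   # guard against optimizer noise
    # Upper tail of chi-square with 1 df equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value


if __name__ == "__main__":
    lr, p = relative_rate_lrt(lnl_null=-15230.4, lnl_alt=-15226.1)
    print(f"LR = {lr:.2f}, P = {p:.4f}")
```

Clamping LR at zero is a design choice for the sketch: the alternative model nests the null, so a negative value can only arise from incomplete optimization.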
To test the hypothesis of rate equivalence across alignments, we used the sum of their LRs on the assumption that alignments have evolved independently, with degrees of freedom equal to the number of alignment blocks. The probability that a LR of equal or greater size than the observed LR would occur by chance was estimated from the χ² distribution (Goldman 1993; Ota et al. 2000). We evaluated the proportion of a data set contributing to a significant LR test by determining the number of alignment segments for which the marginal LR statistic was significant, after applying a Bonferroni correction for the number of segments in the data set. We also determined the number of alignment segments for which the marginal LR statistic was significant at the nominal 0.05 level. All relative rate tests were implemented using PyEvolve and were conducted using the optimization procedure described above. On rejection of the null hypothesis of rate equivalence, a subsequent two-sided sign test was applied to assess whether there was a consistent direction to the rate differences. The number of alignment blocks for which species 1 of the eutherian pair had a longer branch length than species 2 was taken as the number of successes, the number of trials being the total number of alignment blocks.
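The pooled test and the per-block counts described above can be sketched as follows. The chi-square upper tail is computed with the closed-form finite series for integer degrees of freedom so that the example needs only the standard library; for large data sets a tested statistics library would be the safer choice. Function names and the example LR values are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term = total = 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd df: normal tail plus a finite series in odd powers of sqrt(x).
    chi = math.sqrt(x)
    term = chi
    total = term if df >= 3 else 0.0
    for r in range(2, (df - 1) // 2 + 1):
        term *= x / (2 * r - 1)
        total += term
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return math.erfc(chi / math.sqrt(2)) + 2 * density * total


def pooled_rate_test(block_lrs, level=0.05):
    """Pooled LR test plus counts of individually significant blocks.

    Returns (pooled_p, n_nominal, n_bonferroni): the P-value for the summed
    LR with df equal to the number of blocks, and the number of blocks whose
    own 1-df test is significant at `level` and at `level / n_blocks`.
    """
    n = len(block_lrs)
    pooled_p = chi2_sf(sum(block_lrs), n)
    per_block = [chi2_sf(lr, 1) for lr in block_lrs]
    n_nominal = sum(p < level for p in per_block)
    n_bonferroni = sum(p < level / n for p in per_block)
    return pooled_p, n_nominal, n_bonferroni


if __name__ == "__main__":
    print(pooled_rate_test([0.5, 4.2, 12.0, 1.1]))
```

In the toy input, two blocks pass the nominal threshold but only one survives the Bonferroni correction, which is the distinction the two counts are meant to capture.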
| Results |
|---|
We estimated the maximum-likelihood tree for each 10-kb alignment block. We then counted the number of alignment blocks that support each of the two alternative tree topologies and used that number as a measure of phylogenetic support. The substantial volume of aligned genome sequence data available means that this is a good estimate of the support for the topology of the whole genome. The rodent-first topology (fig. 1A) was strongly favored over the carnivore-first (fig. 1B) topology by our analysis. Of 462 compositionally homogeneous A (autosomal-linked) and X (X-linked) blocks (totaling 4.62 Mb of aligned sequences), 386 (~84%) supported the rodent-first tree, and 76 supported the carnivore-first tree (two-sided P = 1.1e-16; fig. 2). This pattern was robust to the data set and evolutionary model used: equivalent results were obtained from the A and X data sets analyzed separately (fig. 2 A and B), from analyses without the chicken sequence (results not shown), and from applying either the RY, RY+Γ, a nucleotide general time-reversible (GTR), or the GTR+Γ substitution model to these data sets (see Supplementary Material online). This consistency across evolutionary models with and without rate heterogeneity affirms strong support of the rodent-first topology and that this support is unlikely to result from long-branch attraction (Yang 1994).
Relative rate tests indicated a significant difference in evolutionary rate between marsupials and eutherians for all data sets (table 1), with eutherians having a relatively slower rate for nearly all alignment blocks. The difference in rate was greater for X alignments (mean eutherian/marsupial ratio 0.6041, s.d. = 0.1199) than A alignments (mean eutherian/marsupial ratio 0.6972, s.d. = 0.1342). This pattern is consistent with a reduced average rate of substitution affecting X-linked blocks arising from male-biased evolution in eutherians.
All data sets indicate that the primate lineage has evolved significantly more slowly than the other sampled eutherian lineages (table 1; fig. 3). This result is robust to whether a 3, 4, or (when possible) 6 taxon tree was used (results not shown). The mean proportion of primate to rodent substitution rates was 88.5% (s.d. =17%), and the mean proportion of primate to carnivore was 86% (s.d. =13%).
There is little evidence supporting a systematic rate difference between the carnivore and rodent lineages or between the rodent lineages. For the A and X data sets, the rates in the rodent lineages are marginally higher than in the carnivore lineage (table 1), whereas the reverse is true for the A:X (eutherian X-linked, marsupial autosomal) data set. Performing a sign test on these combined results did not allow rejection of the null hypothesis of rate equivalence (two-sided P
0.09, 0.25 for mouse and rat, respectively). Although the X data set supports the previously reported (Cooper et al. 2004
0.13).
All analyses demonstrated significant correlations between branch lengths (substitution rates) for the same alignment segment in different lineages (fig. 4). Thus, for example, the fastest evolving blocks between human and dog tend also to be the fastest evolving between mouse and rat. This is consistent with previous observations of local correlations in evolutionary rate that persist across lineages (Webster et al. 2004).
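A correlation of this kind can be illustrated with per-block branch lengths from two lineage pairs. The text does not state which correlation measure was used, so Pearson's r is an assumption here, and the branch lengths below are invented values.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of branch lengths."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical per-block distances for the same five alignment blocks.
    human_dog = [0.21, 0.34, 0.18, 0.40, 0.27]
    mouse_rat = [0.12, 0.19, 0.10, 0.24, 0.15]
    print(f"r = {pearson_r(human_dog, mouse_rat):.3f}")
```

Each list position must refer to the same alignment block in both lineage pairs; otherwise the coefficient says nothing about regional effects.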
| Discussion |
|---|
This work demonstrates the value of a marsupial genome sequence as an outgroup in resolving key issues about the evolution of eutherian mammals. We have demonstrated with very high confidence that the rodents diverged before carnivores and primates. This analysis contributes to the development of a eutherian phylogeny based on whole genome sequences. Extension of our approach will provide an increasingly detailed and accurate picture of the evolutionary history and relationship between mammals, and a detailed understanding of the different molecular evolutionary processes and how they vary across different lineages. Our approach, using large-scale data, allows aggressive filtering of alignment positions influenced by processes other than point mutation and sites that have sequence ambiguities or missing data in the alignments. The scale of the resulting data set is large enough to dilute out regional effects and to give a strong indication of the central trend of the evolution of the entire genome. That scale further enables alternate approaches to measuring support for a tree topology. By using identically sized segments sampled from the genome alignments, it was possible to attain an estimate of the proportion of alignments among the sampled genomes that support the alternative trees.
An important aspect of our results is the finding that a minority of the 10-kb alignment blocks we analyzed indicated an alternative topology. Ignoring the weaker support for the carnivore-first topology (evidenced by typically smaller |Δ|, the absolute log-likelihood difference between the two topologies; fig. 2), this result demonstrates the importance of basing phylogenetic estimation on sequences sampled from different regions in the genome to avoid local effects, even when individual alignment blocks are relatively long. Local effects may result from unrecognized violation of a number of assumptions underlying alignment selection and analysis, such as orthology, base compositional stationarity, and absence of lineage-sorting effects. Large-scale genomic rearrangements may also translocate segments into regions subject to different mutagenic processes. Efforts to select "good" genes for phylogenetic analysis, as discussed by Kullberg et al. (2006), reduce the number of alignments and therefore may exacerbate local effects. We addressed the possible influence of base composition heterogeneity by recoding the sequences from bases to purines and pyrimidines (although we obtained the same phylogenetic estimates without recoding). We conclude that using many sequences for phylogenetic estimation is better than using only one or a small number of long sequences.
Many of the differences in analysis between our study and those undertaken previously are made possible by the volume of sequence data available in whole-genome comparisons. This increased volume allowed us to adopt conservative approaches (such as the nonparametric sign test), minimizing the number of assumptions in the methods we used, and conferring additional confidence in the results we have obtained. The volume of data also enables selective sampling of sequences to more closely match the assumptions of the evolutionary models employed, and to avoid reported pitfalls, such as the sensitivity of results to taxonomically biased indel occurrences (Misawa and Nei 2003). All molecular evolutionary studies remain sensitive to errors in sequence alignment and orthology determination. We have gone some way toward addressing the issue of sequence alignment quality by excluding any alignment column with an indel. The impact of alignment may be evaluated in future work by sampling aligned positions based on alignment quality scores.
We have not attempted here to directly minimize errors arising from incorrect orthology determinations, but we argue that our analysis is conservative. We would expect accuracy of synteny (and thus orthology) to increase with increasing sequence coverage of a genome, because of fewer errors in assembly. The impact of assembly errors on genome scale analyses may be a bias toward clustering genomes that are more extensively sequenced, because orthologous regions will have evolutionary distances less than or equal to those between paralogous regions. As human and mouse are the most densely sequenced genomes in our study, such a bias would favor the carnivore-first topology, which is not what we observed.
We have found that the rate of nucleotide substitution in the primate lineage has been ~11%–14% slower than the rates in the rodent and carnivore lineages, consistent with the ~9% slowdown estimated from fourfold degenerate sites of protein coding genes (Kumar and Subramanian 2002). Our estimates from recoded sequences should closely reflect the underlying true substitution rate differences, as they are less likely to be affected by compositional shifts between genomes. A slower rate of molecular evolution has previously been explained as resulting from a longer generation time (Graur 1993). However, we also show that the nucleotide substitution rates in the rodent and carnivore lineages have been very similar and that the eutherian lineages are ~30% slower than the marsupials, as judged from analysis applying the chicken as outgroup. The rate differences between these taxa are inconsistent with a global molecular clock, but they are also inconsistent with a generation time effect, calling into question the role of generation time in the primate slowdown. A more likely explanation is that there are differences in the biochemical processes associated with DNA metabolism between the lineages. Attributes of DNA metabolism have been demonstrated to have the capacity to evolve, and there are known differences between the sampled lineages. For instance, the nucleotide excision repair (NER) system deals with a diverse spectrum of DNA lesions via one of two main pathways: global genomic repair or transcription coupled repair. An important lesion type targeted by NER is cyclobutane pyrimidine dimers (CPDs). It has been demonstrated that rodents lack the ability to repair CPDs via the global genomic repair pathway (Tang et al. 2000). As a slow rate of substitutions affecting TpT dinucleotides has been attributed to the influence of NER (Arndt, Hwa, and Petrov 2005; H. Lindsay, P. Maxwell and G. A. Huttley, unpublished data), the global genomic operation of this repair process could contribute to the slower rate of primate substitutions. It is also the case that repair of CPDs and pyrimidine (6–4) pyrimidone photoproducts differs between eutherian mammals and marsupials. As the latter possess a highly efficient photolyase (Schul et al. 2002), this seems an unlikely contributor to the elevated substitution rate in marsupials.
Figure 4 shows that the evolutionary rates of some sequence blocks are substantially greater than others, and that these differences in rate are correlated across lineages. Local differences in evolutionary rate have been known for a long time and may be caused by a number of factors, including levels of transcription (Brudno et al. 2003), rates of recombination and/or gene conversion (Fullerton, Bernardo Carvalho, and Clark 2001), or differences in replication timing coupled with shifts in the composition and size of the dNTP pools (Wolfe, Sharp, and Li 1989; Martomo and Mathews 2002).
Despite these rate correlations across lineages, it is evident from figure 3 that some sequence blocks have evolved substantially faster in primates, even though there is an overall primate slowdown. As with phylogenetic estimation, this finding demonstrates the limitations of comparisons of relative evolutionary rates based on alignments from one or a small number of regions. We conclude that alignments from many regions are needed to obtain reliable estimates of molecular evolutionary rates between lineages.
The translocation of a 46.85 Megabase autosomal region from the mammalian common ancestor onto the X chromosome prior to the eutherian radiation theoretically provides a large data set from which to address the question of male-biased evolution. Our analyses of this region are consistent with a reduced average rate of substitution affecting X-linked blocks arising from male-biased evolution in eutherians. However, this data set is confounded by variation in rates between species and by the fact that the X-chromosome has lower gene density than the autosomes (Ross et al. 2005). Because the male bias influence cannot be disentangled from the rate effect on a per alignment basis, the mean of this distribution cannot be used to calculate the magnitude of male bias. We do not estimate a distribution of the male-biased evolution statistic (
µ) from individual alignments, because these confounding effects generate invalid negative or infinite values for many of the individual alignments.
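One way to see how negative or infinite per-alignment values can arise is to invert a commonly used expectation for the X-to-autosome rate ratio R under male-biased mutation, R = (2/3)(2 + alpha)/(1 + alpha), which gives alpha = (4 - 3R)/(3R - 2). The symbol `alpha` for the male-to-female mutation rate ratio, the function name, and the example ratios are assumptions introduced for this sketch and are not taken from the analysis above.

```python
def alpha_from_xa(xa_ratio):
    """Male-to-female mutation rate ratio implied by an X/A rate ratio.

    Inverts R = (2/3) * (2 + alpha) / (1 + alpha).
    The result is only biologically meaningful for 2/3 < R <= 4/3.
    """
    denom = 3 * xa_ratio - 2
    if denom == 0:
        return float("inf")          # R = 2/3 implies an unbounded male bias
    return (4 - 3 * xa_ratio) / denom


if __name__ == "__main__":
    for r in (1.0, 0.85, 0.70, 0.60):
        print(f"X/A = {r:.2f} -> alpha = {alpha_from_xa(r):.2f}")
```

As R approaches 2/3 from above the implied ratio grows without bound, and any R below 2/3 gives a negative value, so modest regional rate variation in a single alignment is enough to push the estimate outside the valid range.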
Estimates of the ratio of mutation rates in male and female lineages, obtained from comparing substitution rates on autosomes and sex chromosomes, range from 1.53 to 7.58 (Berlin et al. 2006; Goetting-Minesky and Makova 2006). Male-biased evolution may be due to any factor affecting mutation rate that differs systematically between male and female germlines, including the number of rounds of DNA replication and the pattern and extent of DNA methylation (Easteal, Collet, and Betty 1995; Huttley et al. 2000). From our results we cannot distinguish between these different causes. Our results do, however, confirm the observations of Berlin et al. (2006) that regional variation is a significant confounder of estimates of male bias. This is critically important in many studies that rely on gametologous genes (nonrecombining orthologs that spend differing amounts of time in the male and female germline), as this is a highly restricted data set and limits the amount of data that can be used.
The present study underlines the value of the opossum genome as a reference for addressing important issues about the evolution of eutherian mammals. Our results address a long-standing issue in eutherian systematics, providing robust support for rodents as an outgroup to carnivores and primates. We have reliably estimated that the rate of nucleotide substitution has been ~14% slower in the primate lineage than in the rodent and carnivore lineages, but that there has been no appreciable rate difference between rodent and carnivore lineages, suggesting that generation time is not an important determinant of substitution rate differences. The acceleration of opossum relative to all eutherians also cannot be accounted for by a generation time effect. Finally, we have confirmed the general finding of greater substitution rate in male lineages, but the limitation of the data prevents us from extensive exploration of this issue.
| Acknowledgements |
|---|
We thank the Opossum Genome Sequencing Consortium and the Broad Institute of Harvard and MIT for providing the Monodelphis domestica genome sequence, and Javier Herrero for performing the sequence alignments. This project was supported by a time allocation from the Australian Partnership for Advanced Computing, Australian Research Council grants DP0450066 and LP0347613, and National Health and Medical Research Council grant 366739.
| Footnotes |
|---|
Sudhir Kumar, Associate Editor
| References |
|---|
Abrusan G, Krambeck H-J. The distribution of L1 and Alu retroelements in relation to GC content on human sex chromosomes is consistent with the ectopic recombination model. J Mol Evol (2006) 63:484–492.
Arnason U, Gullberg A, Gretarsdottir S, Ursing B, Janke A. The mitochondrial genome of the sperm whale and a new molecular reference for estimating eutherian divergence dates. J Mol Evol (2000) 50:569–578.
Arndt PF, Hwa T, Petrov DA. Substantial regional variation in substitution rates in the human genome: importance of GC content, gene density, and telomere-specific effects. J Mol Evol (2005) 60:748–763.
Bejerano G, Lowe CB, Ahituv N, King B, Siepel A, Salama SR, Rubin EM, Kent WJ, Haussler D. A distal enhancer and an ultraconserved exon are derived from a novel retroposon. Nature (2006) 441:87–90.
Benton MJ. Early origins of modern birds and mammals: molecules vs. morphology. Bioessays (1999) 21:1043–1051.
Berlin S, Brandstrom M, Backstrom N, Axelsson E, Smith NGC, Ellegren H. Substitution rate heterogeneity and the male mutation bias. J Mol Evol (2006) 62:226–233.
Bininda-Emonds ORP, Cardillo M, Jones KE, MacPhee RDE, Beck RMD, Grenyer R, Price SA, Vos RA, Gittleman JL, Purvis A. The delayed rise of present-day mammals. Nature (2007) 446:507–512.
Brudno M, Do CB, Cooper GM, Kim MF, Davydov E, Green ED, Sidow A, Batzoglou S. LAGAN and Multi-LAGAN: efficient tools for large-scale multiple alignment of genomic DNA. Genome Res (2003) 13:721–731.
Butterfield A, Vedagiri V, Lang E, Lawrence C, Wakefield MJ, Isaev I, Huttley GA. PyEvolve: a toolkit for statistical modelling of molecular evolution. BMC Bioinformatics (2004) 5:1.
Cantrell MA, Filanoski BJ, Ingermann AR, Olsson K, DiLuglio N, Lister Z, Wichman HA. An ancient retrovirus-like element contains hot spots for SINE insertion. Genetics (2001) 158:769–777.
Chang JT. Full reconstruction of Markov models on evolutionary trees: Identifiability and consistency. Math Biosci (1996) 137:51–73.
Cooper GM, Brudno M, Stone EA, Dubchak I, Batzoglou S, Sidow A. Characterization of evolutionary rates and constraints in three Mammalian genomes. Genome Res (2004) 14:539–548.
Easteal S. A mammalian molecular clock? BioEssays (1992) 14:415–419.
Easteal S. The pattern of mammalian evolution and the relative rate of molecular evolution. Genetics (1990) 124:165–173.
Easteal S. Rate constancy of globin gene evolution in placental mammals. Proc Natl Acad Sci USA (1988) 85:7622–7626.
Easteal S. Generation time and the rate of molecular evolution. Mol Biol Evol (1985) 2:450–453.[ISI][Medline]
Easteal S. Molecular evidence for the early divergence of placental mammals. BioEssays (1999) 21:1052–1058. discussion 1059.[CrossRef][ISI][Medline]
Easteal S, Collet CC, Betty DJ. The Mammalian Molecular Clock. (1995) Springer-Verlag & R.D. Landes, Austin, Texas.
Fullerton SM, Bernardo Carvalho A, Clark AG. Local rates of recombination are positively correlated with GC content in the human genome. Mol Biol Evol (2001) 18:1139–1142.
Goetting-Minesky MP, Makova KD. Mammalian male mutation bias: impacts of generation time and regional variation in substitution rates. J Mol Evol (2006) 63:537–544.[CrossRef][ISI][Medline]
Goldman N. Simple diagnostic statistical tests of models for DNA substitution. J Mol Evol (1993) 37:650–661.[ISI][Medline]
Graur D. Towards a molecular resolution of the ordinal phylogeny of the eutherian mammals. FEBS Lett (1993) 325:152–159.[CrossRef][ISI][Medline]
Harvey PH, Pagel MD. The Comparative Method in Evolutionary Biology (1991) Oxford: Oxford University Press.
Ho SY, Jermiin L. Tracing the decay of the historical signal in biological sequence data. Syst Biol (2004) 53:623–637.[CrossRef][ISI][Medline]
Hu Y, Meng J, Wang Y, Li C. Large Mesozoic mammals fed on young dinosaurs. Nature (2005) 433:149–152.[CrossRef][Medline]
Huttley GA, Jakobsen IB, Wilson SR, Easteal S. How important is DNA replication for mutagenesis? Mol Biol Evol (2000) 17:929–937.
Huttley GA, MacRae AF, Clegg MT. Molecular evolution of the Ac/Ds transposable-element family in pearl millet and other grasses. Genetics (1995) 139:1411–1419.[Abstract]
Kamal M, Xie X, Lander ES. A large family of ancient repeat elements in the human genome is under strong selection. Proc Natl Acad Sci USA (2006) 103:2740–2745.
Knight RD, Freeland SJ, Landweber LF. A simple model based on mutation and selection explains trends in codon and amino-acid usage and GC composition within and across genomes. Genome Biol (2001) 2:RESEARCH0010.[Medline]
Kriegs JO, Churakov G, Kiefmann M, Jordan U, Brosius J, Schmitz J. Retroposed elements as archives for the evolutionary history of placental mammals. PLoS Biol (2006) 4:e91.[CrossRef][Medline]
Kullberg M, Nilsson MA, Arnason U, Harley EH, Janke A. Housekeeping genes for phylogenetic analysis of eutherian relationships. Mol Biol Evol (2006) 23:1493–1503.
Kumar S, Subramanian S. Mutation rates in mammalian genomes. Proc Natl Acad Sci USA (2002) 99:803–808.
Langley CH, Montgomery E, Hudson R, Kaplan N, Charlesworth B. On the role of unequal exchange in the containment of transposable element copy number. Genet Res (1988) 52:223–235.[ISI][Medline]
Martomo SA, Mathews CK. Effects of biological DNA precursor pool asymmetry upon accuracy of DNA replication in vitro. Mutat Res (2002) 499:197–211.[ISI][Medline]
Mikkelsen TS, Wakefield MJ, Aken B, et al. Genome of the marsupial Monodelphis domestica reveals innovation in non-coding sequences. Nature (2007) 446. doi:10.1038/nature05805.
Misawa K, Nei M. Reanalysis of Murphy et al.'s data gives various mammalian phylogenies and suggests overcredibility of Bayesian trees. J Mol Evol (2003) 57(Suppl 1):S290–296.[CrossRef][ISI][Medline]
Murphy WJ, Eizirik E, O'Brien SJ, et al. Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science (2001) 294:2348–2351.[CrossRef][ISI][Medline]
Muse SV, Weir BS. Testing for equality of evolutionary rates. Genetics (1992) 132:269–276.[Abstract]
Nishihara H, Hasegawa M, Okada N. Pegasoferae, an unexpected mammalian clade revealed by tracking ancient retroposon insertions. Proc Natl Acad Sci USA (2006) 103:9929–9934.
Ota R, Waddell PJ, Hasegawa M, Shimodaira H, Kishino H. Appropriate likelihood ratio tests and marginal distributions for evolutionary tree models with constraints on parameters. Mol Biol Evol (2000) 17:798–803.
Penny D, Hasegawa M, Waddell PJ, Hendy MD. Mammalian evolution: timing and implications from using the LogDeterminant transform for proteins of differing amino acid composition. Syst Biol (1999) 48:76–93.[CrossRef][ISI][Medline]
Phillips MJ, Delsuc F, Penny D. Genome-scale phylogeny and the detection of systematic biases. Mol Biol Evol (2004) 21:1455–1458.
Rokas A, Carroll SB. More genes or more taxa? The relative contribution of gene number and taxon number to phylogenetic accuracy. Mol Biol Evol (2005) 22:1337–1344.
Rosenberg MS, Kumar S. Incomplete taxon sampling is not a problem for phylogenetic inference. Proc Natl Acad Sci USA (2001) 98:10751–10756.
Ross MT, Grafham DV, Coffey AJ. (279 co-authors). The DNA sequence of the human X chromosome. Nature (2005) 434:325–337.[CrossRef][Medline]
Schul W, Jans J, Rijksen YMA. (11 co-authors). Enhanced repair of cyclobutane pyrimidine dimers and improved UV resistance in photolyase transgenic mice. EMBO J (2002) 21:4719–4729.[CrossRef][ISI][Medline]
Smith TF, Waterman MS. Identification of common molecular subsequences. J Mol Biol (1981) 147:195–197.[CrossRef][ISI][Medline]
Tang JY, Hwang BJ, Ford JM, Hanawalt PC, Chu G. Xeroderma pigmentosum p48 gene enhances global genomic repair and suppresses UV-induced mutagenesis. Mol Cell (2000) 5:737–744.[CrossRef][ISI][Medline]
Waddell PJ, Kishino H, Ota R. A phylogenetic foundation for comparative mammalian genomics. Genome Inform (2001) 12:141–154.[Medline]
Webster MT, Smith NGC, Lercher MJ, Ellegren H. Gene expression, synteny, and local similarity in human noncoding mutation rates. Mol Biol Evol (2004) 21:1820–1830.
Wolfe KH, Sharp PM, Li WH. Mutation rates differ among regions of the mammalian genome. Nature (1989) 337:283–285.[CrossRef][Medline]
Wu CI, Li WH. Evidence for higher rates of nucleotide substitution in rodents than in man. Proc Natl Acad Sci USA (1985) 82:1741–1745.
Yang Z. How often do wrong models produce better phylogenies? Mol Biol Evol (1997) 14:105–108.[ISI][Medline]
Yang Z. Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods. J Mol Evol (1994) 39:306–314.[CrossRef][ISI][Medline]




) and their significance are indicated on each panel.