Molecular Biology and Evolution 18:2298-2305 (2001)
© 2001 Society for Molecular Biology and Evolution
Exploring the Demographic History of DNA Sequences Using the Generalized Skyline Plot
Department of Zoology, University of Oxford
| Abstract |
|---|
We present an intuitive visual framework, the generalized skyline plot, to explore the demographic history of sampled DNA sequences. This approach is based on a genealogy inferred from the sequences and provides a nonparametric estimate of effective population size through time. In contrast to previous related procedures, the generalized skyline plot is more applicable to cases where the underlying tree is not fully resolved and the data is not highly variable. This is achieved by the grouping of adjacent coalescent intervals. We employ a small-sample Akaike information criterion to objectively choose the optimal grouping strategy. We investigate the performance of our approach using simulation and subsequently apply it to HIV-1 sequences from central Africa and mtDNA sequences from red pandas.
| Introduction |
|---|
Contemporary DNA sequences contain information about the demographic history of the population from which they were sampled. As a result, the inference of demographic parameters from genetic data has become an important topic in statistical genetics, with applications in fields as diverse as anthropology, conservation biology, epidemiology, and virology (Harvey et al. 1996).
Methods for estimating demographic history from gene sequences are mostly based on coalescent theory (Kingman 1982a, 1982b; Hudson 1990; Nordborg 2001). They usually rely on a simple parametric model $N(t)$ that describes effective population size through time. Time $t$ is zero at present and increases into the past; hence $N(0)$ is the effective population size at present. Two simple demographic models are frequently used: constant population size, $N(t) = N(0)$, with one parameter $N(0)$, and exponential growth, $N(t) = N(0)e^{-rt}$, with two parameters $r$ and $N(0)$. Often, however, there is no prior reason to assume a specific model of demographic history for the data in question. Moreover, the available models may be too simplistic. Hence, nonparametric and model selection tools can play a useful role in the inference of population history from gene sequence data.
Nee et al. (1995) proposed the lineage through time (LTT) plot to graphically investigate the demographic history of gene sequences. LTT plots display the rate of coalescence through time in a genealogy which has been reconstructed from an alignment of homologous sequences. Pybus, Rambaut, and Harvey (2000) described a simple transformation that converts this rate of coalescence into a plot of estimated effective population size against time, which we call here the classic skyline plot. The LTT and classic skyline plot approaches are closely related and both assume that a fully resolved phylogeny with reliable estimates of divergence times is available. As a consequence, these approaches can only be applied to data that exhibit a strong phylogenetic signal and are not appropriate for alignments which contain identical sequences. In addition, neither method provides an assessment of coalescent error, that is, the error that results from the randomness inherent in the coalescent process.
In this paper we introduce the generalized skyline plot, a simple framework for exploring the demographic signal in a sample of DNA sequences. This method extends the classic skyline plot by allowing multiple coalescent events (for which little divergence time information is available) to be grouped together. The classic plot is a special case of the generalized plot, which arises when no coalescent events are grouped. The generalized plot can be applied to data sets which contain identical sequences and has the added benefit of smoothing the classic plot, which typically displays stochastic noise. We show that the most appropriate amount of smoothing can be determined by using a penalized likelihood approach. Furthermore, we derive the skyline plot as a simple method of moments estimator based on standard coalescence distributions, which enables us to compute estimates of the coalescent error. To illustrate our approach, we analyze HIV-1 sequences from central Africa and investigate the demographic history of red pandas using mtDNA sequences.
| Methods |
|---|
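The coalescent describes the relationship between the shape of an intrapopulation genealogy (representing the ancestry of randomly sampled, nonrecombining, neutrally evolving sequences) and the demographic history of the sampled population (Kingman 1982a, 1982b). In its simplest form, with a constant effective population size $N$, a sample of $n$ lineages coalesces at rate $\gamma_n = n(n-1)/(2N)$. The rate $\gamma_n$ changes after each coalescent event. Thus, the waiting time $w_n$ until the next coalescent event is exponentially distributed according to

$$f(w_n) = \gamma_n e^{-\gamma_n w_n}, \tag{1}$$

with expectation

$$E(w_n) = \frac{1}{\gamma_n} = \frac{2N}{n(n-1)} \tag{2}$$

and variance

$$\mathrm{Var}(w_n) = \frac{1}{\gamma_n^2} = \frac{4N^2}{n^2(n-1)^2}. \tag{3}$$

The accumulated waiting time $w_{n,k} = \sum_{i=1}^{k} w_{n-i+1}$ until $k$ coalescent events have occurred is the sum of $k$ different exponential variables ($\gamma_i \neq \gamma_j$ for $i \neq j$) and thus follows a hypo-exponential distribution (e.g., Ross 1997)

$$f(w_{n,k}) = \sum_{i=1}^{k} c_i \, \gamma_{n-i+1} \, e^{-\gamma_{n-i+1} w_{n,k}}, \tag{4}$$

where $c_i = \prod_{j=1,\, j \neq i}^{k} \gamma_{n-j+1}/(\gamma_{n-j+1} - \gamma_{n-i+1})$. This distribution has expectation

$$E(w_{n,k}) = \sum_{i=1}^{k} \frac{1}{\gamma_{n-i+1}} = \frac{2Nk}{n(n-k)} \tag{5}$$

and variance

$$\mathrm{Var}(w_{n,k}) = \sum_{i=1}^{k} \frac{1}{\gamma_{n-i+1}^2} = \sum_{i=1}^{k} \frac{4N^2}{(n-i+1)^2 (n-i)^2}. \tag{6}$$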
Deterministic changes of $N$ through time can be introduced in the coalescent by a nonlinear scaling factor (Hudson 1990; Griffiths and Tavaré 1994; Donnelly and Tavaré 1995; Kuhner, Yamato, and Felsenstein 1998). If selection, recombination, or noncontemporary sequences are present then further adjustments to the coalescent are necessary (e.g., Rodrigo and Felsenstein 1999; Nordborg 2001).
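The simplification of the expected accumulated waiting time can be checked numerically. The following is a minimal sketch, assuming a constant population size $N$ and a coalescence rate of $n(n-1)/(2N)$ for $n$ lineages; the function names are illustrative only and do not come from any published software. It sums the expected waiting times of $k$ successive coalescent events and compares the total with the closed form $2Nk/[n(n-k)]$, using exact rational arithmetic.

```python
from fractions import Fraction


def coalescent_rate(n, N):
    """Rate of coalescence for n lineages in a population of constant size N."""
    return Fraction(n * (n - 1), 2) / N


def expected_accumulated_wait(n, k, N):
    """Sum of the expected waiting times of k successive coalescent events,
    starting with n lineages (term-by-term form of the expectation)."""
    return sum(1 / coalescent_rate(n - i + 1, N) for i in range(1, k + 1))


def closed_form(n, k, N):
    """Simplified expectation 2Nk / [n(n - k)]."""
    return Fraction(2 * k, n * (n - k)) * N


# Hypothetical settings: N = 0.1 and a few (n, k) combinations.
N = Fraction(1, 10)
for n, k in [(5, 1), (10, 3), (20, 19)]:
    assert expected_accumulated_wait(n, k, N) == closed_form(n, k, N)
```

The agreement follows from the telescoping identity $1/[j(j-1)] = 1/(j-1) - 1/j$.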
The Classic Skyline Plot
Suppose that we have a fully resolved genealogy $G$ with $m$ tips, estimated from a given sequence alignment in such a way that the internal nodes of $G$ are dated according to a given time scale. This requires a molecular clock or, more generally, a model of rate correlation among different branches in the tree (Gillespie 1991; Sanderson 1997; Thorne, Kishino, and Painter 1998; Huelsenbeck, Larget, and Swofford 2000). The genealogy $G$ defines $m - 1$ ordered internode intervals $I_m, I_{m-1}, \ldots, I_2$, where the subscript indicates the number of lineages present during each interval. The length of interval $I_n$ is denoted by $\tau_n$. A simple demographic model can then be constructed as follows. During each interval $I_n$ we assume that population size is a local constant, $M_n$, but between different intervals the population size is allowed to change. Hence, for a set of $m - 1$ intervals, we approximate the demographic history $N(t)$ by a piecewise constant function with $m - 1$ independent variables $M_m, M_{m-1}, \ldots, M_2$.
A method of moments estimator for the population size during each interval $I_n$ is then constructed by setting the expected waiting time (eq. 2) for the next coalescent event equal to the observed interval length $\tau_n$, and solving the resulting equation for $M_n$. This gives the classic skyline plot estimate

$$\hat{M}_n = \tau_n \, \frac{n(n-1)}{2}. \tag{7}$$
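The classic estimator is simple enough to sketch in a few lines. The code below is an illustration rather than the authors' implementation: it assumes the internode interval lengths are supplied in order from the tips (the interval with $m$ lineages) to the root (the interval with 2 lineages), and multiplies each length by $n(n-1)/2$. The function name and the example lengths are hypothetical.

```python
def classic_skyline(interval_lengths):
    """Classic skyline estimates from internode interval lengths.

    interval_lengths[0] is the interval with m lineages (nearest the tips);
    interval_lengths[-1] is the interval with 2 lineages (nearest the root).
    Returns a list of (n, estimate) pairs.
    """
    m = len(interval_lengths) + 1
    estimates = []
    for offset, tau in enumerate(interval_lengths):
        n = m - offset  # number of lineages present during this interval
        estimates.append((n, tau * n * (n - 1) / 2))
    return estimates


# Hypothetical interval lengths for a genealogy with m = 4 tips.
for n, estimate in classic_skyline([0.01, 0.02, 0.06]):
    print(n, estimate)
```

Each interval yields its own estimate, so a zero-length interval produces an estimate of zero; this is the behavior that the pooling described in the next section is designed to avoid.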
The Generalized Skyline Plot
Generally, we expect the accuracy of the observed intervals $\tau_n$ (obtained from a reconstructed genealogy) to be adversely affected by limited genetic variation. The number of substitutions occurring in an internode interval is often modeled by a Poisson distribution. Consequently, the observed number of substitutions is proportional to the time elapsed when either the substitution rate or the internode interval is large. However, this approximation breaks down when the product of interval length and substitution rate is small. Under such circumstances it would be beneficial to pool small intervals together so that all intervals are large enough for time to be proportional to the number of substitutions. Zero-length intervals always occur if the alignment contains identical sequences, and also arise when the branch lengths of a genealogy are estimated using maximum likelihood under a molecular clock. The disadvantage of pooling intervals is that some (but not all) of the temporal structure in the data is lost. When the sequences contain very little or no genetic variation, a Bayesian approach employing prior distributions for the substitution and coalescent parameters is required (Tavaré et al. 1997). However, in these cases a single-tree estimator such as the skyline plot is inappropriate.
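To see why short intervals carry little information, consider the Poisson model mentioned above. The sketch below is only an illustration under simplifying assumptions: a single lineage spanning the interval, a constant substitution rate, and an interval length measured in time units, so that the expected number of substitutions is the product of rate, interval length, and number of sites. All parameter names and values are hypothetical.

```python
import math


def prob_no_substitutions(rate_per_site, interval_length, n_sites):
    """Poisson probability that no substitution is observed along a single
    lineage spanning an interval of the given length."""
    expected = rate_per_site * interval_length * n_sites
    return math.exp(-expected)


# Hypothetical example: rate 1 (time measured in substitutions per site),
# an interval of length 0.001, and an alignment of 250 sites.
p = prob_no_substitutions(1.0, 0.001, 250)
print(f"P(no substitutions) = {p:.2f}")
```

With these illustrative values the expected number of substitutions is 0.25, so the interval is more likely than not to appear as having zero length in the reconstructed tree.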
Allowing pooled intervals in the skyline plot leads to the derivation of the generalized skyline plot. Consider a composite time interval $I_{n,k}$, where $n$ denotes the number of lineages at the start of the interval and $k$ is the total number of coalescent events taking place during this interval. $I_{n,k}$ has observed length $\tau_{n,k} = \tau_n + \tau_{n-1} + \cdots + \tau_{n-k+1}$. If we assume a locally constant population size $M_{n,k}$ during this composite interval, we can construct a method of moments estimator for $M_{n,k}$ using equation (5), and arrive at

$$\hat{M}_{n,k} = \tau_{n,k} \, \frac{n(n-k)}{2k}. \tag{8}$$
Note that the generalized skyline plot (eq. 8) contains the classic skyline plot (eq. 7) as a special case when each interval contains only a single coalescent event ($k = 1$). If there is only a single composite interval $I_{m,m-1}$ that contains all $m - 1$ coalescent events in the genealogy, then equation (8) collapses to $\hat{M}_{m,m-1} = \tau_{m,m-1} \, m/[2(m-1)]$. This is the standard population genetic relationship between effective population size and the time to the most recent common ancestor of a sample of size $m$.
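The two special cases just described can be verified with a short sketch of the generalized estimator. The function below is illustrative (its name and argument names are assumptions, not taken from the paper's software): it takes the total length of a composite interval, the number of lineages $n$ at its start, and the number of coalescent events $k$ it contains, and returns the length multiplied by $n(n-k)/(2k)$.

```python
def generalized_skyline_estimate(total_length, n, k):
    """Method of moments estimate of population size for a composite interval.

    total_length: summed length of the pooled internode intervals
    n: number of lineages at the start of the composite interval
    k: number of coalescent events it contains (1 <= k <= n - 1)
    """
    if not 1 <= k <= n - 1:
        raise ValueError("k must satisfy 1 <= k <= n - 1")
    return total_length * n * (n - k) / (2 * k)


# k = 1 recovers the classic estimate, length * n(n - 1) / 2.
print(generalized_skyline_estimate(0.02, n=3, k=1))

# Pooling all m - 1 events gives length * m / [2(m - 1)].
print(generalized_skyline_estimate(1.0, n=5, k=4))
```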
Grouping Intervals and Model Selection
In order to choose which intervals in genealogy $G$ should be pooled we adopt the following convention. First, the set of standard internode intervals $I_m, I_{m-1}, \ldots, I_2$ is determined from $G$. Next, if an interval is smaller than a certain threshold $\epsilon$, then the interval is considered small. Proceeding from $I_m$ to $I_2$, each small interval is pooled with the neighboring interval closer to the root. If the neighboring interval is also small, then pooling continues until the composite interval is larger than $\epsilon$. Note that this approach prevents the occurrence of zero-length intervals at present. Thus $\epsilon$ determines how much temporal structure in the data is retained and hence controls the degree to which the skyline plot is smoothed. The choice of $\epsilon$ is guided by two opposing objectives. On the one hand, $\epsilon$ should be large enough to remove the noise in the data which arises from the randomness of the mutational process. On the other hand, $\epsilon$ should be small enough to preserve the actual demographic signal in the data.
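The pooling convention can be sketched as a single pass over the intervals from the tips to the root. The code below is one possible reading of the rule, not the authors' implementation. It accumulates intervals until the running total reaches the threshold and then closes the composite interval. One case is not covered by the description above: a small remainder left over at the root. Here it is merged into the preceding composite interval, which is an assumption made for this sketch.

```python
def group_intervals(interval_lengths, epsilon):
    """Pool small internode intervals into composite intervals.

    interval_lengths runs from the interval with m lineages (tips) to the
    interval with 2 lineages (root). Returns a list of (n, k, length) triples:
    lineages at the start, coalescent events pooled, and total length.
    """
    m = len(interval_lengths) + 1
    groups = []
    n_start, k, total = m, 0, 0.0
    for offset, tau in enumerate(interval_lengths):
        k += 1
        total += tau
        if total >= epsilon:  # composite interval is no longer "small"
            groups.append((n_start, k, total))
            n_start, k, total = m - offset - 1, 0, 0.0
    if k > 0:
        # Assumption: a small remainder at the root joins the previous group.
        if groups:
            n_prev, k_prev, total_prev = groups.pop()
            groups.append((n_prev, k_prev + k, total_prev + total))
        else:
            groups.append((n_start, k, total))
    return groups


# Hypothetical intervals for a genealogy with m = 6 tips.
lengths = [0.0, 0.001, 0.01, 0.0, 0.05]
print(group_intervals(lengths, epsilon=0.005))
print(group_intervals(lengths, epsilon=0.0))  # no pooling: the classic plot
```

With a threshold of zero nothing is pooled, so the classic plot is recovered; a sufficiently large threshold pools every interval into a single composite interval.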
How should the most appropriate value of $\epsilon$ be chosen? Visual inspection of skyline plots calculated under various $\epsilon$ values is helpful, but an objective approach based on statistical model selection would be preferable. Here we outline one possible approach which penalizes skyline plots that overfit the data. As skyline plots represent specific hypotheses of demographic history, we can calculate the likelihood of a skyline plot using standard approaches, given the observed internode interval lengths (Griffiths and Tavaré 1994; Pybus, Rambaut, and Harvey 2000). For a skyline plot derived from a genealogy with $m$ sequences the log-likelihood $\log L$ reduces to

$$\log L = \sum_{n=2}^{m} \left[ \log \frac{n(n-1)}{2\hat{M}_n} - \frac{n(n-1)\,\tau_n}{2\hat{M}_n} \right]. \tag{9}$$
Note that the estimated population size $\hat{M}_n$ for any subinterval in a composite interval $I_{n,k}$ is $\hat{M}_{n,k}$. Now let $K$ be the number of inferred parameters (= number of composite intervals in the skyline plot) and let $S = m - 1$ be the sample size (= number of coalescent events in the genealogy). We can compare skyline plots with different $\epsilon$ values by penalizing the log-likelihood of each plot using the AICc correction (Hurvich and Tsai 1989; Burnham and Anderson 1998)

$$\log L_{\mathrm{AICc}} = \log L - \frac{KS}{S - K - 1}. \tag{10}$$

Thus, we can use equation (10) to obtain an optimal generalized skyline plot by choosing the value of $\epsilon$ which maximizes $\log L_{\mathrm{AICc}}$.
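The model selection step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: each subinterval with $n$ lineages contributes the log density of an exponential waiting time with rate $n(n-1)/(2\hat{M})$, where $\hat{M}$ is the estimate for the composite interval containing it, and the penalty is the standard small-sample AIC correction written as $KS/(S-K-1)$. The function names are hypothetical, and returning negative infinity when $S - K - 1 \le 0$ is a choice made here for convenience.

```python
import math


def skyline_log_likelihood(interval_lengths, group_sizes):
    """Log-likelihood of a generalized skyline plot.

    interval_lengths: lengths from the tips (m lineages) to the root (2 lineages)
    group_sizes: number of coalescent events in each composite interval,
                 in the same order; must sum to len(interval_lengths)
    """
    m = len(interval_lengths) + 1
    if sum(group_sizes) != m - 1:
        raise ValueError("group sizes must cover every interval exactly once")
    log_l, start = 0.0, 0
    for k in group_sizes:
        n = m - start
        lengths = interval_lengths[start:start + k]
        m_hat = sum(lengths) * n * (n - k) / (2 * k)  # generalized estimate
        if m_hat <= 0:
            raise ValueError("composite interval has zero length")
        for i, tau in enumerate(lengths):
            pairs = (n - i) * (n - i - 1) / 2  # lineage pairs in this subinterval
            log_l += math.log(pairs / m_hat) - pairs * tau / m_hat
        start += k
    return log_l


def aicc_log_likelihood(log_l, n_params, n_events):
    """Penalized log-likelihood log L - K*S / (S - K - 1)."""
    if n_events - n_params - 1 <= 0:
        return float("-inf")  # assumption: correction undefined, treat as worst
    return log_l - n_params * n_events / (n_events - n_params - 1)


lengths = [0.5, 0.25]  # hypothetical intervals with 3 and 2 lineages
print(skyline_log_likelihood(lengths, [1, 1]))  # no pooling, K = 2
print(skyline_log_likelihood(lengths, [2]))     # fully pooled, K = 1
```

In practice one would evaluate the penalized value for a range of thresholds and keep the grouping with the largest result. Note that the correction is undefined when $K \geq S - 1$, which includes the classic plot ($K = S$).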
Statistical Properties and Simulations
Here, we investigate the statistical properties of the skyline plot and study its performance using sequence data simulated under known demographic scenarios.
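First, we analytically calculate the coalescent variance $\sigma_C^2$ of the skyline plot. For the classic skyline plot we use equation (3) and obtain

$$\sigma_C^2(\hat{M}_n) = \left[ \frac{n(n-1)}{2} \right]^2 \mathrm{Var}(w_n) = M_n^2. \tag{11}$$

The coalescent variance for the generalized skyline plot can be computed similarly,

$$\sigma_C^2(\hat{M}_{n,k}) = M_{n,k}^2 \left[ \frac{n(n-k)}{k} \right]^2 \sum_{i=1}^{k} \frac{1}{(n-i+1)^2 (n-i)^2}. \tag{12}$$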
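The effect of pooling on the coalescent variance can be illustrated numerically. The sketch below assumes the variance of the estimate is the squared scaling factor $[n(n-k)/(2k)]^2$ times the sum of the variances of the $k$ exponential waiting times, which simplifies to the expression coded here; the function name and the example values are hypothetical.

```python
def coalescent_variance(pop_size, n, k=1):
    """Coalescent variance of the skyline estimate for a composite interval
    that starts with n lineages and contains k coalescent events, given the
    true (locally constant) population size pop_size."""
    scale = (n * (n - k) / k) ** 2
    total = sum(1.0 / ((n - i + 1) ** 2 * (n - i) ** 2) for i in range(1, k + 1))
    return pop_size ** 2 * scale * total


# Hypothetical example: population size 0.1, interval starting with 10 lineages.
for k in (1, 3, 5, 9):
    print(k, coalescent_variance(0.1, n=10, k=k))
```

For $k = 1$ the variance equals the squared population size, and it decreases as more coalescent events are pooled.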
To investigate the bias of the skyline plot we conducted a small simulation study. For various settings of $m$ and $k$ (see table 1), simulations were performed as follows: (1) 1,000 genealogies with $m$ tips were simulated using the demographic model $N(t) = 0.1$; (2) the first $k$ internode intervals were grouped together and the skyline plot estimate $\hat{M}_{m,k}$ was calculated using equation (8) for each of the 1,000 simulated gene trees; and (3) the expectation $E(\hat{M}_{m,k})$ and the bias $b(\hat{M}_{m,k}) = E(\hat{M}_{m,k}) - M$ were computed along with the observed variance $\mathrm{var}(\hat{M}_{m,k})$ and the theoretical variance $\sigma_C^2(\hat{M}_{m,k})$.
The results are summarized in table 1. They indicate that the generalized skyline plot is an unbiased estimator of the effective population size during an interval $I_{n,k}$ (when the coalescent intervals $\tau_n$ are known without error). As expected from the earlier analytical results, the variance of this estimate is large but declines quickly when intervals are pooled ($k > 1$). Note that the skyline plot (and the above simulation) assumes that the effective population size is locally constant during an interval. If the population size changes within an interval then the skyline plot (as a piecewise constant estimator of $N(t)$) is, by definition, biased. However, in this case the classic skyline plot provides an estimate of the harmonic mean of $N(t)$ during each interval (Pybus, Rambaut, and Harvey 2000).
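A simulation in the spirit of the one described above can be sketched in a few lines. This is not the authors' simulation code: rather than generating full genealogies, it draws the first $k$ coalescent waiting times directly from exponential distributions under a constant population size, which is sufficient for the estimator in question. The function name, the number of replicates, and the random seed are arbitrary choices made for illustration.

```python
import random


def simulate_skyline_estimates(m, k, pop_size, replicates, seed=1):
    """Simulate the first k coalescent intervals of a genealogy with m tips
    under constant population size and return the pooled skyline estimates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        total = 0.0
        for n in range(m, m - k, -1):  # n = m, m - 1, ..., m - k + 1
            rate = n * (n - 1) / (2 * pop_size)
            total += rng.expovariate(rate)
        estimates.append(total * m * (m - k) / (2 * k))
    return estimates


estimates = simulate_skyline_estimates(m=10, k=3, pop_size=0.1, replicates=20000)
mean = sum(estimates) / len(estimates)
variance = sum((x - mean) ** 2 for x in estimates) / (len(estimates) - 1)
print(f"mean = {mean:.4f}, bias = {mean - 0.1:+.4f}, variance = {variance:.5f}")
```

The sample mean should lie close to the true value of 0.1, and the sample variance can be compared with the theoretical coalescent variance for the same $m$ and $k$.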
Next, we studied the performance of the classic and generalized plots using sequence data simulated under known demographic scenarios. The purpose of these simulations was to determine whether the generalized plot is more reliable than the classic plot when the DNA sequences used are not highly variable. The simulations were performed as follows: (1) Expected coalescent trees, which contain no coalescent error, were obtained under two demographic models, $N(t) = 0.05$ (constant) and $N(t) = e^{-1000t}$ (exponential). These models were chosen to approximately represent the history of animal mtDNA sequences. Note that time is measured in substitutions per site. (2) Sequences were simulated down these trees using the HKY model (Hasegawa, Kishino, and Yano 1985) with a transition-transversion ratio of 10, nucleotide frequencies $\pi_A = 0.3$, $\pi_C = 0.25$, $\pi_G = 0.15$, and $\pi_T = 0.3$, and no rate heterogeneity. The constant-model alignment contained 500 bp and the exponential-model alignment contained 1,500 bp. (3) Genealogies were estimated from the simulated sequences using the TBR search heuristic in PAUP* (Swofford 1998), with the substitution model specified earlier. (4) Classic and generalized skyline plots were obtained from the estimated genealogies. The $\epsilon$ value was found by optimizing the AICc-corrected log-likelihood (see eq. 10).
Figures 1 and 2 show the simulation results for the constant and exponential models, respectively. Under the constant-size model, many of the simulated sequences are not unique and many of the internode intervals in the estimated tree are very small (fig. 1a). Thus the number of observed substitutions provides little information about the true coalescent interval lengths and consequently the classic skyline plot is very noisy (fig. 1b). In contrast, the generalized skyline plot estimate is smooth and almost identical to the true demographic history (fig. 1c). The optimal $\epsilon$ was 0.1, which resulted in all the observed intervals being pooled into a single composite interval. This should be expected, as the true demographic history contains no changes in population size.
Under the exponential model, only two sequences were identical (fig. 2a) and both the classic and generalized plots provide a good estimate of the true demographic history, although the generalized plot is less noisy (fig. 2b and c). The optimal $\epsilon$ was 0.00115. Interestingly, both plots appear to slightly overestimate population size in the past, which suggests that, in this set of sequences, the estimated branch lengths near the root of the genealogy are too long.
| Results |
|---|
We illustrate our framework by analyzing two previously published data sets. We investigate the demographic history of HIV-1 using sequences sampled from Central Africa, and we also analyze mtDNA sequences from red pandas (Ailurus fulgens). These examples were chosen for two reasons. First, these data sets have been previously studied using other coalescent methods, so alternative results are available for comparison. Second, the genealogies inferred from these sequences contain a number of short or zero-length branches, which allow us to compare the performance of the generalized and classic plots.
HIV-1 in Central Africa
HIV-1 group M contains the viruses which cause the global HIV pandemic and appears to have arisen in Central Africa during the last 100 years. Vidal et al. (2000) investigated the genetic diversity of HIV-1 group M in this region by obtaining viral gene sequences (env gene, V3-V5) in 1997 from 197 infected individuals living in the Democratic Republic of Congo. Yusim et al. (2001) used a customized maximum likelihood approach to estimate a phylogeny for this large data set, and it is this phylogeny which we use here (fig. 3a). Detailed interpretation of the HIV tree and further analysis of this data set can be found in Rambaut et al. (2001) and Yusim et al. (2001).
The classic skyline plot for the tree of Yusim et al. (2001) is shown in figure 3b; it corresponds to $\epsilon = 0$. The tree contains many internode intervals which are zero or near zero in length. Consequently, the plot contains gaps (where the estimated effective population size $\hat{M}_n$ is zero) and spikes (where $\hat{M}_n$ is close to zero). Figure 3c and d show other generalized plots for the same tree. As $\epsilon$ is increased, the generalized plot becomes less noisy than the classic plot, but also becomes less finely resolved. If $\epsilon$ is very large then too many intervals are grouped and, as a result, information about demographic history is lost (see fig. 3d).
The thick curves in figure 3b-d show a maximum likelihood estimate of population size obtained from the HIV tree using a specific parametric model, $N(t) = N(0)(\alpha + [1 - \alpha]e^{-rt})$, called the expansion model. The parameters of this model were estimated using maximum likelihood (see Yusim et al. 2001). Figure 3c shows the generalized plot with the highest AICc-corrected log-likelihood ($\epsilon = 0.0119$). This plot is neither noisy nor oversimplified, and corresponds closely to the maximum likelihood parametric estimate.
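For reference, the parametric models mentioned in this paper can be written as small functions of time $t$ (measured backward from the present). The sketch below is illustrative only; the function names are hypothetical, and the mixing parameter of the expansion model is called `alpha` here. It shows that the expansion model reduces to the constant model when `alpha` is 1 and to pure exponential growth when `alpha` is 0, and that it approaches `alpha * n0` far in the past.

```python
import math


def constant_size(t, n0):
    """Constant population size: N(t) = N(0)."""
    return n0


def exponential_growth(t, n0, r):
    """Exponential growth: N(t) = N(0) * exp(-r * t), with t into the past."""
    return n0 * math.exp(-r * t)


def expansion(t, n0, r, alpha):
    """Expansion model: N(t) = N(0) * (alpha + (1 - alpha) * exp(-r * t))."""
    return n0 * (alpha + (1.0 - alpha) * math.exp(-r * t))


# Hypothetical parameter values, chosen only to show the limiting behavior.
n0, r = 2.0, 0.5
print(expansion(0.0, n0, r, alpha=0.1))    # present-day size, equal to N(0)
print(expansion(100.0, n0, r, alpha=0.1))  # far in the past, close to alpha * N(0)
```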
We note that it is very unlikely that this HIV-1 data set has been evolving according to the molecular clock and without recombination. Therefore, statistical estimates of population parameters from these data based on the standard neutral coalescent model must be treated with caution. The quantitative effects of recombination on coalescent-based estimates of demographic history have yet to be determined.
Red Pandas in Southwestern China
The red panda, which inhabits southwestern China, is an endangered species. To investigate the genetic diversity of this species, Su et al. (2001) obtained a data set of 53 homologous sequences, 250 bp in length, from the 5' end of the mtDNA control region. The alignment contains only 25 haplotypes, and thus many sequences are identical. We estimated a genealogy for these sequences by maximum likelihood, using the TBR search heuristic in PAUP* (Swofford 1998). The HKY substitution model was used (estimated transition-transversion ratio = 36.5; nucleotide frequencies $\pi_A = 0.28$, $\pi_C = 0.26$, $\pi_G = 0.14$, and $\pi_T = 0.32$) under the assumption of a molecular clock. Clock-like evolution could not be rejected using a likelihood ratio test (Felsenstein 1981).
Figure 4 shows the classic skyline plot and the optimal generalized skyline plot (AICc estimate of $\epsilon = 0.0008$) obtained from the panda mtDNA genealogy. The generalized skyline plot (fig. 4b) suggests that the effective population size of red pandas has followed logistic growth. Su et al. (2001) analyzed the same data using pairwise difference distributions and concluded that the red pandas had undergone recent population growth. In contrast, figure 4c suggests an approximately constant population size at present, with growth in the distant past. Pairwise difference distributions do not explicitly incorporate phylogenetic structure and are therefore expected to be less powerful than methods which do, such as the skyline plot (Felsenstein 1992).
The classic skyline plot (fig. 4a) for the same tree gives a different picture of demographic history, as it suggests that effective population size has increased approximately exponentially in the recent past. This conclusion is a result of the limited phylogenetic signal in the data, which does not permit accurate estimation of the short internode intervals near the tips of the genealogy (as discussed earlier for fig. 1).
For comparison, we also obtained a maximum likelihood estimate of effective population size using the program FLUCTUATE, which assumes a model of exponential growth (Kuhner, Yamato, and Felsenstein 1998). This estimate is shown as a thick line in figure 4b and c.
Although the FLUCTUATE estimate only partially matches the skyline plot estimates, it does clearly illustrate the effectiveness of the skyline plot as a model selection tool. If a logistic growth model were implemented in the FLUCTUATE package, then we would expect it to provide a better fit to the red panda data than the exponential model used here.
| Discussion |
|---|
The generalized skyline plot offers a flexible framework for exploring the demographic history of a sample of DNA sequences, and provides an estimate of effective population size which explicitly incorporates phylogenetic structure. It has three main advantages over the LTT plot and the classic skyline plot: (1) it can be applied to data containing a weaker phylogenetic signal or identical sequences (or both), (2) it provides an estimate of the coalescent error, and (3) it enables the stochastic noise present in the classic plot to be reduced.
The present approach is thus particularly useful as a rapid model selection tool; that is, the generalized skyline plot provides insights with respect to which parametric models may be suitable for a given data set. In the case of the HIV-1 data set (fig. 3), it indicates a model of exponential growth with a growth rate that increases through time. For the red panda mtDNA data set, a model of logistic growth appears to be most appropriate (fig. 4).
Our method is computationally fast and algorithmically straightforward. Tree estimation is separated from the problem of demographic inference; thus the underlying tree reconstruction method can be adapted to the particular data set in question. If an unusual or complicated substitution model is required, or if a model which permits variation in evolutionary rates among lineages is warranted (e.g., Gillespie 1991; Sanderson 1997; Thorne, Kishino, and Painter 1998; Huelsenbeck, Larget, and Swofford 2000), then these models can be used without altering the skyline plot method.
On the other hand, our approach requires that at least some of the divergence times in a gene tree can be reliably inferred, so it cannot be used on data containing very little variation. It is also important to realize that our approach is a single-tree method. It is therefore complementary to computationally intensive approaches which treat the tree as an unknown nuisance variable and effectively use a collection of trees to infer effective population size (Kuhner, Yamato, and Felsenstein 1995, 1998; Stephens and Donnelly 2000).
In addition to the coalescent error $\sigma_C^2$ estimated here, the skyline plot also carries an error introduced by the uncertainty of the phylogenetic estimates of coalescent times. This error has been ignored here and we are currently investigating ways of estimating its effect on the skyline plot.
| Acknowledgements |
|---|
We thank Andrew Rambaut and Peter Donnelly for discussion, and Bing Su for providing the red panda sequence alignment. We would also like to thank the editor and referees for helpful comments. One referee pointed out the useful simplification of equation (5). This work was supported by an Emmy-Noether-Fellowship of the Deutsche Forschungsgemeinschaft (K.S.) and by grant 50275 from the Wellcome Trust (O.G.P.).
| Footnotes |
|---|
Keith Crandall, Reviewing Editor
Keywords: coalescent process, corrected Akaike criterion, HIV-1, model selection, likelihood, red panda, skyline plot
Address for correspondence and reprints: Oliver G. Pybus. South Parks Road, Oxford, OX1 3PS, UK. oliver.pybus{at}zoo.ox.ac.uk.
| References |
|---|
Akaike H., 1974 A new look at the statistical model identification IEEE Trans. Automat. Control AC-19:716-723
Burnham K. P., D. R. Anderson, 1998 Model selection and inference: a practical information-theoretic approach Springer, New York
Donnelly P., S. Tavaré, 1995 Coalescents and genealogical structure under neutrality Annu. Rev. Genet 29:401-421
Drummond A., K. Strimmer, 2001 PAL: an object-oriented programming library for molecular evolution and phylogenetics Bioinformatics 17:662-663
Felsenstein J., 1981 Evolutionary trees from DNA sequences: a maximum-likelihood approach J. Mol. Evol 17:368-376
Felsenstein J., 1992 Estimating effective population size from samples of sequences: inefficiency of pairwise and segregating sites as compared to phylogenetic estimates Genet. Res 59:139-147
Gillespie J. H., 1991 The causes of molecular evolution Oxford University Press, Oxford
Griffiths R. C., S. Tavaré, 1994 Sampling theory for neutral alleles in a varying environment Philos. Trans. R. Soc. Lond. B 344:403-410
Harvey P. H., A. J. Leigh Brown, J. Maynard Smith, S. Nee, eds 1996 New uses for new phylogenies Oxford University Press, Oxford
Hasegawa M., H. Kishino, T. Yano, 1985 Dating of the human-ape splitting by a molecular clock of mitochondrial DNA J. Mol. Evol 22:160-174
Hudson R. R., 1990 Gene genealogies and the coalescent process Oxf. Surv. Evol. Biol 9:1-44
Huelsenbeck J. P., B. Larget, D. Swofford, 2000 A compound Poisson process for relaxing the molecular clock Genetics 154:1879-1892
Hurvich C. M., C. L. Tsai, 1989 Regression and time series model selection in small samples Biometrika 76:297-307
Kingman J. F. C., 1982a The coalescent Stoch. Proc. Applns 13:235-248
Kingman J. F. C., 1982b On the genealogy of large populations J. Appl. Probab 19A:27-43
Kuhner M. K., J. Yamato, J. Felsenstein, 1995 Estimating effective population size and mutation rate from sequence data using Metropolis-Hastings sampling Genetics 140:1421-1430
Kuhner M. K., J. Yamato, J. Felsenstein, 1998 Maximum likelihood estimation of population growth rates based on the coalescent Genetics 149:429-434
Nee S., E. C. Holmes, A. Rambaut, P. H. Harvey, 1995 Inferring population history from molecular phylogenies Philos. Trans. R. Soc. Lond. B 349:25-31
Nordborg M., 2001 Coalescent theory Pp. 179-212 in D. Balding, M. Bishop, and C. Cannings, eds. Handbook of statistical genetics. Wiley, Chichester, England
Pybus O. G., A. Rambaut, P. H. Harvey, 2000 An integrated framework for the inference of viral population history from reconstructed genealogies Genetics 155:1429-1437
Rambaut A., D. L. Robertson, O. G. Pybus, M. Peeters, E. C. Holmes, 2001 Phylogeny and the origin of HIV-1 Nature 410:1047-1048
Rodrigo A. G., J. Felsenstein, 1999 Coalescence approaches to HIV population genetics Pp. 233-272 in K. A. Crandall, ed. The evolution of HIV. Johns Hopkins University Press, Baltimore
Ross S. M., 1997 Introduction to probability models. 6th edition Academic Press, San Diego
Sanderson M. J., 1997 A nonparametric approach to estimating divergence times in the absence of rate constancy Mol. Biol. Evol 14:1218-1231
Stephens M., P. Donnelly, 2000 Inference in molecular population genetics J. R. Statist. Soc. B 62:605-655
Su B., Y.-X. Fu, Y.-X. Wang, L. Jin, R. Chakraborty, 2001 Genetic diversity and population history of the red panda (Ailurus fulgens) as inferred from mitochondrial DNA sequence variations Mol. Biol. Evol 18:1070-1076
Swofford D. L., 1998 PAUP*: phylogenetic analysis using parsimony (* and other methods). Version 4 Sinauer Associates, Sunderland, Mass
Tavaré S., D. J. Balding, R. C. Griffiths, P. Donnelly, 1997 Inferring coalescence times from DNA sequence data Genetics 145:505-518
Thorne J. L., H. Kishino, I. S. Painter, 1998 Estimating the rate of evolution of the rate of molecular evolution Mol. Biol. Evol 15:1647-1657
Vidal N., M. Peeters, C. Mulanga-Kabeya, N. Nzilambi, D. Robertson, W. Ilunga, H. Sema, K. Tishimanga, B. Bongo, E. Delaporte, 2000 Unprecedented degree of HIV-1 group M genetic diversity in the Democratic Republic of Congo suggests that the HIV-1 pandemic originated in Central Africa J. Virol 74:10498-10507
Yusim K., M. Peeters, O. G. Pybus, T. Bhattacharya, E. Delaporte, C. Mulanga, M. Muldoon, J. Theiler, B. Korber, 2001 Using HIV-1 sequences to infer historical features of the AIDS epidemic and HIV evolution Philos. Trans. R. Soc. Lond. B 356:855-866