MBE Advance Access originally published online on August 10, 2006
Molecular Biology and Evolution 2006 23(11):2101-2111; doi:10.1093/molbev/msl084
Research Articles |
High Resolution Analysis and Phylogenetic Network Construction Using Complete mtDNA Sequences in Sardinian Genetic Isolates






* Shardna Life Sciences, Pula (Cagliari), Italy
Dipartimento di Biologia, Università di Ferrara, Ferrara, Italy
Dipartimento di Biologia Evoluzionistica e Sperimentale - Sezione di Antropologia, Università di Bologna, Bologna, Italy
Istituto di Genetica delle Popolazioni, CNR, Alghero (Sassari), Italy
E-mail: angius{at}shardna.it.
| Abstract |
|---|
For mitochondrial phylogenetic analysis, the best results come from complete sequences. We therefore sequenced the entire mitochondrial DNA (mtDNA) (coding and D-loop regions) of 63 individuals selected in 3 small villages of Ogliastra, an isolated area of eastern Sardinia: Talana, Urzulei, and Perdasdefogu. We studied at least one individual for each of the most frequent maternal genealogical lineages belonging to haplogroups H, V, J, K, T, U, and X. In our 63 samples, we found 172 sequence changes in the coding region and 69 in the D-loop region. Thirteen of the 172 coding-region changes are novel, and we hypothesize that some of them are characteristic of the Ogliastra region and/or Sardinia. We reconstructed the phylogenetic network of the 63 complete mtDNA sequences for the 3 villages. We also drew a network including a large number of European sequences and calculated various indices of genetic diversity in Ogliastra. It appears that these small populations remained extremely isolated and genetically differentiated compared with other European populations. We also identified in our samples a previously undescribed subhaplogroup, U5b3, which seems peculiar to the Ogliastra region.
Key Words: complete mtDNA sequences, genetic isolates, phylogenetic network, subhaplogroup, genetic drift, founder effect
| Introduction |
|---|
Mitochondrial DNA (mtDNA) analysis is considered an essential tool for studying human population structure, origins, migration patterns, and demographic history, given its polymorphism, its matrilineal mode of descent, and its lack of recombination (Torroni et al. 1996).
The fact that mutations accumulate sequentially along maternal lineages allows many of them to be associated with different geographical regions of the world (Ingman et al. 2000; Herrnstadt et al. 2002). mtDNA sequence variations can thus be used to construct phylogenetic networks (Bandelt et al. 1999), displaying the relationships among sequences and estimating the time of appearance of mutations associated with each haplotype.
Historically, the first mtDNA polymorphisms used in human phylogenetic studies were identified in the noncoding or D-loop region, which contains the hypervariable segments (HVS). Accurate phylogenetic networks for European mtDNAs were constructed using HVS-1 and HVS-2 sequence data (Richards et al. 1998; Helgason et al. 2000), complemented with coding-region restriction fragment length polymorphisms (RFLP) used to define mtDNA haplogroups (Torroni et al. 1996; Macaulay et al. 1999). High mutation rates, variability in site substitution rates, homoplasic sites, parallel mutation events, and reversion make D-loop evolution complex and the phylogenetic analysis subject to error (Bandelt et al. 2002; Dennis 2003; Forster 2003). Complete mtDNA sequences, which have only recently become available (Ingman et al. 2000; Finnila et al. 2001; Maca-Meyer et al. 2001; Torroni et al. 2001; Herrnstadt et al. 2002; Achilli et al. 2004; Howell et al. 2004; Palanichamy et al. 2004; Rajkumar et al. 2005), represent the best possible solution for phylogenetic analysis (Richards and Macaulay 2001; Kivisild et al. 2006). However, data on mtDNA polymorphisms in different human populations are still limited, and only a few haplogroups are defined on the basis of complete mtDNA sequences (Finnila and Majamaa 2001). Most importantly, several reports associate some diseases with specific mtDNA haplogroups (Brown et al. 1997; Kalman et al. 1999; Chinnery et al. 2000; Ruiz-Pesini et al. 2000; Moilanen et al. 2003; Herrnstadt and Howell 2004; Mancuso et al. 2004; Pyle et al. 2005), and it is therefore fundamental to increase our understanding of mtDNA haplogroups (Rose et al. 2001; Niemi et al. 2003, 2005).
Interpopulation comparisons and phylogenetic tree construction through mtDNA studies can be useful for the characterization of populations with unusual genetic features (Tolk et al. 2000; Finnila and Majamaa 2001; Larruga et al. 2001; Meinila et al. 2001). mtDNA analysis of different Sardinian populations revealed specific genetic characteristics and a variable degree of subpopulation homogeneity within the island (Workman et al. 1975; Piazza et al. 1988; Morelli et al. 2000). However, no studies have so far been carried out in eastern Sardinia using complete mtDNA sequences.
In this article, we present the analysis of 63 complete mtDNA sequences from samples collected in 3 distinct villages of Ogliastra, a Sardinian region characterized by both genetic and environmental homogeneity. This area encompasses 23 small isolated villages, whose founders presumably derived from the same original gene pool, with scant genetic exchange with the rest of Sardinia recorded during the last 400 years of parish and historical records. Indeed, it is thought that geographic and cultural barriers have exerted a strong isolating effect in this part of Sardinia. Geographical isolation, small population size, high endogamy, and inbreeding are expected to lead to increased genetic differentiation among subpopulations as a consequence of founder effect and genetic drift (Angius et al. 2001; Fraumene et al. 2003).
Genealogical records systematically kept since the 17th century allow the careful and accurate reconstruction of genealogies for each village. We created a relational database and developed specific tools to access these data easily and to reconstruct complete genealogical trees for up to 16 generations in a rapid manner (Mancosu et al. 2003, 2005). It is therefore also possible to reconstruct accurately all the maternal lineages present in the 3 villages all the way back to the 17th century. We analyzed one or more individuals from each founder maternal genealogical lineage.
Our study allowed us to monitor segregation and selection in the evolution of these lineages. Complete mtDNA sequences permitted us to evaluate genetic differentiation of each village in comparison to other European sequences.
| Materials and Methods |
|---|
Samples Selection
We sequenced the complete mtDNAs of 63 individuals from 3 small villages within the Ogliastra region (eastern Sardinia). Talana (1,161 inhabitants) and Urzulei (1,419 inhabitants) are located in the northwestern part of Ogliastra, whereas Perdasdefogu (2,400 inhabitants) is located in the south. Samples were selected on the basis of genealogical information (Fraumene et al. 2003).
All villages have a limited number of maternal lineages coalescing into a few different haplogroups (table 1). Maternal lineages with fewer than 10 descendants were not considered because they are not sufficiently representative of the community. For Talana, we estimate that 17 genealogical lineages account for about 87% of the present-day living inhabitants (1,765 residents and nonresidents); for Urzulei, 22 genealogical lineages account for about 90% of the present-day living inhabitants (2,091 residents and nonresidents); and for Perdasdefogu, 23 genealogical lineages account for about 79% of the present-day living inhabitants (2,340 residents and nonresidents). During the last 50 years, Perdasdefogu has undergone substantial immigration because of a nearby military base, but under our sampling method the maternal lineages that immigrated into the village after 1950 were not considered.
Informed consent was obtained from each individual and all samples were collected in accordance with the Declaration of Helsinki.
DNA and Sequence Analysis
Genomic DNA was isolated from 5 ml of peripheral blood as previously described (Ciulla et al. 1988). Whole mtDNA was amplified using 24 partially overlapping fragments, each approximately 800 bp in length (Rieder et al. 1998). Polymerase chain reaction (PCR) primers were designed to provide approximately 200 bp of overlap between neighboring fragments. Template DNA was amplified in a 25-µl volume. PCR conditions were as follows: initial denaturation for 2 min at 95 °C; 35 subsequent cycles of 30 s denaturation at 95 °C, 30 s annealing at each primer-specific temperature, and 45 s extension at 72 °C; and a final 7 min extension at 72 °C. PCR products were purified with the ExoSap-IT Kit (USB Corporation, Cleveland, OH). Sequencing reactions were performed with 7 µl of purified PCR product, the forward or reverse primers used in the PCR reactions, and ABI BigDye Terminator Cycle Sequencing kits (Applied Biosystems, Foster City, CA). Sequencing reactions were purified using the Millipore MultiScreen Assay System (Millipore, Billerica, MA), and fluorescent-labeled extension products were loaded onto an ABI 3730 DNA analyzer (Applied Biosystems, Foster City, CA). To avoid errors or artifacts, both mtDNA strands were sequenced for each sample. Moreover, ambiguous results were always confirmed by independent PCR and sequencing.
Comparisons with the revised Cambridge Reference Sequence (rCRS) were performed using the BioEdit Sequence Alignment software (http://www.mbio.ncsu.edu/BioEdit/bioedit.html).
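The comparison against the reference can be illustrated with a minimal Python sketch. It is not the BioEdit procedure used in the study; the function name `call_variants` and the toy sequences are hypothetical, and the sketch assumes the sample has already been aligned to the reference and reports substitutions only.

```python
def call_variants(reference: str, sample: str, start: int = 1) -> list[str]:
    """List substitutions of `sample` relative to `reference`.

    Both strings must come from the same alignment (equal length).
    Positions are 1-based and counted along the reference, so gap
    columns in the reference ('-') do not advance the position.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    variants = []
    pos = start - 1
    for ref_base, smp_base in zip(reference.upper(), sample.upper()):
        if ref_base == "-":
            continue  # insertion relative to the reference: ignored in this sketch
        pos += 1
        # Report only unambiguous substitutions; gaps and N in the sample are skipped.
        if smp_base in "ACGT" and smp_base != ref_base:
            variants.append(f"{ref_base}{pos}{smp_base}")
    return variants


# Toy example (not real mtDNA): one substitution at reference position 5.
print(call_variants("ACGTACGT", "ACGTGCGT"))
```

The output uses the same reference-base, position, sample-base notation as polymorphisms such as A263G in the Results.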
Sequences Evolutionary Analysis
All sequences were assigned to Western Eurasian haplogroups according to the nomenclature of Reidla et al. (2003), Maca-Meyer et al. (2003), Palanichamy et al. (2004), Achilli et al. (2004, 2005), Behar et al. (2006), and references therein.
The phylogenetic networks based on coding and D-loop region sequences were constructed by use of a median-joining algorithm (Bandelt et al. 1995, 1999) as implemented in the Network 4.1 program (http://www.fluxus-technology.com). By inspecting the network, one can identify homoplasic sites, that is, sites that are subject to recurrent mutation (Bandelt et al. 1995, 1999).
Median-joining networks, based on nucleotide variation in the whole mtDNA, were generated with an ε value of 0. Because of mutation-rate heterogeneity between the D-loop and the coding region, we chose to give a smaller weight to nucleotide positions in the D-loop region. We assigned a weight of 2 to all coding-region variants and a weight of 1 to D-loop substitutions.
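To make the effect of this weighting concrete, the following Python sketch computes a weighted distance between two haplotypes written as sets of variants such as "3010A". It is only an illustration, not the internal computation of the Network program. The D-loop boundaries (positions 16024-16569 and 1-576) are an assumption taken from the usual rCRS annotation of the control region, and the names `site_weight` and `weighted_distance` are hypothetical.

```python
# Assumption: the D-loop (control region) spans rCRS positions 16024-16569 and 1-576.
D_LOOP_RANGES = ((1, 576), (16024, 16569))
CODING_WEIGHT = 2  # weight given in the text to coding-region variants
D_LOOP_WEIGHT = 1  # weight given in the text to D-loop substitutions


def site_weight(position: int) -> int:
    """Return the weight of a nucleotide position under the scheme above."""
    in_d_loop = any(lo <= position <= hi for lo, hi in D_LOOP_RANGES)
    return D_LOOP_WEIGHT if in_d_loop else CODING_WEIGHT


def weighted_distance(hap_a: set[str], hap_b: set[str]) -> int:
    """Sum of site weights over positions at which two haplotypes differ.

    Haplotypes are sets of variants such as "3010A" (position + derived base).
    A position is counted once even if both haplotypes carry different
    alleles there.
    """
    differing = hap_a ^ hap_b  # variants present in exactly one haplotype
    positions = {int("".join(ch for ch in v if ch.isdigit())) for v in differing}
    return sum(site_weight(p) for p in positions)


# Two coding-region differences (3010 and 6776) give 2 + 2 = 4.
print(weighted_distance({"3010A", "16192T"}, {"6776C", "16192T"}))
```

Under this scheme a single coding-region difference separates two haplotypes as much as two D-loop differences, which is the intended down-weighting of the faster-evolving region.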
Analyses of Genetic Variation
Statistical analyses of genetic variation within each population of the 3 villages of Ogliastra (Talana, Urzulei, and Perdasdefogu) were conducted using DnaSP ver 4.0 software (Rozas et al. 2003).
Measures of genetic variation between populations were computed using Arlequin ver 2.0 software (Schneider et al. 2000). FST and ΦST values, 2 indices of population differentiation based, respectively, on allele frequencies and on both allele frequencies and sequence differences, were computed between each pair of populations, and an analysis of molecular variance (AMOVA) was carried out. To assess the significance of the genetic variances thus estimated, individual genotypes were randomly reassigned to populations and populations to groups 1,000 times (Schneider et al. 2000). In this way, the significance of the estimated variances was tested by a nonparametric permutational procedure in which the distribution of the random variances thus obtained was compared with the observed values (Excoffier et al. 1992). Several tests of mutation-drift equilibrium were conducted using Tajima's D (Tajima 1989), Fu and Li's D and F (Fu and Li 1993), and Fu's Fs (Fu 1997).
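The permutational logic can be sketched in Python as follows. This is not the AMOVA implemented in Arlequin: for brevity the sketch swaps in a simple haplotype-frequency statistic, (H_T - H_S) / H_T, as a stand-in for FST, and the function names, the seed, and the toy data are all hypothetical.

```python
import random
from collections import Counter


def heterozygosity(haplotypes):
    """1 minus the sum of squared haplotype frequencies."""
    n = len(haplotypes)
    return 1.0 - sum((c / n) ** 2 for c in Counter(haplotypes).values())


def fst_like(populations):
    """Simple differentiation index (H_T - H_S) / H_T; a stand-in, not AMOVA."""
    pooled = [h for pop in populations for h in pop]
    h_total = heterozygosity(pooled)
    if h_total == 0:
        return 0.0
    h_within = sum(len(pop) * heterozygosity(pop) for pop in populations) / len(pooled)
    return (h_total - h_within) / h_total


def permutation_p_value(populations, n_perm=1000, seed=0):
    """Reassign individuals to populations at random and compare with the observed value."""
    rng = random.Random(seed)
    observed = fst_like(populations)
    pooled = [h for pop in populations for h in pop]
    sizes = [len(pop) for pop in populations]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:  # cut the shuffled pool back into populations of the original sizes
            shuffled.append(pooled[start:start + size])
            start += size
        if fst_like(shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimated P value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data: two populations with no shared haplotypes.
stat, p = permutation_p_value([["a"] * 10, ["b"] * 10])
print(stat, p)
```

The P value is the fraction of random reassignments that produce a statistic at least as large as the observed one, which is the comparison described in the text.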
| Results |
|---|
Sequence Variation in mtDNA from 63 Subjects of Talana, Urzulei, and Perdasdefogu
Complete mtDNA sequence was determined in 63 maternally unrelated healthy individuals from Talana, Urzulei, and Perdasdefogu belonging to haplogroups H, pre-V, V, J, K, T, U, and X.
These sequences were defined using haplogroup definitions as described in the literature (Torroni et al. 1996; Finnila et al. 2000, 2001; Herrnstadt et al. 2002; Maca-Meyer et al. 2003; Reidla et al. 2003; Achilli et al. 2004, 2005; Palanichamy et al. 2004; Behar et al. 2006). Although haplogroup distributions were different in each village (table 1), total frequencies were as follows: haplogroup H 52.4%; haplogroup T 9.5%; haplogroup V 1.6%; haplogroup pre-V 1.6%; haplogroup J 6.4%; haplogroup U 20.6%; haplogroup K 3.2%; and haplogroup X 4.7%. No mtDNAs from haplogroups I and W were found. These frequencies differ from those of Herrnstadt et al. (2002), where haplogroup U has a lower frequency, but our data are similar to those of Kivisild et al. (2006). Moreover, we found that Urzulei has the highest haplogroup U frequency. We found an underrepresentation of haplogroup K compared with the literature, probably due to genetic drift: haplogroup K was present only in Perdasdefogu.
We found that our 63 mtDNA sequences contained 172 sequence changes in the coding region and 69 sequence changes in the D-loop region. In the coding region, 46 changes were nonsynonymous, 84 were synonymous, 20 were in the 12S and 16S rRNA genes, 16 were in tRNA genes, 2 were in the origin of replication, and 4 were in the noncoding region (see supplementary table, Supplementary Material online). Thirteen of the 172 coding-region changes are novel, as they have not been previously reported in the literature or described on the Mitomap Web site (table 2).
Polymorphisms A263G, A750G, A1438G, A4769G, A8860G, and A15326G were shared by all mtDNA sequences. The 2706G and 7028T polymorphisms were not present in most haplogroup H mtDNA sequences. The 11719A polymorphism was present in all T, J, U, X, and K haplogroups. The C14766T polymorphism was also present in all T, J, U, X, and K haplogroups, with one exception: Talana, subhaplogroup T2.
Topology of Phylogenetic Networks and Comparison with Europeans
The median network of the 63 complete mtDNA sequences of the Ogliastra region is shown in figure 1. Our analysis revealed 241 segregating sites characterizing 51 haplotypes. The topology of our network shows 4 mtDNA haplogroup clusters (HV, UK, TJ, and X), in accordance with a recent mtDNA tree based on complete mtDNA coding-region sequences (Kivisild et al. 2006).
Haplogroups H and V form a cluster sharing the 73A, 14766C, and 11719G polymorphisms, with 4 exceptions: samples 6P and 2P from Perdasdefogu, which share 73G, and samples 2798T and 2579T from Talana, which share 14766T. The H haplogroup network is highly starlike, and at least 2 subclusters emerge: H1, defined by 3010A and encompassing 34% of the HV cluster, and H3, defined by 6776C and encompassing 48% of the HV cluster. The starlike nature of the H3 and H1 subhaplogroups suggests that these lineages have undergone a recent expansion. Sample 62U could be assigned to H4 and is characterized by the 3992T, 4024G, 5004C, 7356A, 7521A, 9123A, 14365T, and 14582G sequence changes. There are also typical H haplotypes pertaining to samples 136U (186A, 2789T, 4823C, and 16192T), 2P (73G, 152C, 8557A, 12358G, and 16145A), and 2545T (16167T and 16261T). Sample 2010T was identified as V and is characterized by sequence changes 72C, 3745A, 4314C, 4580A, 7852A, 15904T, 15905C, 16188T, 16294T, and 16298C. Sample 6P was identified as pre-V and is characterized by the 72C, 73G, 12662G, and 15904T sequence changes. Samples 62U, 136U, 2P, 2545T, 2010T, and 6P, present at equal frequency within the HV cluster, together represent 18% of the samples of the cluster.
Within the HV cluster, some sequences differ from the previously published mtDNA phylogeny. Sample 6P was assigned to the HV cluster. It has 2 substitutions typical of the V haplogroup (72C and 15904T) but lacks 73A, 4580A, and 16298C. Based on these findings and on its network position, this sample could be assigned to the pre-V haplogroup (Achilli et al. 2004). Sample 2P is similar to sample number 28 of Achilli et al. (2004), except for the substitution 9368G. Samples 2465T and 2477T carry 3010A, although they are H3. Moreover, sample 2554T has 3010G, although its network position is H1. All these cases could be explained by back mutation or homoplasy within the same haplogroup because such sites are known to be prone to recurrent mutations (Achilli et al. 2004; Loogvali et al. 2004; Palanichamy et al. 2004).
The TJ cluster is characterized by polymorphisms 4216C, 11251G, 15452A, and 16126C. In our population, the haplogroup T could be subdivided into 2 subhaplogroups, T2 and T2b. The haplogroup J could be subdivided into 3 subhaplogroups: J1c, J2a, and J2b.
Polymorphisms 12372A, 12308G, and 11467G characterize the UK cluster. In our network, 5 subhaplogroups are present within haplogroup U: U6, U1a, U5b1, U5b2, and U5b3. U6 and K1a are distinct from the other subhaplogroups by the presence of 16311C.
Haplogroup K is a subhaplogroup within U (Achilli et al. 2005; Behar et al. 2006; Kivisild et al. 2006). Subhaplogroup U6 is characterized by the 3348G, 4336T, 4454C, 5147A, 6575G, 12501A, 14470C, 14518G, 16172C, 16219G, and 16261T sequence changes.
Subhaplogroup U1a is characterized by the 195C, 285T, 663G, 2218T, 4991A, 6026A, 7403G, 7581C, 11116C, 12879C, 13104G, 13656C, 14070G, 14364A, 15115C, 15148A, 15217A, 15954C, 16145A, 16189C, and 16249C sequence changes.
Subhaplogroups U5b1, U5b2, and U5b3 are defined by sequence changes 150T, 3197C, 7768G, 9477A, 13617C, 14182C, and 16270T. A new, well-defined cluster, characterized by 6 transitions (373G, 7226A, 11177T, 16192T, 16235G, and 16519C) and 1 transversion at position 16169A, is identified; in accordance with the nomenclature system of Richards et al. (1998), it could be tentatively named U5b3. This subhaplogroup, never previously reported in the literature, seems to be restricted to Sardinia.
In our network, haplogroup X is characterized by polymorphisms 153G, 195C, 225A, 1719A, 6221C, 6371T, 12705T, 13966G, 14470C, 16189C, 16223T, 16278T, and 16519C. Additional sequence changes subdivide the main branch into subhaplogroups X2b and X2 (Reidla et al. 2003).
We found cases of homoplasy in different haplogroups: 709A (for haplogroups TU), 5147A (for haplogroups TU), 5656G (for haplogroups TU), 10398G (for haplogroups JK), 14798C (for haplogroups JK), 14470C (for haplogroups UX), 11914A (for haplogroups HU), 11950G (for haplogroups HU), and 15607G (for haplogroups HT). In all, we detected 18 homoplasic positions in the D-loop region, some of them occurring even at subhaplogroup-defining sites.
Figure 2 shows a network based on the European sequences described by Herrnstadt et al. (2002) together with 76 European samples from Kivisild et al. (2006). Taking into account the ethnic origin (United States and United Kingdom) of the Herrnstadt sequences, we added the Kivisild samples, which cover numerous European countries, in order to produce a more complete portrait of the European network. The Kivisild samples include 5 Northern European, 12 Italian, 1 Greek, 2 Finn, 2 Ashkenazi, 1 Georgian, 17 Hungarian, 3 Icelander, 3 Czech, 1 Sardinian, 5 Basque, 1 Iberian, and 23 Dutch.
We inserted our samples in this European network to show the position of Ogliastra in comparison with European lineages. Our samples are placed externally within each haplogroup, reflecting the influence of factors such as founder effect and genetic drift. There is general agreement regarding the formation of the H, H1, H3, V, T2, T2b, J1c, J2a, J2b, U5b1, U5b2, X2b, and X2 subhaplogroups, but for subhaplogroup K1a, we changed the network topology of Herrnstadt et al. (2002) according to Behar et al. (2006).
Analyses of Genetic Variation
The results of the statistical analysis of genetic diversity within each population are presented in table 3. The gene diversity for the whole sequence within populations ranges from 0.970 to 0.993 depending on the locality, thus showing a relatively high genetic variability within each population. These high values could be explained by the biased sampling strategy. Indeed, individuals were not sampled randomly; instead, one individual from each lineage was selected, thereby increasing the genetic diversity within each population. However, for the D-loop, the values fall within the range of values reported for Italian populations (Handt et al. 1998).
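The effect of lineage-based sampling on gene diversity can be illustrated with a short Python sketch. It assumes the usual unbiased estimator of gene (haplotype) diversity, H = n/(n - 1) × (1 - Σ p_i²), where p_i is the frequency of haplotype i among n sequences; whether this matches every setting used in DnaSP for table 3 is an assumption, and the function name and the toy samples are hypothetical.

```python
from collections import Counter


def gene_diversity(haplotypes):
    """Unbiased gene diversity: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    freq_sq = sum((count / n) ** 2 for count in Counter(haplotypes).values())
    return n / (n - 1) * (1.0 - freq_sq)


# Hypothetical village in which two haplotypes dominate a random sample ...
random_sample = ["h1"] * 6 + ["h2"] * 3 + ["h3"]
# ... versus one individual drawn from each maternal lineage.
one_per_lineage = ["h1", "h2", "h3"]

print(gene_diversity(random_sample))
print(gene_diversity(one_per_lineage))
```

When every sampled sequence is a different haplotype the estimator reaches its maximum of 1, which is why picking one individual per lineage pushes the values toward the upper end of the scale.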
On the other hand, the nucleotide diversity and the number of segregating sites are lower than those observed by Ingman et al. (2000).
In brief, most of our values of genetic diversity were lower than the ones found by Ingman et al. (2000) but higher than the ones previously estimated for Sardinia; this could be explained, respectively, by the facts that 1) our study was based on only 3 localities (instead of 32 countries in the study of Ingman et al.) and 2) the sampling was not random but specifically designed to explore intrapopulation diversity, thus increasing measures of internal diversity compared with other studies in Sardinia.
The values of genetic distance between populations, both FST and ΦST, are relatively low (FST = 0.0173; ΦST = 0.0398). The percentage of variation based on haplotype frequencies was 1.73 among populations (P < 0.01) and 98.27 within populations, whereas the percentage of variation based on molecular distances was 3.98 among populations (P < 0.05) and 96.02 within populations. AMOVA thus reveals that between 1.7% and 4.0% of the genetic variation is due to differences among populations (considering either haplotype frequencies or molecular distances), which is statistically significant. These values are nevertheless consistent with the ones observed for mainland Italians and Sardinians by Barbujani et al. (1995), who suggested that less than 6% of the mitochondrial diversity could be attributed to differences between localities.
Tests of Mutation-Drift Equilibrium
Tajima's D is negative in all populations sampled but never significant, thus not showing any departure from neutrality, except when all 3 populations are pooled together (table 4). This is in contrast with most previous studies on mtDNA, which have shown significant negative values, generally interpreted as a consequence of population expansions (Excoffier and Schneider 1999). For instance, based on the D-loop region, both Merriwether et al. (1991) and Barbujani et al. (1995) found significant Tajima's D values in Sardinians, and Verginelli et al. (2003) also found significant values in central-eastern Italy. Similarly, in a study of complete mitochondrial sequences (Ingman et al. 2000), significant negative values of Tajima's D were observed in the non-African samples. Fu and Li's D and F tests were not significant in our study, even after pooling the populations.
The nonsignificant values obtained in our tests of mutation-drift equilibrium do not seem to be due to the sampling scheme. Tajima's D is based on the difference between the number of segregating sites and the average number of nucleotide differences. Fu and Li's D test is based on the difference between the number of singletons (mutations appearing only once among the sequences) and the total number of mutations, whereas Fu and Li's F test is based on the difference between the number of singletons and the average number of nucleotide differences between pairs of sequences.
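As an illustration of how Tajima's D contrasts these two quantities, the following Python sketch computes it from a set of aligned sequences using the standard formulas of Tajima (1989). It is a toy implementation, not the DnaSP routine used in the study: the function name and example sequences are hypothetical, gaps and missing data are not handled, and no significance test is performed.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for equal-length aligned sequences (no gaps or missing data)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must have equal length")

    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: average number of nucleotide differences between pairs of sequences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Every mutation is a singleton: pi falls below S/a1, so D is negative.
print(tajimas_d(["AAAA", "AAAT", "AATA", "ATAA"]))
# Mutations shared by half of the sample: D is positive.
print(tajimas_d(["AA", "AA", "TT", "TT"]))
```

The two toy alignments show the sign behaviour discussed in the text: an excess of mutations unique to single sequences drives D below zero, whereas mutations shared by several sequences drive it above zero.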
Positive values are observed in stationary populations, in which a substantial number of mutations are shared by different lineages (Rogers and Harpending 1992). Conversely, in expanding populations, most mutations tend to be unique to a single lineage, resulting in negative values. By selecting and completely sequencing mitochondrial genomes of different haplogroups or by jointly analyzing members of different populations, one overestimates the fraction of singletons (and segregating sites) and hence biases the estimates downwards.
On the other hand, Fu's Fs values are significant both for complete mtDNA sequences and for the D-loop when all localities are considered together. The Fs test statistic (Fu 1997) is based on the probability of having a number of alleles greater than or equal to the observed number in a sample drawn from a stationary population. This statistic is particularly sensitive to population growth (Excoffier and Schneider 1999). Using the data of Di Rienzo et al. (1991) on Sardinia, Excoffier and Schneider also found significant negative values, indicating a period of population expansion. It is possible that we found significant values only with this last test because it is the most sensitive to population demographic processes, and there is evidence that population growth in Ogliastra might have been slower than in other European countries (Barbujani et al. 1995).
| Discussion |
|---|
Haplogroup definition and population genetic inference would greatly benefit from a better understanding of mtDNA variation in coding regions, but complete mtDNA sequences are rarely produced. Analysis of the complete coding-region sequences of our Ogliastra samples revealed 13 novel substitutions. Our study ascertained that 26% of nonsynonymous substitutions involve threonine codons; most of these changes replace threonine with alanine. Comparing human, primate, and other mammalian mtDNA reveals that threonine/alanine substitutions are overrepresented in humans, whereas methionine/leucine changes are common in other mammalian species (Kivisild et al. 2006).
So far, only a few studies have produced complete mitochondrial sequences, and we have analyzed the complete coding region of Sardinian samples for the first time. The number of previously undescribed substitutions discovered in this study is probably due to the low number of complete mtDNA sequences reported in the literature. We could hypothesize that some substitutions are typical of the Ogliastra region and/or Sardinia, whereas others will be found in the future in other populations. Moreover, we found some typical substitutions, namely those defining the U5b3 subhaplogroup, which is probably peculiar to the Ogliastra region.
In the present study, we constructed a network showing the general topology of coding and D-loop regions of Talana, Urzulei, and Perdasdefogu samples, and we inserted it in a broader network containing other European sequences described in literature.
Our complete mtDNA sequences can be arranged into an unambiguous network, in agreement with previously published D-loop-based networks. Each haplogroup was defined by many coding-region sequence changes, whereas only a few D-loop substitutions appeared to be important in this respect (as reported for the 6P sample). In Talana, Urzulei, and Perdasdefogu (Fraumene et al. 2003), all reticulations were resolved, demonstrating the importance of integrating the study of the 2 regions in order to define a correct and simplified haplogroup phylogeny. This study also characterizes with greater precision some haplogroups previously based only on control-region data and RFLP analysis. We find that there are actually many haplogroup-associated, haplogroup-defining, and haplogroup-specific sequence changes and a large number of homoplasic sites. The network topology of our samples is congruent with previously published Western Eurasian basal haplogroup phylogenies (Herrnstadt et al. 2002; Reidla et al. 2003; Achilli et al. 2004; Palanichamy et al. 2004; Achilli et al. 2005), with some minor external branch differences probably due to back mutation and homoplasy. The presence of homoplasy or back mutation within the coding region is already documented (Loogvali et al. 2004) and confirms some site variability in mtDNA evolution (Aris-Brosou and Excoffier 1996). Comparison with other European coding-region sequences indicates that these 3 small populations remained extremely isolated and genetically differentiated in the European context. Different biodemographic histories determined by population expansions, immigration rates, endogamy, founder effect, and genetic drift shaped each of these populations, which are characterized by unusual genetic features.
Comparing statistics describing genetic diversity in Ogliastra with those of other populations is not straightforward because of the sampling scheme chosen. Here we observed relatively high values of gene diversity with respect to other isolated populations. This is the expected consequence of the sampling strategy; indeed, when specific mtDNA lineages are selected, diversity within populations is artificially increased. However, the number of segregating (or polymorphic) sites is lower than in other studies, both for the whole sequence and for the D-loop alone. Therefore, it is highly likely that variation within populations is actually reduced in the Ogliastra region, a result suggesting that the isolation documented in the historical record has actually affected the people's genetic features. Along with the observation that FST and ΦST values are relatively low among the 3 villages sampled, our finding would suggest that the Ogliastra population is unlikely to show internal stratification. Only an analysis of genetic diversity for autosomal markers implementing a random sampling strategy could confirm that the 3 populations studied in this region are distinct and internally homogeneous in general. This could also clarify a possible departure from neutral expectations (for Fu's Fs statistic) due to an excess of low-frequency haplotypes. A similar excess of rare haplotypes (and a related deficit of intermediate-frequency haplotypes) was observed in most previous mitochondrial studies, with some populations of hunters and gatherers being the main exception (see, e.g., Excoffier and Schneider 1999). It is likely that the Ogliastra populations, despite their isolation and the relative paucity of the resources available, did slowly expand, much like other food-producing populations. This hypothesis should be further investigated in the future. The results of Tajima's D, Fu and Li's D and F tests, and Fu's Fs suggest that, because of their isolation and the relative paucity of the resources available, the Ogliastra populations increased in size too slowly, unlike other food-producing populations, to carry a clear mark of the expansion in their mitochondrial genome.
To conclude, complete networks could help to distinguish between a rare polymorphism and a pathogenic mutation in clinically affected people. Similar phylogenetic network studies should be carried out in other populations for furthering medical and population genetics.
| Electronic Database Information |
|---|
The URLs for data in this article are as follows:
Genome Data Base, http://www.gdb.org/.
Helsinki declaration, http://www.wma.net/e/policy/17-c_e.html.
Life Sciences and Engineering Technology Solutions, http://www.fluxus-engineering.com/ (for Network 4.1 software).
| Supplementary Material |
|---|
Complete coding-region sequence information for the 63 individual samples has been submitted to GenBank under accession nos. DQ523619-DQ523681. The supplementary table is available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
| Acknowledgements |
|---|
We thank Giuseppe Ledda and Teresa Manias for genealogical analysis. We are very grateful to the populations of Talana, Perdasdefogu, and Urzulei for their collaboration and to Parco Genetico dell'Ogliastra and the municipal administrations for their economic and logistic support. We acknowledge intramural funding from Shardna Life Sciences and support by a grant from the Italian Ministry of Education, University and Research (n. 5571/DSPAR/2002).
| Footnotes |
|---|
Sarah Tishkoff, Associate Editor
| References |
|---|
Achilli A, Rengo C, Battaglia V, et al. (13 co-authors). (2005) Saami and Berbers – an unexpected mitochondrial DNA link. Am J Hum Genet 76(5):883–6.
Achilli A, Rengo C, Magri C, et al. (21 co-authors). (2004) The molecular dissection of mtDNA haplogroup H confirms that the Franco-Cantabrian glacial refuge was a major source for the European gene pool. Am J Hum Genet 75(5):910–8.
Anderson S, Bankier AT, Barrell BG, et al. (14 co-authors). (1981) Sequence and organization of the human mitochondrial genome. Nature 290:457–65.
Andrews RM, Kubacka I, Chinnery PF, Lightowlers RN, Turnbull DM, Howell N. (1999) Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA. Nat Genet 23(2):147.
Angius A, Melis PM, Morelli L, Petretto E, Casu G, Maestrale GB, Fraumene C, Bebbere D, Forabosco P, Pirastu M. (2001) Archival, demographic and genetic studies define a Sardinian sub-isolate as a suitable model for mapping complex traits. Hum Genet 109:198–209.
Aris-Brosou S and Excoffier L. (1996) The impact of population expansion and mutation rate heterogeneity on DNA sequence polymorphism. Mol Biol Evol 13:494–504.
Bandelt HJ, Forster P, Rohl A. (1999) Median-joining networks for inferring intraspecific phylogenies. Mol Biol Evol 16:37–48.
Bandelt HJ, Forster P, Sykes BC, Richards MB. (1995) Mitochondrial portraits of human populations using median networks. Genetics 141:743–53.
Bandelt HJ, Quintana-Murci L, Salas A, Macaulay V. (2002) The fingerprint of phantom mutations in mitochondrial DNA data. Am J Hum Genet 71(5):1150–60.
Barbujani G, Bertorelle G, Capitani G, Scozzari R. (1995) Geographical structuring in the mtDNA of Italians. Proc Natl Acad Sci 92(20):9171–5.
Behar DM, Metspalu E, Kivisild T, et al. (20 co-authors). (2006) The matrilineal ancestry of Ashkenazi Jewry: portrait of a recent founder event. Am J Hum Genet 78(3):487–97.
Brown MD, Sun F, Wallace DC. (1997) Clustering of Caucasian Leber hereditary optic neuropathy patients containing the 11778 or 14484 mutations on an mtDNA lineage. Am J Hum Genet 60(2):381–7.
Chinnery PF, Taylor GA, Howell N, Andrews RM, Morris CM, Taylor RW, McKeith IG, Perry RH, Edwardson JA, Turnbull DM. (2000) Mitochondrial DNA haplogroups and susceptibility to AD and dementia with Lewy bodies. Neurology 55(2):302–4.
Ciulla TA, Sklar RM, Hauser SL. (1988) A simple method for DNA purification from peripheral blood. Anal Biochem 174(2):485–8.
Dennis C. (2003) Error reports threaten to unravel databases of mitochondrial DNA. Nature 421(6925):773–4.
Di Rienzo A and Wilson AC. (1991) Branching pattern in the evolutionary tree for human mitochondrial DNA. Proc Natl Acad Sci 88:1597–601.
Excoffier L and Schneider S. (1999) Why hunter-gatherer populations do not show signs of Pleistocene demographic expansions. Proc Natl Acad Sci 96:10597–602.
Excoffier L, Smouse PE, Quattro JM. (1992) Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 131:479–91.
Finnila S, Hassinen IE, Ala-Kokko L, Majamaa K. (2000) Phylogenetic network of the mtDNA haplogroup U in Northern Finland based on sequence analysis of the complete coding region by conformation-sensitive gel electrophoresis. Am J Hum Genet 66(3):1017–26.
Finnila S, Lehtonen MS, Majamaa K. (2001) Phylogenetic network for European mtDNA. Am J Hum Genet 68(6):1475–84.
Finnila S and Majamaa K. (2001) Phylogenetic analysis of mtDNA haplogroup TJ in a Finnish population. J Hum Genet 46:64–9.
Forster P. (2003) To err is human. Ann Hum Genet 67(Pt 1):2–4.
Fraumene C, Petretto E, Angius A, Pirastu M. (2003) Striking differentiation of sub-populations within a genetically homogeneous isolate (Ogliastra) in Sardinia as revealed by mtDNA analysis. Hum Genet 114(1):1–10.
Fu Y-X. (1997) Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–25.
Fu Y-X and Li W-H. (1993) Statistical tests of neutrality of mutations. Genetics 133:693–709.
Handt O, Meyer S, von Haeseler A. (1998) Compilation of human mtDNA control region sequences. Nucleic Acids Res 26(1):126–9.
Helgason A, Sigurðardóttir S, Nicholson J, Sykes B, Hill EW, Bradley DG, Bosnes V, Gulcher JR, Ward R, Stefansson K. (2000) Estimating Scandinavian and Gaelic ancestry in the male settlers of Iceland. Am J Hum Genet 67:697–717.
Herrnstadt C, Elson JL, Fahy E, et al. (11 co-authors). (2002) Reduced-median-network analysis of complete mitochondrial DNA coding-region sequences for the major African, Asian, and European haplogroups. Am J Hum Genet 70(5):1152–71.
Herrnstadt C and Howell N. (2004) An evolutionary perspective on pathogenic mtDNA mutations: haplogroup associations of clinical disorders. Mitochondrion 4(5–6):791–8.
Howell N, Elson JL, Turnbull DM, Herrnstadt C. (2004) African Haplogroup L mtDNA sequences show violations of clock-like evolution. Mol Biol Evol 21(10):1843–54.
Ingman M, Kaessmann H, Paabo S, Gyllensten U. (2000) Mitochondrial genome variation and the origin of modern humans. Nature 408(6813):708–13.
Kalman B, Li S, Chatterjee D, O'Connor J, Voehl MR, Brown MD, Alder H. (1999) Large scale screening of the mitochondrial DNA reveals no pathogenic mutations but a haplotype associated with multiple sclerosis in Caucasians. Acta Neurol Scand 99(1):16–25.
Kivisild T, Shen P, Wall D, et al. (17 co-authors). (2006) The role of selection in the evolution of human mitochondrial genomes. Genetics 172(1):373–87.
Larruga JM, Diez F, Pinto FM, Flores C, Gonzalez AM. (2001) Mitochondrial DNA characterisation of European isolates: the Maragatos from Spain. Eur J Hum Genet 9:708–16.
Loogvali EL, Roostalu U, Malyarchuk BA, et al. (35 co-authors). (2004) Disuniting uniformity: a pied cladistic canvas of mtDNA haplogroup H in Eurasia. Mol Biol Evol 21(11):2012–21.
Maca-Meyer N, Gonzalez AM, Larruga JM, Flores C, Cabrera VM. (2001) Major genomic mitochondrial lineages delineate early human expansions. BMC Genet 2(1):13.
Maca-Meyer N, Gonzalez AM, Pestano J, Flores C, Larruga JM, Cabrera VM. (2003) Mitochondrial DNA transit between West Asia and North Africa inferred from U6 phylogeography. BMC Genet 4:15.
Macaulay V, Richards M, Hickey E, Vega E, Cruciani F, Guida V, Scozzari R, Bonne-Tamir B, Sykes B, Torroni A. (1999) The emerging tree of West Eurasian mtDNAs: a synthesis of control-region sequences and RFLPs. Am J Hum Genet 64:232–49.
Mancosu G, Cosso M, Marras F, Borlino CC, Ledda G, Manias T, Adamo M, Serra D, Melis P, Pirastu M. (2005) Browsing isolated population data. BMC Bioinformatics 6:Suppl 4, S17.
Mancosu G, Ledda G, Melis PM. (2003) PedNavigator: a pedigree drawing servlet for large and inbred populations. Bioinformatics 19:669–70.
Mancuso M, Conforti FL, Rocchi A, et al. (26 co-authors). (2004) Could mitochondrial haplogroups play a role in sporadic amyotrophic lateral sclerosis? Neurosci Lett 371(2–3):158–62.
Meinila M, Finnila S, Majamaa K. (2001) Evidence for mtDNA admixture between the Finns and the Saami. Hum Hered 52:160–70.
Merriwether DA, Clark AG, Ballinger SW, Schurr TG, Soodyall H, Jenkins T, Sherry ST, Wallace DC. (1991) The structure of human mitochondrial DNA variation. J Mol Evol 33(6):543–55.
Moilanen JS, Finnila S, Majamaa K. (2003) Lineage-specific selection in human mtDNA: lack of polymorphisms in a segment of MTND5 gene in haplogroup J. Mol Biol Evol 20(12):2132–42.
Morelli L, Grosso MG, Vona G, Varesi L, Torroni A, Francalacci P. (2000) Frequency distribution of mitochondrial DNA haplogroups in Corsica and Sardinia. Hum Biol 72:585–95.
Niemi AK, Hervonen A, Hurme M, Karhunen PJ, Jylha M, Majamaa K. (2003) Mitochondrial DNA polymorphisms associated with longevity in a Finnish population. Hum Genet 112(1):29–33.
Niemi AK, Moilanen JS, Tanaka M, Hervonen A, Hurme M, Lehtimaki T, Arai Y, Hirose N, Majamaa K. (2005) A combination of three common inherited mitochondrial DNA polymorphisms promotes longevity in Finnish and Japanese subjects. Eur J Hum Genet 13(2):166–70.
Palanichamy MG, Sun C, Agrawal S, Bandelt HJ, Kong QP, Khan F, Wang CY, Chaudhuri TK, Palla V, Zhang YP. (2004) Phylogeny of mitochondrial DNA macrohaplogroup N in India, based on complete sequencing: implications for the peopling of South Asia. Am J Hum Genet 75(6):966–78.
Piazza A, Cappello N, Olivetti E, Rendine S. (1988) A genetic history of Italy. Ann Hum Genet 52:203–13.
Pyle A, Foltynie T, Tiangyou W, et al. (12 co-authors). (2005) Mitochondrial DNA haplogroup cluster UKJT reduces the risk of PD. Ann Neurol 57(4):564–7.
Rajkumar R, Banerjee J, Gunturi HB, Trivedi R, Kashyap VK. (2005) Phylogeny and antiquity of M macrohaplogroup inferred from complete mtDNA sequence of Indian specific lineages. BMC Evol Biol 5(1):26.
Reidla M, Kivisild T, Metspalu E, et al. (43 co-authors). (2003) Origin and diffusion of mtDNA haplogroup X. Am J Hum Genet 73(5):1178–90.
Richards M and Macaulay V. (2001) The mitochondrial gene tree comes of age. Am J Hum Genet 68(6):1315–20.
Richards MB, Macaulay VA, Bandelt HJ, Sykes BC. (1998) Phylogeography of mitochondrial DNA in western Europe. Ann Hum Genet 62(Pt 3):241–60.
Rieder MJ, Taylor SL, Tobe VO, Nickerson DA. (1998) Automating the identification of DNA variations using quality-based fluorescence re-sequencing: analysis of the human mitochondrial genome. Nucleic Acids Res 26(4):967–73.
Rogers AR and Harpending H. (1992) Population growth makes waves in the distribution of pairwise genetic differences. Mol Biol Evol 9:552–69.
Rose G, Passarino G, Carrieri G, Altomare K, Greco V, Bertolini S, Bonafe M, Franceschi C, De Benedictis G. (2001) Paradoxes in longevity: sequence analysis of mtDNA haplogroup J in centenarians. Eur J Hum Genet 9(9):701–7.
Rozas J, Sanchez-DelBarrio JC, Messeguer X, Rozas R. (2003) DnaSP, DNA polymorphism analysis by the coalescent and other methods. Bioinformatics 19(18):2496–7.
Ruiz-Pesini E, Lapena AC, Diez-Sanchez C, et al. (11 co-authors). (2000) Human mtDNA haplogroups associated with high or reduced spermatozoa motility. Am J Hum Genet 67(3):682–96.
Schneider S, Roessli D, Excoffier L. (2000) Arlequin ver. 2.000: a software for population genetics data analysis. (Genetics and Biometry Laboratory, University of Geneva, Switzerland) Available from: http://lgb.unige.ch/arlequin/.
Tagliabracci A, Turchi C, Buscemi L, Sassaroli C. (2001) Polymorphism of the mitochondrial DNA control region in Italians. Int J Leg Med 114(4–5):224–8.
Tajima F. (1989) Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–95.
Tolk HV, Pericic M, Barac L, Klaric IM, Janicijevic B, Rudan I, Parik J, Villems R, Rudan P. (2000) MtDNA haplogroups in the populations of Croatian Adriatic Islands. Coll Antropol 24:267–80.
Torroni A, Bandelt HJ, Macaulay V, et al. (33 co-authors). (2001) A signal, from human mtDNA, of postglacial recolonization in Europe. Am J Hum Genet 69:844–52.
Torroni A, Huoponen K, Francalacci P, Petrozzi M, Morelli L, Scozzari R, Obinu D, Savontaus ML, Wallace DC. (1996) Classification of European mtDNAs from an analysis of three European populations. Genetics 144:1835–50.
Verginelli F, Donati F, Coia V, Boschi I, Palmirotta R, Battista P, Mariani Costantini R, Destro-Bisol G. (2003) Variation of the hypervariable region-1 of mitochondrial DNA in central-eastern Italy. J Forensic Sci 48(2):443–4.
Workman PL, Lucarelli P, Agostino R, Scarabino R, Scacchi R, Carapella E, Palmarino R, Bottini E. (1975) Genetic differentiation among Sardinian villages. Am J Phys Anthropol 43:165–76.

