MBE Advance Access originally published online on March 8, 2007
Molecular Biology and Evolution 2007 24(5):1097-1100; doi:10.1093/molbev/msm051

© The Author 2007. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org

Letters

Is There Evidence for Convergent Evolution around Human Microsatellites?

Matthew T. Webster*,1 and Jonas Hagberg†

* Smurfit Institute of Genetics, University of Dublin, Trinity College, Dublin, Ireland
† Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden

E-mail: matthew.webster@ebc.uu.se.


    Abstract
 
A study by Vowles and Amos (2004) identified atypical patterns of base composition around human microsatellites and argued that microsatellites generate mutational biases in their flanking regions. Here, we perform simulations of molecular evolution using a simple model, and these suggest that similar patterns can be produced without any such biases in genome evolution.

Key Words: microsatellite • convergent evolution • simulation • genome evolution • mutation bias

Vowles and Amos (2004) suggested that microsatellites generate biases in the rate and spectrum of mutations in their flanking sequences. If true, this has important ramifications because it implies that a large fraction (>30%) of the human genome is subject to a previously unrecognized mode of sequence evolution. Vowles and Amos searched the human genome for instances of (AC)5 dinucleotide repeats that were at least 100 bp from the nearest (AC)2 repeat and examined the 100 bp flanking each sequence for common characteristics. We reproduced this analysis using all (AC)5 repeats in the human genome (NCBI 35). A summary of the patterns observed in flanking regions is presented in figure 1A (7,856 repeats in total). There is a pronounced periodicity in the frequency of bases, which decays with distance from the repeat. The major conclusion of Vowles and Amos was that these patterns are due to a mutagenic effect of the microsatellite sequence on its flanking regions, implying widespread biases in point mutation patterns in the human genome.
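The search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used by either study: the function name `find_isolated_repeats` is hypothetical, only one strand is scanned, and an (AC)5 match is treated as "isolated" when it is not part of a longer AC tract and neither 100 bp flank contains an (AC)2 motif.

```python
import re

REPEAT = "AC" * 5   # the (AC)5 search pattern
EXCLUDE = "AC" * 2  # flanks containing (AC)2 are rejected
FLANK = 100         # flank length in bp

def find_isolated_repeats(seq, repeat=REPEAT, exclude=EXCLUDE, flank=FLANK):
    """Return (start, left_flank, right_flank) for each isolated repeat.

    Assumption: a hit must have a full flank on both sides, must not be
    embedded in a longer AC tract, and must have no (AC)2 in either flank.
    """
    hits = []
    # A lookahead lets us see overlapping matches inside longer tracts.
    for match in re.finditer(f"(?={re.escape(repeat)})", seq):
        start = match.start()
        end = start + len(repeat)
        if start < flank or end + flank > len(seq):
            continue  # not enough sequence for a full flank
        left = seq[start - flank:start]
        right = seq[end:end + flank]
        if left.endswith("AC") or right.startswith("AC"):
            continue  # the match is part of a longer AC tract
        if exclude in left or exclude in right:
            continue  # an (AC)2 motif lies within the flank
        hits.append((start, left, right))
    return hits
```

Calling `find_isolated_repeats(chromosome_sequence)` on an uppercase DNA string would return the flanks that feed into the base-frequency summaries.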


FIG. 1. (A) Summary of the periodic patterns in regions flanking (AC)5 microsatellites in the human genome. (B) Summary of patterns in regions flanking (AC)5 microsatellites produced by simulations of molecular evolution. In both cases, the data were analyzed in an identical fashion to Vowles and Amos (2004). The (AC)5 microsatellite and 2 flanking bases are centrally placed at position 0.

 
We wished to test an alternative possibility: that periodicity in base frequencies occurs around perfect repeats because their flanking regions are frequently also derived from tandem repeats. Sequence variation at human microsatellites is often complex. Many loci contain interruptions or are composed of more than one repeat motif (Bull et al. 1999). It is therefore possible that (AC)5 repeats regularly occur within compound microsatellites, which are also subject to decay by the accumulation of point mutations. To test this hypothesis, we performed forward simulations of molecular evolution, starting with a sequence that did not contain any periodicity in base frequencies in the regions flanking microsatellites.

We generated an artificial ancestral sequence, which consisted of a large number of dinucleotide repeats of 25 units separated by 100 bp of random sequence. The motifs of each dinucleotide repeat array were chosen to match the known repeat composition of the human genome (reported in Katti et al. 2001). The values used were 26% AT/TA, 20% AG/GA/CT/TC, and 54% TG/GT/AC/CA, with a negligible contribution from CG/GC. The composition of the random intervening sequence was chosen to match the known dinucleotide base composition of the human genome. This length of random sequence was included to ensure that the flanking regions of each microsatellite were completely free of tandem repeats at the start of each simulation. We estimated the average pattern of nucleotide substitution in the human genome by inferring the relative frequency of each of the 12 possible single-base changes by parsimony in nongenic, nonrepetitive sequence alignments of human and chimpanzee with baboon as an outgroup (taken from Smith et al. 2002). These revealed a transition bias of 3.6 and a bias in mutations from G:C to A:T of 1.3. The microsatellite slippage mutation rate was set at 1,000 times the average point mutation rate.
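A minimal sketch of how such an ancestral sequence might be assembled is given below. The motif class weights are the values quoted above; everything else is an assumption for illustration. In particular, the function names are hypothetical, motifs within a class are assumed to be equally likely, and the spacer is drawn with uniform base frequencies because the human dinucleotide composition used in the study is not given in the text.

```python
import random

# Motif classes and weights as quoted in the text (CG/GC treated as negligible).
MOTIF_CLASSES = [
    (0.26, ["AT", "TA"]),
    (0.20, ["AG", "GA", "CT", "TC"]),
    (0.54, ["TG", "GT", "AC", "CA"]),
]

def random_spacer(rng, length=100):
    """Random intervening sequence.

    Assumption: uniform base frequencies stand in for the human
    dinucleotide composition, which is not reproduced here.
    """
    return "".join(rng.choice("ACGT") for _ in range(length))

def build_ancestral(n_arrays, rng, units=25, spacer_len=100):
    """Concatenate spacer, array, spacer, array, ..., spacer."""
    weights = [weight for weight, _ in MOTIF_CLASSES]
    parts = [random_spacer(rng, spacer_len)]
    for _ in range(n_arrays):
        # Pick a motif class by weight, then a motif uniformly within it.
        _, motifs = rng.choices(MOTIF_CLASSES, weights=weights, k=1)[0]
        motif = rng.choice(motifs)
        parts.append(motif * units)
        parts.append(random_spacer(rng, spacer_len))
    return "".join(parts)

ancestral = build_ancestral(n_arrays=10, rng=random.Random(0))
```

Passing an explicit `random.Random` instance keeps runs reproducible, which is convenient when comparing simulation settings.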

In each cycle of the simulation, we applied a round of slippage mutations with an equal chance of expansions or contractions. The probability that a microsatellite expanded or contracted by n units was [formula], which is similar to a geometric distribution. Expansions were performed by randomly choosing a dinucleotide within an array and repeating it the appropriate number of times. Contractions were performed by randomly deleting a tract of the appropriate number of bases within an array. Our assumption of a high slippage rate coupled with a strong bias toward expansions or contractions of a single-repeat unit is compatible with commonly accepted models of the microsatellite mutation process (Ellegren 2004). We also applied a round of point mutations using the probabilities calculated from the primate alignments. We ran each simulation until each array had accumulated an average of 50 slippage mutations. Assuming a human–chimpanzee split of 5 Myr, this corresponds to an accumulation of point mutations equivalent to ~43 Myr. We then searched the sequence for (AC)5 repeats with no (AC)2 repeats in their flanking sequence using the same procedure as for the human genome, retaining an identical sample size (n = 7,856) for further analysis.
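The mutation cycle can be illustrated with the following sketch, which evolves a single array. It rests on several assumptions that are not taken from the study: the slippage size is drawn from a plain geometric distribution with a hypothetical parameter `p = 0.8`, standing in for the exact distribution used; the transition bias of 3.6 is read as a transition to transversion ratio, so a mutation is a transition with probability 3.6/4.6; the G:C to A:T bias is omitted; one slippage event is applied per cycle; and the per-site point mutation probability per cycle is 1/1,000.

```python
import random

TRANSITION = {"A": "G", "G": "A", "C": "T", "T": "C"}
TRANSVERSIONS = {"A": "CT", "G": "CT", "C": "AG", "T": "AG"}

def slippage_size(rng, p=0.8):
    """Number of repeat units gained or lost (assumed geometric, n >= 1)."""
    n = 1
    while rng.random() > p:
        n += 1
    return n

def slip(array, rng, p=0.8):
    """Apply one slippage event, expansion or contraction with equal chance."""
    n = slippage_size(rng, p)
    if rng.random() < 0.5:
        # Expansion: pick a dinucleotide in the array and repeat it n times.
        i = rng.randrange(0, len(array) - 1)
        unit = array[i:i + 2]
        return array[:i + 2] + unit * n + array[i + 2:]
    # Contraction: delete a tract of 2n bases, if the array is long enough.
    cut = 2 * n
    if cut >= len(array):
        return array
    i = rng.randrange(0, len(array) - cut + 1)
    return array[:i] + array[i + cut:]

def point_mutate(seq, rate, rng, ts_prob=3.6 / 4.6):
    """Mutate each site with probability `rate` (assumed ts/tv model)."""
    out = []
    for base in seq:
        if rng.random() < rate:
            if rng.random() < ts_prob:
                base = TRANSITION[base]
            else:
                base = rng.choice(TRANSVERSIONS[base])
        out.append(base)
    return "".join(out)

def evolve_array(array, rng, n_slips=50, slip_to_point_ratio=1000):
    """Alternate slippage and point mutation for `n_slips` cycles."""
    for _ in range(n_slips):
        array = slip(array, rng)
        array = point_mutate(array, 1.0 / slip_to_point_ratio, rng)
    return array

evolved = evolve_array("AC" * 25, random.Random(1))
```

A fuller simulation would also mutate the spacer sequence and track array boundaries as lengths change; this sketch leaves both out to keep the two mutation steps easy to read.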

The patterns of base frequencies found in the flanking regions of (AC)5 repeats generated by simulation (fig. 1B) are similar to those observed in the human genome, and both exhibit periodicities of the same phase. These results demonstrate that periodic patterns in flanking regions similar to those presented by Vowles and Amos can be generated by the accumulation of neutral mutations in microsatellites. In order to exclude the possibility that flanking sequences contain remnants of ancestral microsatellites, Vowles and Amos excluded those containing (AC)2 motifs from their analysis. Our results indicate that this method is inadequate. It is likely that the flanking regions of many microsatellites contain remnants of repetitive sequences that have decayed to such an extent that they are now impossible to detect by searching for particular motifs.
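The base-frequency patterns discussed here are per-position summaries across many aligned flanks. A short sketch of that tabulation follows; the function name `base_frequencies` is hypothetical, and the input is assumed to be a list of equal-length flank strings aligned so that the same index is the same distance from the repeat.

```python
from collections import Counter

def base_frequencies(flanks):
    """Return, for each flank position, the frequency of A, C, G, and T."""
    length = len(flanks[0])
    freqs = []
    for pos in range(length):
        counts = Counter(flank[pos] for flank in flanks)
        total = sum(counts.values())
        freqs.append({base: counts.get(base, 0) / total for base in "ACGT"})
    return freqs

# Toy input: four 4-bp flanks.
profile = base_frequencies(["ACGT", "ACGA", "TCGA", "ACTA"])
```

A periodic signal would appear as base frequencies that alternate between odd and even positions and fade with distance from the repeat.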

Vowles and Amos also reported that the strength and pattern of base periodicity depends on the 2 bases immediately flanking the (AC)5 repeat (the cassette) with some patterns exhibiting 5' to 3' asymmetry according to cassette type. Figure 2A shows the patterns observed in our reanalysis of the human genome divided by cassette. Nonrandom patterns of base frequencies are mainly restricted to cassettes with a 5' T or 3' A. Similar patterns can be observed in the simulated data (fig. 2B). However, in general, the periodicity is stronger in the simulations, and some of the cassettes (notably C/T) exhibit patterning that is not seen in the real data. In the simulations, cassette T/A has a strong and symmetrical periodicity in base frequencies. All other cassettes with a 5' T have stronger patterning in the 5' than 3' flanking sequence, whereas all other cassettes with a 3' A have stronger patterning in the 3' than 5' flanking sequence. These cassettes also show the strongest base periodicity and similar asymmetries in the real data, indicating that the decay of microsatellites by accumulation of single-base substitutions and slippage mutations could be an important process in generating these observed patterns. The cause of the asymmetric patterns in both the real and simulated data is unclear, but one possibility is that certain cassettes are more likely to be located at the 5' or 3' end of ancestral repeat tracts, which weakens the periodicity on one side of the repeat motif.
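Splitting the flanks by cassette amounts to grouping on the base immediately 5' and the base immediately 3' of the repeat. The sketch below assumes each hit is a `(left_flank, right_flank)` pair of strings and labels groups in the 5'/3' style used above (for example T/A); the function name `group_by_cassette` is hypothetical.

```python
from collections import defaultdict

def group_by_cassette(hits):
    """Group (left_flank, right_flank) pairs by their cassette label."""
    groups = defaultdict(list)
    for left, right in hits:
        # Last base of the 5' flank and first base of the 3' flank.
        cassette = f"{left[-1]}/{right[0]}"
        groups[cassette].append((left, right))
    return dict(groups)

groups = group_by_cassette([("GGT", "ACC"), ("GGT", "AGG"), ("GGC", "TGG")])
```

Each group can then be summarized separately, giving one frequency profile per cassette.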


FIG. 2. (A) Flanking sequence frequency distributions for all 16 possible cassettes flanking (AC)5 microsatellites in the human genome. (B) Flanking sequence frequency distributions for all 16 possible cassettes flanking (AC)5 microsatellites generated by simulation as described in the text. In both cases, the (AC)5 microsatellite and 2 flanking bases are centrally placed at position 0.

 
In order to produce the desired sample size of 7,856, we needed to search 1,353,798 repeats, indicating that about 0.6% of the ancestral arrays generate (AC)5 repeats that fit the criteria under our simulation conditions. More than 99% of the (AC)5 repeats in the sequence are derived from the ancestral arrays rather than flanking sequence, indicating that the microsatellite mutation processes are the primary way of generating these motifs in our simulations. In the human genome, (AC)5 repeats must be produced by a variety of processes other than the decay we have simulated here. However, these numbers indicate that it is plausible for a subset to be derived from a process similar to the one we have simulated. As the periodicities generated by our simulations are much stronger than those observed in the human genome, only a fraction of (AC)5 repeats would need to be formed in this way in order to generate the observed periodicities.
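The 0.6% figure follows directly from the two counts given above, as this short check shows (variable names are for illustration only):

```python
sample_size = 7_856          # (AC)5 repeats retained for analysis
arrays_searched = 1_353_798  # ancestral arrays searched in the simulation

fraction = sample_size / arrays_searched
print(f"{fraction:.2%}")  # about 0.58%, which rounds to 0.6%
```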

We also examined the effect of modifying the length of the search pattern to all lengths between (AC)2 and (AC)15. We observed patterns similar to those presented by Vowles and Amos: the strength of base periodicities increases with array length for shorter arrays and then declines at longer array lengths. In our simulations, the peak in periodicities occurs around (AC)8 (see Supplementary Material online). This is because shorter AC arrays have a greater chance of appearing outside of ancestral arrays, whereas at longer lengths the AC tracts are likely to occupy a greater proportion of the full length of the microsatellite, so that the patterning extends for a shorter distance.

Microsatellites evolve by a complex interaction between point mutations and tandem repeat length mutations. Many models of microsatellite evolution have been proposed, and the interaction between these processes is poorly understood (Ellegren 2004). We performed simulations where a sequence containing long perfect repeats was subjected to rounds of length mutations and single-base changes. In practice, long perfect repeats are very rare, and our model is only an approximation of the genesis and evolution of microsatellites. However, although not exactly the same, our simulations generated many similar base periodicities and biases to those observed in the human genome. This suggests that a subset of (AC)5 motifs in the human genome are associated with compound microsatellites derived from a comparable process. A reconstruction of the origin of all (AC)5 motifs and their flanking sequences in the human genome would be impossible. However, the simulations presented here demonstrate that periodicity in base frequencies around microsatellites, such as those reported by Vowles and Amos, can be generated without regional biases in the spectrum of mutations and that evidence for convergent evolution is currently lacking.


    Supplementary Material
 
Supplementary material is available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).


    Acknowledgements
 
This study was funded by Science Foundation Ireland and the Swedish Research Council. We thank Ken Wolfe, Hans Ellegren, Marie Sémon, Meg Woolfit, Devin Scannell, and Gavin Conant for useful comments.


    Footnotes
 
1 Present address: Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden.

Dan Graur, Associate Editor


    References
 

    Bull LN, Pabon-Pena CR, Freimer NB. Compound microsatellite repeats: practical and theoretical features. Genome Res (1999) 9:830–838.

    Ellegren H. Microsatellites: simple sequences with complex evolution. Nat Rev Genet (2004) 5:435–445.

    Katti MV, Ranjekar PK, Gupta VS. Differential distribution of simple sequence repeats in eukaryotic genome sequences. Mol Biol Evol (2001) 18:1161–1167.

    Smith NG, Webster MT, Ellegren H. Deterministic mutation rate variation in the human genome. Genome Res (2002) 12:1350–1356.

    Vowles EJ, Amos W. Evidence for widespread convergent evolution around human microsatellites. PLoS Biol (2004) 2:E199.

Accepted for publication February 28, 2007.

