MBE Advance Access originally published online on January 12, 2008
Molecular Biology and Evolution 2008 25(4):688-695; doi:10.1093/molbev/msn008
Research Articles
Simulating DNA Coding Sequence Evolution with EvolveAGene 3
Bellingham Research Institute, Bellingham, WA
E-mail: drbh{at}mail.rochester.edu.
Accepted for publication January 5, 2008.
Phylogenetic reconstruction based upon multiple alignments of molecular sequences is important to most branches of modern biology and is central to molecular evolution. Understanding the historical relationships among macromolecules depends upon computer programs that implement a variety of analytical methods. Because it is impossible to know those historical relationships with certainty, assessment of the accuracy of methods and the programs that implement them requires the use of programs that realistically simulate the evolution of DNA sequences. EvolveAGene 3 is a realistic coding sequence simulation program that separates mutation from selection and allows the user to set selection conditions, including variable regions of selection intensity within the sequence and variation in intensity of selection over branches. Variation includes base substitutions, insertions, and deletions. To the best of my knowledge, it is the only program available that simulates the evolution of intact coding sequences. Output includes the true tree and true alignments of the resulting coding sequence and corresponding protein sequences. A log file reports the frequencies of each kind of base substitution, the ratio of transition to transversion substitutions, the ratio of indel to base substitution mutations, and the numbers of silent and amino acid replacement mutations. The realism of the data sets has been assessed by comparing the dN/dS ratio, the ratio of transition to transversion substitutions, and the ratio of indel to base substitution mutations of the simulated data sets with those parameters of real data sets from the "gold standard" BaliBase collection of structural alignments. Results show that the data sets produced by EvolveAGene 3 are very similar to real data sets, and EvolveAGene 3 is therefore a realistic simulation program that can be used to evaluate a variety of programs and methods in molecular evolution.
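As an illustration of the summary statistics named in the abstract, the following is a minimal sketch of how the ratio of transition to transversion substitutions and the ratio of indel to base substitution events might be computed from one aligned ancestor/descendant pair. It is not EvolveAGene 3's own code or log format: the function name `summarize_pair`, the dictionary keys, and the convention that a run of consecutive gap columns counts as a single indel event are all assumptions made for this example. Sequences are assumed to contain only A, C, G, T, and "-".

```python
# Hypothetical helper, not part of EvolveAGene 3.
# Purine<->purine and pyrimidine<->pyrimidine changes are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def summarize_pair(ancestor: str, descendant: str) -> dict:
    """Count substitutions and indel events between two aligned sequences."""
    if len(ancestor) != len(descendant):
        raise ValueError("aligned sequences must have equal length")

    transitions = transversions = indels = 0
    gap_state = None  # None, "ins" (gap in ancestor) or "del" (gap in descendant)

    for a, d in zip(ancestor.upper(), descendant.upper()):
        if a == "-" and d == "-":
            continue  # all-gap column carries no information for this pair
        if a == "-" or d == "-":
            state = "ins" if a == "-" else "del"
            if state != gap_state:
                indels += 1  # assumption: a contiguous gap run is one event
            gap_state = state
            continue
        gap_state = None
        if a == d:
            continue
        if (a, d) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1

    substitutions = transitions + transversions
    return {
        "transitions": transitions,
        "transversions": transversions,
        "indels": indels,
        # Ratios are None when the denominator is zero.
        "ts_tv": transitions / transversions if transversions else None,
        "indel_per_substitution": indels / substitutions if substitutions else None,
    }


if __name__ == "__main__":
    print(summarize_pair("ATGACC---GTA", "ATGGCCTTTG-C"))
```

Comparing these ratios between simulated and real alignments is the kind of check the abstract describes; the dN/dS ratio would additionally require classifying each substitution as silent or amino acid replacing, which this sketch omits.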
Key Words: simulation, coding sequence evolution, indels
Sudhir Kumar, Associate Editor

