Molecular Biology and Evolution, Vol 2, 526-538, Copyright © 1985 by Society for Molecular Biology and Evolution
SF Altschul and BW Erickson
The similarity of two nucleotide sequences is often expressed in terms of
evolutionary distance, a measure of the amount of change needed to
transform one sequence into the other. Given two sequences with a small
distance between them, can their similarity be explained by their base
composition alone? The nucleotide order of these sequences contributes to
their similarity if the distance is much smaller than their average
permutation distance, which is obtained by calculating the distances for
many random permutations of these sequences. To determine whether their
similarity can be explained by their dinucleotide and codon usage, random
sequences must be chosen from the set of permuted sequences that preserve
dinucleotide and codon usage. The problem of choosing random dinucleotide-
and codon-preserving permutations can be expressed in the language of graph
theory as the problem of generating random Eulerian walks on a directed
multigraph. An efficient algorithm for generating such walks is described.
This algorithm can be used to choose random sequence permutations that
preserve (1) dinucleotide usage, (2) dinucleotide and trinucleotide usage,
or (3) dinucleotide and codon usage. For example, the similarity of two
60-nucleotide DNA segments from the human beta-1 interferon gene
(nucleotides 196-255 and 499-558) is not just the result of their nonrandom
dinucleotide and codon usage.
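The graph formulation above can be illustrated with a minimal Python sketch for case (1) only, dinucleotide usage. Each nucleotide is a vertex and each adjacent pair in the sequence is a directed edge, so a permutation that preserves dinucleotide counts corresponds to an Eulerian walk from the first nucleotide to the last. The sketch follows the commonly described "last edge" construction: pick a random final exit edge for every vertex other than the end vertex, accept the choice only if those edges all lead to the end vertex, then shuffle the remaining edges. The function names, the retry loop, and the `rng` parameter are assumptions made for illustration, not the authors' published code, and the trinucleotide and codon cases are not handled.

```python
import random
from collections import defaultdict


def _reaches(vertex, root, final_edge):
    """Follow the chosen final edges from vertex; True if the path ends at root."""
    seen = set()
    while vertex != root:
        if vertex in seen:  # a cycle that never reaches the root
            return False
        seen.add(vertex)
        vertex = final_edge[vertex]
    return True


def dinucleotide_shuffle(seq, rng=random):
    """Return a random permutation of seq with the same dinucleotide counts.

    Illustrative sketch (hypothetical helper, not the paper's code).
    The first and last symbols of seq are necessarily preserved.
    """
    if len(seq) < 3:
        return seq

    # successors[v] holds one entry per edge v -> w, i.e. per dinucleotide vw.
    successors = defaultdict(list)
    for a, b in zip(seq, seq[1:]):
        successors[a].append(b)

    last = seq[-1]
    others = [v for v in successors if v != last]

    # Choose a random final exit edge for each vertex other than the end
    # vertex; retry until those edges form a tree directed toward `last`.
    while True:
        final_edge = {v: rng.choice(successors[v]) for v in others}
        if all(_reaches(v, last, final_edge) for v in others):
            break

    # Shuffle every edge list, keeping the chosen final edge in last place.
    order = {}
    for v, outs in successors.items():
        outs = list(outs)
        if v != last:
            outs.remove(final_edge[v])
            rng.shuffle(outs)
            outs.append(final_edge[v])
        else:
            rng.shuffle(outs)
        order[v] = outs

    # Walk the multigraph from the first symbol, using each edge list in order.
    used = defaultdict(int)
    current = seq[0]
    result = [current]
    for _ in range(len(seq) - 1):
        nxt = order[current][used[current]]
        used[current] += 1
        result.append(nxt)
        current = nxt
    return "".join(result)
```

Calling `dinucleotide_shuffle(segment)` many times on each of two segments and recomputing their distance for every shuffled pair gives an estimate of the average permutation distance under preserved dinucleotide usage, against which the observed distance can be compared.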
ORIGINAL ARTICLE
Significance of nucleotide sequence alignments: a method for random sequence permutation that preserves dinucleotide and codon usage
Department of Applied Mathematics, Massachusetts Institute of Technology.