Molecular Biology and Evolution 18:2280-2288 (2001)
© 2001 Society for Molecular Biology and Evolution
Study of Intrachromosomal Duplications Among the Eukaryote Genomes
Structure et Dynamique des Génomes, Institut Jacques Monod, Paris, France
| Abstract |
|---|
Complete eukaryote chromosomes were investigated for intrachromosomal duplications of nucleotide sequences. The analysis was performed by looking for nonexact repeats in two complete genomes, Saccharomyces cerevisiae and Caenorhabditis elegans, and four partial ones, Drosophila melanogaster, Plasmodium falciparum, Arabidopsis thaliana, and Homo sapiens. Through this analysis, we show that all eukaryote chromosomes exhibit similar characteristics for their intrachromosomal repeats, suggesting similar dynamics: many direct repeats have their two copies physically close together, and these close direct repeats are more similar and shorter than the other repeats. In contrast, there are almost no close inverted repeats. These results support a model for the dynamics of duplication. This model is based on a continuous genesis of tandem repeats and implies that most of the distant and inverted repeats originate from these tandem repeats by further chromosomal rearrangements (insertions, inversions, and deletions). Remnants of these predicted rearrangements were identified through detailed analysis of the chromosome sequences. Despite these dynamics, shared by all eukaryotes, each genome exhibits its own style of intrachromosomal duplication: the density of repeated elements is similar in all chromosomes from the same genome, but differs between species. This density was further related to the relative rates of duplication, deletion, and mutation specific to each species. Notably, the density of repeats in the X chromosome of C. elegans is much lower than in the autosomes of that organism, suggesting that exchange between homologous chromosomes is important in the duplication process.
| Introduction |
|---|
All eukaryote genomes exhibit similar physical structures and constraints (i.e., linear chromosomes, scaffold attachment, nucleosome organization). However, many characteristics highlight important differences between them: (1) coding sequences represent 72% of the Saccharomyces cerevisiae genome (Goffeau et al. 1996
Within duplication events, four main subprocesses have been documented: abnormal segregation during cell division (leading to entire-chromosome[s] duplication, viz., hyperploidization, and sometimes to whole-genome doubling, viz., polyploidization), transposition (duplication of transposable elements), expansion of low-complexity sequences (microsatellites and minisatellites), and finally generic duplications of unspecific DNA regions within the same chromosome or between two chromosomes. We shall henceforth refer to this last subprocess as the iteration process. Polyploidization events were proposed to explain the large-scale duplications at the origin of vertebrates (Ohno 1970), in many angiosperms (Masterson 1994), even in Arabidopsis thaliana (Blanc et al. 2000), in the fish lineage (Amores et al. 1998), and in the yeast S. cerevisiae (Wolfe and Shields 1997). However, it is not clear if these large-scale duplications are always the result of polyploidization, successive hyperploidizations, or bursts of large iterations (Holland 1999; Llorente et al. 2000; Vision, Brown, and Tanksley 2000; Hughes, Da Silva, and Friedman 2001; Robinson-Rechavi et al. 2001).
In order to investigate the iteration process, we focused our attention on intrachromosomal repeats in the chromosome sequences. Two complete genomes, S. cerevisiae and Caenorhabditis elegans, and four partial ones, H. sapiens, Drosophila melanogaster, A. thaliana, and Plasmodium falciparum, were analyzed. It should be noted that the genome of S. cerevisiae was already investigated for its repeats in a previous study (Achaz et al. 2000) in which we proposed a model for the dynamics of the iteration process based on a continuous genesis of close direct repeats (CDR). A CDR is defined here as a repeat with its copies in the same orientation and with a physical distance between them (the spacer) smaller than 1 kb. The model supposes that most of the intrachromosomal repeats originate from these CDRs, the others being the result of further chromosomal rearrangements. In the present study, the model established in yeast was tested on additional eukaryote chromosomes. We focused on the differences between genomes and tried to connect them to the genome context. Because our model supposes that most of the intrachromosomal repeats originate from tandem repeats, the chromosome sequences were also searched for remnants of the chromosomal rearrangements. Hence, we view repeats as markers of genome dynamics.
| Materials and Methods |
|---|
Data
We analyzed the complete eukaryote genomes of S. cerevisiae, 16 chromosomes (Goffeau et al. 1996
Sequences of H. sapiens, C. elegans, P. falciparum, and A. thaliana were extracted from GenBank (ftp://ncbi.nlm.nih.gov/genbank/genomes). The S. cerevisiae chromosomes were extracted from Saccharomyces Genome Database (http://genome-www.stanford.edu/Saccharomyces). Sequences of D. melanogaster were downloaded from Celera database (http://www.celera.com).
It should be pointed out that most sequences contain many gaps (stretches of N). For example, in chromosome 1 of C. elegans, 8.8% of the base pairs are N, and 29 gaps are longer than 10 kb. These stretches were not taken into account during the construction of the repeats database.
Construction of the Repeats Database
Repeat detection, as in most of the heuristics already proposed (Leung et al. 1991; Vincens et al. 1998), generally proceeds by first looking for seeds (exact repeats) and then extending them with a local alignment program. The detailed methodology is described below through three main steps: searching, filtering, and extending.
First Step: Searching for Seeds
In this step, exact repeats (seeds) were detected by using the REPuter software (Kurtz and Schleiermacher 1999). This software detects all seeds (direct and inverted) in a given sequence, whatever the distance between the two copies on the chromosome. As we are interested in unusually large seeds, the minimum length of seeds (Lmin) was calculated using the statistics developed by Karlin and Ost (1985). For each chromosome, we chose Lmin such that the probability of finding a two-copy word of at least this length in a random sequence of the same size and nucleotide composition is 0.001. Typically, Lmin ranges from 21 for the smallest chromosome (chromosome 1 of S. cerevisiae) to 28 for the largest ones (chromosomes 21 and 22 of H. sapiens).
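To illustrate how a minimum seed length depends on chromosome size and base composition, here is a minimal sketch based on a first-order approximation: the expected number of exact two-copy words of length L in a random sequence is taken as the number of position pairs times the per-base match probability raised to the power L. This approximation, the function name `min_seed_length`, and the chromosome sizes used are assumptions made for illustration. It is not the Karlin and Ost (1985) statistic itself and need not reproduce the Lmin values reported above.

```python
import math

def min_seed_length(seq_len, base_freqs, alpha=0.001):
    """Smallest L for which the expected number of exact two-copy words of
    length L in a random sequence is at most alpha (first-order approximation,
    direct repeats only)."""
    # Probability that two independent random positions carry the same base.
    match_prob = sum(f * f for f in base_freqs.values())
    # Number of unordered pairs of positions where a repeat could start.
    n_pairs = seq_len * (seq_len - 1) / 2
    # Solve n_pairs * match_prob**L <= alpha for L.
    return math.ceil(math.log(n_pairs / alpha) / -math.log(match_prob))

# Illustrative inputs (assumed sizes, uniform base composition).
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
small = min_seed_length(230_000, uniform)      # a small, yeast-sized chromosome
large = min_seed_length(35_000_000, uniform)   # a large, human-sized chromosome
```

As in the text, the threshold grows slowly (logarithmically) with chromosome length, which is why the reported Lmin values span only a few bases between yeast and human chromosomes.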
Second Step: Filtering the Seeds
First, to remove all low-complexity seeds (e.g., microsatellites or poly-A stretches), we used an entropy filter based on dinucleotide composition (Achaz et al. 2000). Second, all multicopy seeds were removed. A chromosome map in which each position is linked to its n-plication degree (duplication, triplication, etc.) was established. To build this map, we counted for each chromosome position the number of times this position is found in seeds (direct and inverted seeds were pooled together). This map is used to estimate the degree of redundancy of chromosomes (i.e., the number of duplications, triplications, etc.). Table 1 presents, for each species, the mean size of the chromosomes, the percentage of chromosomes included in two-copy seeds, and the percentage of chromosomes represented by all the seeds. As we are only interested here in two-copy seeds, we used the map to remove all seeds in which one of the positions is included in a multicopy repeat.
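The map-based filter can be sketched as follows. The seed representation `(start_a, start_b, length)`, the function names, and the rule "a position covered by more than one seed copy belongs to a multicopy word" are assumptions made for this illustration; the actual implementation is not described in this section.

```python
from collections import Counter

def coverage_map(seeds):
    """Count, for each chromosome position, how many seed copies cover it.

    Each seed is (start_a, start_b, length): two exact copies of one word.
    """
    cover = Counter()
    for start_a, start_b, length in seeds:
        for start in (start_a, start_b):
            for pos in range(start, start + length):
                cover[pos] += 1
    return cover

def keep_two_copy_seeds(seeds):
    """Keep only seeds whose positions are all covered exactly once.

    Assumption: a position covered by more than one seed copy belongs to a
    word present in three or more copies.
    """
    cover = coverage_map(seeds)
    kept = []
    for start_a, start_b, length in seeds:
        positions = list(range(start_a, start_a + length))
        positions += list(range(start_b, start_b + length))
        if all(cover[p] == 1 for p in positions):
            kept.append((start_a, start_b, length))
    return kept

# Toy example: the first seed is a clean duplication; the last three seeds
# describe one word found at positions 100, 200 and 300 (a triplication).
seeds = [(0, 50, 10), (100, 200, 10), (100, 300, 10), (200, 300, 10)]
two_copy = keep_two_copy_seeds(seeds)
```

In the toy example only the first seed survives, because every position of the triplicated word is covered by two seed copies.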
Third Step: Extending the Seeds
Seeds were extended into larger nonstrict repeats by using a local alignment program (Smith and Waterman 1981
It should be mentioned that the methodology was similar to the one previously used in the S. cerevisiae analysis (Achaz et al. 2000), but was modified in order to analyze in the same way the chromosomes of yeast (<1.5 Mb) and man (35 Mb). The major modifications were applied to reduce the number of seeds and to keep only sensu stricto duplicated seeds (present only in two copies) for the alignment process.
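For readers unfamiliar with local alignment, the following is a minimal sketch of the Smith and Waterman (1981) recurrence with a linear gap penalty. The scoring values (`match`, `mismatch`, `gap`) and the function name are assumptions chosen for illustration; the parameters and program actually used to extend the seeds are not given in this section, and this sketch does not anchor the alignment on a seed.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, end_in_a, end_in_b) of the best local alignment.

    Only two rows of the dynamic programming matrix are kept in memory.
    """
    cols = len(b) + 1
    prev = [0] * cols
    best = (0, 0, 0)
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            # Score of aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            if curr[j] > best[0]:
                best = (curr[j], i, j)
        prev = curr
    return best

# Two 8-bp copies differing by a single substitution.
score, end_a, end_b = smith_waterman("ACGTACGT", "ACGAACGT")
```

A nonexact repeat corresponds to a high-scoring local alignment such as this one, in which the two copies tolerate a few mismatches or gaps.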
| Results and Discussion |
|---|
The application of the methodology described above yields for direct and inverted repeats, respectively: 110 and 75 for S. cerevisiae, 136 and 48 for P. falciparum, 2,407 and 1,068 for A. thaliana, 6,885 and 1,845 for C. elegans, 1,479 and 691 for D. melanogaster, and 3,457 and 2,406 for H. sapiens.
Genome Style and History of Chromosomes
In order to analyze the relationship between chromosome size and redundancy level, we measured two parameters DN and DL, defined as follows:
|
|
= 0.30, P < 0.05 for DN and
= 0.40, P < 0.01 for DL). These observations are in agreement with an analysis of gene redundancy undertaken on partial genome sequences (Coissac, Maillier, and Netter 1997
Two hypotheses can be proposed to explain the low densities of the chromosomal arms of D. melanogaster. The first one is a data bias: it should be noted that the analyzed sequences consist exclusively of euchromatin (only around two-thirds of the complete genome), and it is known that repeats are concentrated inside heterochromatin (Henikoff 2000). Moreover, assembly errors could lead to artificially deleted tandem repeats. The second hypothesis rests on biological grounds. One can imagine that the Drosophila genome has a special status in the duplication process (because there is no meiotic crossing-over in the male, the duplication process may be less active). Completion of the D. melanogaster sequence should resolve this question.
In order to investigate each chromosome more precisely, we analyzed DN and DL for direct and inverted repeats (fig. 1). It appears that DN is similar for all chromosomes within the same species, whereas DL is not. Thus, DN could define the style of redundancy of the genome. We assume that DN results from the iteration events combined with the loss of duplicated sequences, and therefore propose that DN is connected to the biological machinery of each species. Because the machinery is clearly different for each species, but similar for all chromosomes within the same genome, DN should be the consequence of each genome's dynamics. Furthermore, the differences between species come essentially from direct repeats, and less from inverted repeats. This suggests that the biological machinery is more connected to the creation and the loss of direct repeats than to the dynamics of inverted repeats.
The only two exceptions are the fourth chromosomal arm of D. melanogaster and the X chromosome of C. elegans. The high density of the small fourth chromosomal arm of D. melanogaster could be the result of its particular structure (if there is no data bias): it consists mostly of heterochromatin, but, contrary to centromeric chromatin (or the Y chromosome), it is partially visible in polytene chromosomes. In contrast, the X chromosome of C. elegans exhibits a lower DN than the other chromosomes of the worm. This observation is in good agreement with the unequal distribution of repetitive elements, such as CeRep23 (Barnes et al. 1995
Contrary to DN, DL could reflect the history of the chromosome rather than the effects of the cellular machinery: a single iteration event can lead to a high DL for direct or inverted repeats. For example, direct repeats of chromosome 1 of C. elegans exhibit a high DL and a normal DN (when compared with the values for the other C. elegans chromosomes). This particularity is mainly caused by two large duplicated sequences, one 250 kb long (with an identity of 98.7%) and the other 600 kb long (fractionated into several segments of high identity, often more than 99%). Furthermore, the inverted repeats of chromosome 1 of S. cerevisiae show a high DL and a normal DN, as a consequence of two internal regions inversely repeated in subtelomeres (Britten 1998).
A Model of Dynamics of Iteration
Our model of intrachromosomal iteration (Achaz et al. 2000) is based on a permanent genesis of CDRs. The CDRs are then subjected to a high level of exchange (conversion and deletion). This high exchange rate tends both to keep the two copies identical (conversion) and to eliminate them (deletion). At each round of exchange, both events are possible, but whereas conversion may still be followed by deletion, a deletion event cannot be followed by conversion.
Therefore, on a long timescale, a bias in favor of deletion should be observed. A CDR has to disappear sooner or later (depending on the relative rates of conversion and deletion). However, there are two situations where a repeat would be maintained: when it is protected from deletions by functional pressures (i.e., located inside a gene) or when the copies are spaced by further chromosomal rearrangements. This model was mainly based on three observations for CDR: (1) they are overrepresented, (2) they are mostly located inside the same gene, and (3) their length is positively correlated with the spacer (the physical distance between copies), and their identity is negatively correlated with it.
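The long-run bias toward deletion can be made explicit with a short worked calculation. It assumes, as a simplification that is not part of the model as stated, that each round of exchange ends independently in deletion with a constant probability $d$ and in conversion otherwise:

```latex
P(\text{CDR survives } n \text{ rounds}) = (1-d)^{n} \xrightarrow[n \to \infty]{} 0,
\qquad
E[\text{rounds until deletion}] = \sum_{n \ge 1} n\, d\, (1-d)^{n-1} = \frac{1}{d}.
```

Under this assumption any nonzero deletion probability eventually removes an unprotected CDR, and the ratio of conversion to deletion only sets how long the repeat persists, in line with "sooner or later" above.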
Through the present analysis, the model was tested with other eukaryotes. It should be mentioned that a model of tandem creation and further dispersion was already invoked for the families of two genes (Hox and NBG) in C. elegans (Ruvkun and Hobert 1998). Because the annotations of the eukaryote chromosomes are still partial, they were not taken into account. Thus, we did not analyze the relationship between repeat positions and gene locations.
CDRs Are Overrepresented
The distributions of spacer size for direct and inverted repeats (fig. 2) reveal that CDRs are overrepresented as compared with close inverted repeats. Moreover, in the previous study (Achaz et al. 2000), the repeats of S. cerevisiae were compared with the repeats obtained from random chromosomes. From this comparison, we showed that such close repeats (inverted or direct) are absent from random chromosomes. This strongly suggests that these CDRs are not the result of chance. The presence of many CDRs in all chromosomes is in good agreement with the model.
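The classification behind this comparison can be sketched as follows, using the 1-kb spacer threshold given in the definition of a CDR. The record layout (dictionary keys) and function names are hypothetical, and the three repeats are invented toy values, not data from this study.

```python
CDR_MAX_SPACER = 1000  # bp; the 1-kb limit used in the definition of a CDR

def spacer(repeat):
    """Distance between the end of the first copy and the start of the second."""
    return repeat["start_b"] - (repeat["start_a"] + repeat["length_a"])

def is_cdr(repeat):
    """Close direct repeat: same orientation and spacer smaller than 1 kb."""
    return repeat["orientation"] == "direct" and spacer(repeat) < CDR_MAX_SPACER

def close_fraction(repeats, orientation):
    """Fraction of repeats of one orientation whose spacer is below 1 kb."""
    subset = [r for r in repeats if r["orientation"] == orientation]
    if not subset:
        return 0.0
    return sum(spacer(r) < CDR_MAX_SPACER for r in subset) / len(subset)

# Invented toy records (positions in bp).
repeats = [
    {"orientation": "direct", "start_a": 100, "start_b": 400, "length_a": 50},
    {"orientation": "direct", "start_a": 1000, "start_b": 90000, "length_a": 200},
    {"orientation": "inverted", "start_a": 500, "start_b": 60000, "length_a": 80},
]
direct_close = close_fraction(repeats, "direct")
inverted_close = close_fraction(repeats, "inverted")
```

Comparing `close_fraction` for the two orientations is the kind of contrast that figure 2 summarizes over whole chromosomes.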
However, the distribution of spacer lengths indicates the existence of many direct repeats with a spacer between 1 and 10 kb in A. thaliana chromosomes (they represent more than one-third of all direct repeats). We looked for a plausible explanation for this overrepresentation in A. thaliana (as compared with other species), with particular attention to the sequence located between the two copies (the spacer). Several hypotheses can be envisaged and rejected: (1) repeats are not the edges of transposons because the spacers are not paralogous, (2) the hypothesis of Campbell-like insertions of exogenous DNA, as it was proposed for B. subtilis (Rocha, Danchin, and Viari 1999a
In conclusion, we have not yet found any plausible hypothesis explaining why these repeats are overrepresented.
CDRs Are Identical and Short
We started by characterizing CDRs in terms of the distribution of their identity (fig. 3). Except for S. cerevisiae, CDRs have their two copies more identical than distant direct repeats (P < 10⁻⁴, Mann-Whitney rank test).
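The comparison of close and distant repeats relies on the Mann-Whitney rank test. Below is a minimal pure-Python sketch of that test using the normal approximation without tie correction (a simplification that is poor for very small samples). The function name is hypothetical and the identity values are invented for illustration; they are not data from this study.

```python
import math

def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value by normal approximation)."""
    # U counts the (x, y) pairs in which the x value is larger; ties count 0.5.
    u = sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0
            for xi in x for yj in y)
    n, m = len(x), len(y)
    mean = n * m / 2
    sd = math.sqrt(n * m * (n + m + 1) / 12)  # no tie correction
    z = (u - mean) / sd
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p

# Invented percent identities for close and distant direct repeats.
cdr_identity = [99.1, 98.7, 99.5, 97.9]
distant_identity = [85.2, 90.1, 88.4, 92.3]
u_stat, p_value = mann_whitney_u(cdr_identity, distant_identity)
```

Because the test uses ranks only, it makes no assumption about the shape of the identity distributions, which suits percentages bounded at 100.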
In order to explain the S. cerevisiae exception, one should take into consideration that the distinction between close and distant repeats has been arbitrarily fixed at the same spacer size (1 kb) for each organism. The biological difference between close and distant repeats is connected to the recombination machinery. As this machinery varies from yeast to human, the limit between close and distant repeats should not be identical for all species. Indeed, for S. cerevisiae, direct repeats with a spacer smaller than 500 bp are more identical than other direct repeats (P < 0.05, Mann-Whitney rank test).
This greater similarity could be explained, on the one hand, by the recent origin of these repeats and, on the other, by a high conversion rate between the two copies when they are close together. As previously discussed, CDRs could also be subjected to a high deletion rate. It has been reported that recombination rate is positively correlated with repeat length in yeast (Jinks, Michelitch, and Ramcharan 1993) and in mammalian cells (Rubnitz and Subramani 1984). Thus, CDRs with long copies would be too unstable to persist, and only small CDRs would be conserved. In order to test this hypothesis, the length distributions of close and distant direct repeats were compared: CDRs are indeed shorter than the distant ones (P < 10⁻⁴, Mann-Whitney rank test).
CDRs Exhibit an Exchange Rate Negatively Correlated with the Spacer Size
We previously observed a positive rank correlation between length and spacer and a negative rank correlation between identity and spacer for CDR in yeast: the closer the repeats, the more identical and shorter they are. Except for the P. falciparum chromosomes, correlations between identity, length, and spacer were found in all eukaryotes (Table 2). This is in good agreement with an observation reported in C. elegans that the similarity between paralogous genes is negatively correlated with the physical distance between them (Semple and Wolfe 1999).
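A rank correlation of this kind can be computed as sketched below. Spearman's coefficient is assumed here, since this section says only "rank correlation"; the function names are hypothetical and the spacer, identity, and length values are invented toy numbers, not data from Table 2.

```python
import math

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Invented CDRs: spacer (bp), identity (%), and copy length (bp).
spacers = [10, 120, 300, 650, 900]
identities = [99.8, 99.0, 97.5, 95.0, 91.2]
lengths = [30, 45, 80, 150, 400]
rho_identity = spearman(spacers, identities)
rho_length = spearman(spacers, lengths)
```

In this toy set the identity decreases and the length increases monotonically with the spacer, which is the sign pattern described in the text.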
In order to understand such a result, we proposed that, as in bacteria (Peeters et al. 1988
In conclusion, the properties which supported the model of iteration dynamics established in S. cerevisiae are shared by other eukaryotes. This suggests that the model could be extended to all eukaryotes.
The Case of P. falciparum: How Parasitism Influences the Genome Style
P. falciparum chromosomes exhibit a high level of redundancy as compared with similar-sized chromosomes of S. cerevisiae (fig. 1), and their CDRs are extremely overrepresented: 74% have a spacer smaller than 1 kb (fig. 2). Their copies are nearly identical (fig. 3) and very short (data not shown). However, no correlation between spacer, identity, and length can be highlighted (Table 2).
Two-thirds of the inverted repeats are located near the telomeres (one copy in each subtelomere), suggesting a peculiar history and a high exchange rate for these repeats. It was suggested that all subtelomeres exhibit very plastic dynamics in S. cerevisiae (Pryde, Gorham, and Louis 1997) and in H. sapiens (Coleman, Baird, and Royle 1999). Their importance in the interchromosomal iteration process was demonstrated in S. cerevisiae (Coissac, Maillier, and Netter 1997).
All these observations are consistent with what was described previously: the highly repeated gene families and the special status of subtelomeres in P. falciparum (Gardner et al. 1998; Bowman et al. 1999).
Do These Observations Mean that This Apicomplexan Does Not Follow the Same Dynamics as the Other Eukaryotes?
P. falciparum is a human pathogenic parasite, the main agent of malaria. It has been reported that many bacterial pathogens exhibit a high redundancy level (Rocha, Danchin, and Viari 1999b), which has been related to high selective pressures for sequence variation. A significant number of repeats allows many recombination events, leading to a high plasticity of the genome, and then to a high evolution rate. As for these bacteria, the high redundancy level of P. falciparum could be a consequence of its parasitism.
The quasi-absence of distant repeats and the absence of correlation indicate that there are almost only young repeats. The absence of correlation is thus caused not by the absence of the underlying mechanism but by too short a time of evolution. Population studies suggest that P. falciparum spread worldwide from a limited area (Rich and Ayala 2000). The absence of old repeats could be a consequence of the recent change in the ecological conditions of P. falciparum, associated with a burst of evolution. In conclusion, P. falciparum follows the same iteration dynamics as the other eukaryotes. However, because it is a recent parasite, its chromosomes are more repeated than those of the other eukaryotes (as a result of parasitism), and there are almost no ancient repeats (because of its recent emergence).
How Tandem Repeats Can Be Turned into Spaced Repeats
Intrachromosomal repeats, in our model, are mostly created in tandem (by recombination between sister chromatids or by replication slippage), and are turned into distant repeats by chromosomal rearrangements. Analyzing all the possible end states after several rearrangements is difficult. However, it is interesting to examine all the theoretical states obtained after a single rearrangement event. Three kinds of rearrangement have been taken into account (fig. 4): deletion of a part of the tandem, insertion of a sequence inside the tandem repeat, and inversion taking away a piece of the tandem. The insertion process can be the result of either the insertion of a transposable element or the repair of a double-strand break by sequence conversion (Voelkel and Roeder 1990). Small inversions have been suggested to explain the evolution of gene order between C. albicans and S. cerevisiae (Seoighe et al. 2000), highlighting their role in genome dynamics.
If the model is valid, one should find the vestiges of tandem rearrangement in the chromosome sequences. Thus, we used the wublastn software (http://blast.wustl.edu) to look for paralogs of the spacers in the complete chromosomes. Only spacers with a size between 50 bp and 10 kb and flanked by direct repeats were taken into account. It should be stressed that the queried databases were constructed for each species with complete chromosomes only (the same ones that we used for the detection of the repeats). A sequence was arbitrarily considered a paralog of the spacer if its length was at least 80% of the spacer length and if the two sequences were more than 80% identical. Using this approach, large insertions (fig. 4.2a) or some inversions (fig. 4.3b) can be unambiguously identified, but small internal deletions and small internal insertions (fig. 4.1b and 4.2b) cannot be clearly differentiated. Note that deletion of an edge of a copy (fig. 4.1a) or a complete inversion of a copy (fig. 4.3a) cannot be detected by this method.
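The selection rules stated above translate directly into two small predicates. This is only a sketch of the thresholds given in the text; the function names and argument layout are hypothetical, and no wublastn invocation or output parsing is shown.

```python
MIN_SPACER = 50       # bp, lower bound on the spacers that were analyzed
MAX_SPACER = 10_000   # bp, upper bound (10 kb)

def spacer_is_eligible(spacer_len, flank_orientation):
    """Spacer between 50 bp and 10 kb, flanked by a direct repeat."""
    return (MIN_SPACER <= spacer_len <= MAX_SPACER
            and flank_orientation == "direct")

def is_paralog(spacer_len, hit_len, hit_identity):
    """Hit covering at least 80% of the spacer with more than 80% identity.

    hit_identity is expressed as a percentage.
    """
    return hit_len >= 0.8 * spacer_len and hit_identity > 80.0

# A 1,000-bp spacer and three hypothetical hits (length in bp, identity in %).
hits = [(950, 92.0), (700, 99.0), (900, 80.0)]
paralogs = [h for h in hits if is_paralog(1000, h[0], h[1])]
```

Of the three hypothetical hits, only the first passes: the second is too short and the third does not exceed 80% identity.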
Results were sorted as a function of the number of paralogs detected in the chromosomes. For most spacers, no paralog was found. There are several possible reasons for this: (1) our criteria were very stringent, (2) the search was performed against the whole genome only for S. cerevisiae and C. elegans, and (3) we only detected paralogs for spacers arising from a single recent rearrangement event. Multiparalog families (when a spacer presented at least two paralogs) were separated because they give an idea of the relative transposition rate. All cases where the spacer had only one paralog were analyzed more precisely according to the categories of figure 4.
As shown in Table 3, all possible remnants of the tandem rearrangement were detected in the sequence of chromosomes. These observations indicate that the theoretical rearrangements do occur in genome history, reinforcing the model of the iteration dynamics.
A striking result was the overrepresentation of intrachromosomal direct paralogs in C. elegans. A detailed analysis of these paralogs revealed that they are mostly part of larger old tandem repeats. This observation has to be connected to the presence of large tandem repeats in the chromosomes of this species (i.e., a 600-kb repeat in the first chromosome), also recently described by Friedman and Hughes (2000)
All generic duplications of nonspecific DNA regions within the same chromosome or between two chromosomes were referred to in this study as iteration. However, this iteration process should be divided into at least two distinct mechanisms. The first is the creation of tandem repeats (by sister chromatid exchange or replication slippage), which creates (under our model) most of the intrachromosomal repeats. The second is the genesis of repeats (inter- or intrachromosomal) by double-strand break repair. Indeed, this repair can lead to duplication when it is associated with a conversion mechanism. This implies that the duplication process can be divided into at least four mechanisms: abnormal chromosome segregation (hyperploidization); transposition (transposable elements); sister chromatid exchange, replication slippage, or both (tandem repeats and satellites); and double-strand break repair (iteration by conversion).
| Conclusions |
|---|
Through this study of eukaryotes' intrachromosomal repeats, several biological results were highlighted. We extended our model, proposed for S. cerevisiae, to other eukaryote chromosomes (S. cerevisiae, C. elegans, P. falciparum, A. thaliana, D. melanogaster, and H. sapiens). This suggests that despite the differences in chromosomal properties, the iteration process follows globally the same dynamics in the eukaryote kingdom and thus has to be connected to structures and mechanisms shared by all eukaryote chromosomes.
The density in number of repeats defines a genome style in which the evolution rate results from iteration, deletion, rearrangement, and mutation. This rate is similar for all chromosomes within the same genome and is specific to each species. The main exception is the X chromosome of C. elegans, which suggests that exchanges between homologous chromosomes are important in the genesis of repeats. Thus, we propose that the genesis of tandem repeats is at least partly a consequence of exchange between homologous chromosomes.
Finally, we brought out the remnants of rearrangements of tandem repeats into spaced repeats. This suggests that tandem repeats, which can be easily created, are submitted to rounds of chromosomal rearrangements leading to the pattern of repeats observed today. Hence, repeats can be used to follow chromosome rearrangements and are markers of genome dynamics.
| Acknowledgements |
|---|
We thank I. Gonçalves, E. Rocha, D. Higuet, E. Maillier, J. Pothier, and A. Viari for their scientific help and their friendly support. This work was supported by grants from Association pour la Recherche sur le Cancer. E.C. and P.N. are members of Université Pierre et Marie Curie, Paris.
| Footnotes |
|---|
Manolo Gouy, Reviewing Editor
Keywords: genome dynamics, evolution, duplication, eukaryotes
Address for correspondence and reprints: Guillaume Achaz, Structure et Dynamique des Génomes, Institut Jacques Monod, Tour 4344, 1° Étage, 4, Place Jussieu, 75251 Paris Cedex 05, France. achaz{at}ijm.jussieu.fr.
| References |
|---|
Achaz G., E. Coissac, A. Viari, P. Netter, 2000 Analysis of intrachromosomal duplications in yeast Saccharomyces cerevisiae: a possible model for their origin Mol. Biol. Evol 17:1268-1275
Adams M. D., S. E. Celniker, R. A. Holt, et al. (195 co-authors) 2000 The genome of Drosophila melanogaster Science 287:2185-2195
Amores A., A. Force, Y. L. Yan, et al. (13 co-authors) 1998 Zebrafish hox clusters and vertebrate genome evolution Science 282:1711-1714
Barnes T. M., Y. Kohara, A. Coulson, S. Hekimi, 1995 Meiotic recombination, noncoding DNA and genomic organization in Caenorhabditis elegans Genetics 141:159-179[Abstract]
Baudat F., A. Nicolas, 1997 Clustering of meiotic double-strand breaks on yeast chromosome III Proc. Natl. Acad. Sci. USA 94:5213-5218
Bernardi G., 2000 Isochores and the evolutionary genomics of vertebrates Gene 241:3-17[Web of Science][Medline]
Blanc G., A. Barakat, R. Guyot, R. Cooke, M. Delseny, 2000 Extensive duplication and reshuffling in the Arabidopsis genome Plant Cell 12:1093-1101
Bowman S., D. Lawson, D. Basham, et al. (36 co-authors) 1999 The complete nucleotide sequence of chromosome 3 of Plasmodium falciparum Nature 400:532-538[Medline]
Britten R. J., 1998 Precise sequence complementarity between yeast chromosome ends and two classes of just-subtelomeric sequences Proc. Natl. Acad. Sci. USA 95:5906-5912
Coissac E., E. Maillier, P. Netter, 1997 A comparative study of duplications in bacteria and eukaryotes: the importance of telomeres Mol. Biol. Evol 14:1062-1074[Abstract]
Coleman J., D. M. Baird, N. J. Royle, 1999 The plasticity of human telomeres demonstrated by hypervariable telomeres repeat array that is located on some copies of 16p and 16q Hum. Mol. Genet 8:1637-1646
Consortium. 1998 Genome sequence of the nematode C. elegans: a platform for investigating biology Science 282:2012-2018
Dunham I., N. Shimizu, B. A. Roe, et al. (239 co-authors) 1999 The DNA sequence of human chromosome 22 Nature 402:489-495[Medline]
Friedman R., A. L. Hughes, 2000 Gene duplication and the structure of eukaryotic genomes Genome Res 11:373-381
Gardner M. J., H. Tettelin, D. J. Carucci, et al. (27 co-authors) 1998 Chromosome 2 sequence of the human malaria parasite Plasmodium falciparum Science 282:1126-1132
Goffeau A., B. G. Barrell, H. Bussey, et al. (16 co-authors) 1996 Life with 6000 genes Science 274:546
Hattori M., A. Fujiyama, T. D. Taylor, et al. (62 co-authors) 2000 The DNA sequence of human chromosome 21 The chromosome 21 mapping and sequencing consortium. Nature 405:311-319
Henikoff S., 2000 Heterochromatin function in complex genomes Biochem. Biophys. Acta 1470:O1-O8[Medline]
Holland P. W., 1999 Gene duplication: past, present and future Semin. Cell Dev. Biol 10:541-547[Web of Science][Medline]
Hughes A. L., J. Da Silva, R. Friedman, 2001 Ancient duplication did not structure the human Hox-bearing chromosomes Genome Res 11:771-780
Jinks R. S., M. Michelitch, S. Ramcharan, 1993 Substrate length requirements for efficient mitotic recombination in Saccharomyces cerevisiae Mol. Cell. Biol 13:3937-3950
Karlin S., F. Ost, 1985 Maximal segmental match length among random sequences from a finite alphabet Pp. 225243 in L. M. L. Cam and R. A. Olshen, eds. Proceedings of the Berkeley Conference in honor of Jerzy Neyman and Jack Kiefer, Vol. 1. Association for Computing Machinery, New York
Kurtz S., C. Schleiermacher, 1999 REPuter: fast computation of maximal repeats in complete genomes Bioinformatics 15:426-427
Leung M. Y., B. E. Blaisdell, C. Burge, S. Karlin, 1991 An efficient algorithm for identifying matches with errors in multiple long molecular sequences J. Mol. Biol 221:1367-1378[Web of Science][Medline]
Lin X., S. Kaul, S. Rounsley, et al. (39 co-authors) 1999 Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana Nature 402:761-768[Medline]
Llorente B., A. Malpertuy, C. Neuveglise, et al. (24 co-authors) 2000 Genomic exploration of the hemiascomycetous yeasts: 18 Comparative analysis of chromosome maps and synteny with Saccharomyces cerevisiae. FEBS Lett 487:101-112
Masterson J., 1994 Stomatal size in fossil plants: evidence for polyploidy in majority of angiosperms Science 264:421-424
Mayer K., C. Schuller, R. Wambutt, et al. (234 co-authors) 1999 Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana Nature 402:769-777[Medline]
Ohno S., 1970 Evolution by gene duplication Springer-Verlag, Heidelberg, Germany
Peeters B. P. H., J. H. De Boer, S. Bron, G. Venema, 1988 Structural plasmid instability in Bacillus subtilis: effect of direct and inverted repeats Mol. Gen. Genet 212:450-458[Web of Science][Medline]
Pryde F. E., H. C. Gorham, E. J. Louis, 1997 Chromosome ends: all the same under their caps Curr. Opin. Genet. Dev 7:822-828[Web of Science][Medline]
Rich S. M., F. J. Ayala, 2000 Population structure and recent evolution of Plasmodium falciparum Proc. Natl. Acad. Sci. USA 97:6994-7001
Robinson-Rechavi M., O. Marchand, H. Escriva, P. L. Bardet, D. Zelus, S. Hughes, V. Laudet, 2001 Euteleost fish genomes are characterized by expansion of gene families Genome Res 11:781-788
Rocha E. P., A. Danchin, A. Viari, 1999a Analysis of long repeats in bacterial genomes reveals alternative evolutionary mechanisms in Bacillus subtilis and other competent prokaryotes Mol. Biol. Evol 16:1219-1230
Rocha E. P., A. Danchin, A. Viari, 1999b Functional and evolutionary roles of long repeats in prokaryotes Res. Microbiol 150:725-733
Rubnitz J., S. Subramani, 1984 The minimum amount of homology required for homologous recombination in mammalian cells Mol. Cell Biol 4:2253-2258
Ruvkun G., O. Hobert, 1998 The taxonomy of developmental control in Caenorhabditis elegans Science 282:2033-2041
Semple C., K. H. Wolfe, 1999 Gene duplication and gene conversion in the Caenorhabditis elegans genome J. Mol. Evol 48:555-564
Seoighe C., N. Federspiel, T. Jones, et al. (20 co-authors) 2000 Prevalence of small inversions in yeast gene order evolution Proc. Natl. Acad. Sci. USA 97:14433-14437
Smith T. F., M. S. Waterman, 1981 Identification of common molecular subsequences J. Mol. Biol 147:195-197
Surzycki S. A., W. R. Belknap, 2000 Repetitive-DNA elements are similarly distributed on Caenorhabditis elegans autosomes Proc. Natl. Acad. Sci. USA 97:245-249
Vincens P., L. Buffat, C. Andre, J. P. Chevrolat, J. F. Boisvieux, S. Hazout, 1998 A strategy for finding regions of similarity in complete genome sequences Bioinformatics 14:715-725
Vision T. J., D. G. Brown, S. D. Tanksley, 2000 The origins of genomic duplications in Arabidopsis Science 290:2114-2117
Voelkel K., G. S. Roeder, 1990 Gene conversion tracts stimulated by HOT1-promoted transcription are long and continuous Genetics 126:851-867
Wolfe K. H., D. C. Shields, 1997 Molecular evidence for an ancient duplication of the entire yeast genome Nature 387:708-713