MBE Advance Access originally published online on October 11, 2006
Molecular Biology and Evolution 2007 24(1):110-121; doi:10.1093/molbev/msl138
© 2006 The Authors
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Research Articles |
Rate and Polarity of Gene Fusion and Fission in Oryza sativa and Arabidopsis thaliana



* Division of Bioengineering and Bioinformatics, Graduate School of Information Science and Technology, Hokkaido University, Kita-ku, Sapporo, Japan
Institut für Botanik III, Heinrich-Heine Universität Düsseldorf, Düsseldorf, Germany
Genome Research Department, National Institute of Agrobiological Sciences, Kannondai, Tsukuba, Ibaraki, Japan
E-mail: yojnakam{at}ist.hokudai.ac.jp.
| Abstract |
|---|
Eukaryotic gene fusion and fission events are mechanistically more complicated than in prokaryotes, and their quantitative contributions to genome evolution are still poorly understood. We have identified all differentially composite or split genes in 2 fully sequenced plant genomes, Oryza sativa and Arabidopsis thaliana. Out of 10,172 orthologous gene pairs, 60 (0.6% of the total) revealed a verified fusion or fission event in either lineage after the divergence of O. sativa and A. thaliana. Polarizing these events by outgroup comparison revealed differences in the rate of gene fission but not of gene fusion in the rice and Arabidopsis lineages. Gene fission occurred at a higher rate than gene fusion in the O. sativa lineage and was furthermore more common in rice than in Arabidopsis. Nucleotide insertion bias has promoted gene fission in the O. sativa lineage, consistent with its generally longer nucleotide sequences than A. thaliana in selectively neutral regions, and with the abundance of transposable elements in rice. The divergence time of monocots and dicots (140–200 Myr) indicates that gene fusion/fission events occur at an average rate of 1 × 10⁻¹¹ to 2 × 10⁻¹¹ events per gene per year, approximately 100-fold slower than the average per-site nuclear nucleotide substitution rate in these lineages. Gene fusion and fission are thus rare and slow processes in higher plant genomes; they should be useful for addressing deeper evolutionary relationships among plants, and the relationship of plants to other eukaryotic lineages, where sequence-based phylogenies provide equivocal or conflicting results.
Key Words: gene fusion and fission, introns, transposable elements, plant phylogeny
| Introduction |
|---|
In eukaryotic gene fusion, 2 or more separate transcription units are joined, forming 1 transcription unit. Gene fission is the converse process in which a gene is split into 2 or more separate transcription units. The mutational mechanisms affecting gene fusions and fissions differ in prokaryotes and eukaryotes. In prokaryotes, operons are common (Price et al. 2005
Recently, the genome sequence of rice, Oryza sativa (O. sativa L. ssp. japonica cv. Nipponbare), has been determined. Its gene repertoire was thoroughly annotated using full-length cDNA libraries (International Rice Genome Sequencing Project 2005; Ohyanagi et al. 2006). This permits a monocot–dicot comparison with the Arabidopsis thaliana genome (Arabidopsis Genome Initiative 2000). Here, we address the evolutionary dynamics of gene fusion and fission events in these plant genomes. We identify all candidate gene fusion or fission events that have occurred after the divergence of O. sativa and A. thaliana. We report the number and rate of these events, including all genes and coordinates involved as well as their functional annotations, and reconstruct the evolutionary scenario of differential gene fusion and fission in each lineage.
| Materials and Methods |
|---|
Protein Sequences in O. sativa and A. thaliana
We collected a total of 40,041 protein sequences in O. sativa genomes annotated in the Rice Annotation Project (RAP) as of 14 June 2005 (Ohyanagi et al. 2006
Detection of Gene Fusion or Fission Candidates (one-to-many orthologous pairs)
We constructed a database from the protein sequences of O. sativa and A. thaliana and performed a protein similarity search for each sequence using BlastP (Altschul et al. 1997) with a threshold e value < 10⁻¹⁰. We then collected one-to-one reciprocal best-match pairs between O. sativa and A. thaliana as orthologous pairs. From these, we selected the pairs in which the query in one species has more than 1 hit in the other species and the query is the best match of these hits in the backward search. We checked these hits in order of BlastP score and discarded, as possible paralogues, any hit that matched a hit of higher score. The hit sequences obtained are, therefore, the best, second best, ..., and n-th best hits from the query.
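The detection step above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the `Hit` record, the function names, and the toy gene identifiers are hypothetical, BlastP hits are assumed to be already parsed and filtered at the e value threshold, and the paralogue-discarding step is omitted.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    score: float  # BlastP score; hits are assumed pre-filtered by e value


def best_subject(hits):
    """Return {query: subject}, keeping only the highest-scoring hit per query."""
    best = {}
    for h in hits:
        if h.query not in best or h.score > best[h.query].score:
            best[h.query] = h
    return {q: h.subject for q, h in best.items()}


def reciprocal_best_pairs(forward, backward):
    """One-to-one pairs in which each gene is the other's best match."""
    fwd, bwd = best_subject(forward), best_subject(backward)
    return {(q, s) for q, s in fwd.items() if bwd.get(s) == q}


def one_to_many(forward, backward):
    """Queries with more than 1 hit whose backward best match is the query.

    Hits are listed in descending score order (best, second best, ...).
    """
    bwd = best_subject(backward)
    groups = defaultdict(list)
    for h in sorted(forward, key=lambda h: h.score, reverse=True):
        if bwd.get(h.subject) == h.query:
            groups[h.query].append(h.subject)
    return {q: subjects for q, subjects in groups.items() if len(subjects) > 1}


# Toy data (hypothetical identifiers): At1 is composite, OsA and OsB are split.
forward = [Hit("At1", "OsA", 300.0), Hit("At1", "OsB", 150.0), Hit("At2", "OsC", 200.0)]
backward = [Hit("OsA", "At1", 300.0), Hit("OsB", "At1", 150.0), Hit("OsC", "At2", 200.0)]

pairs = reciprocal_best_pairs(forward, backward)
candidates = one_to_many(forward, backward)
```

In this toy case, `pairs` holds the two one-to-one orthologous pairs and `candidates` maps the composite query `At1` to its two split hits in score order.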
Validation of Gene Fusion or Fission Candidate Pairs
Next, we measured the overlapping length of matched regions on the query sequence in BlastP alignment in each of one-to-many orthologous pairs detected. Here, the overlap ratio of 2 hits A and B is
|
|
We then chose the query–hit pairs with an overlap ratio below the cutoff of 0.3, following the bimodal distribution of ratios (supplementary fig. 1, Supplementary Material online). We excluded the pairs in which the split genes are defined as a single locus by the RAP (Ohyanagi et al. 2006) because these might not be reliable annotations.
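A filter of this kind might look as follows. The definition of the overlap ratio used here (overlap length on the query divided by the length of the shorter of the two matched regions) is an assumption made only for illustration and should not be read as the authors' formula; the function names and the coordinate convention are likewise hypothetical. Only the 0.3 cutoff comes from the text.

```python
CUTOFF = 0.3  # cutoff ratio stated in the text


def overlap_ratio(a, b):
    """Overlap of two matched regions on the query, as a fraction of the shorter one.

    a and b are (start, end) query coordinates, 1-based and inclusive.
    ASSUMPTION: normalizing by the shorter region is an illustrative choice.
    """
    overlap = max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)
    shorter = min(a[1] - a[0] + 1, b[1] - b[0] + 1)
    return overlap / shorter


def is_split_candidate(a, b, cutoff=CUTOFF):
    """True when two hits cover largely distinct parts of the query."""
    return overlap_ratio(a, b) < cutoff


ratio_adjacent = overlap_ratio((1, 100), (91, 200))   # 10 shared residues
ratio_nested = overlap_ratio((1, 100), (1, 50))       # second hit lies inside the first
ratio_disjoint = overlap_ratio((1, 50), (60, 100))    # no shared residues
```

Two hits that tile the query with little overlap pass the filter, whereas a hit nested inside another, as expected for paralogues matching the same region, does not.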
Furthermore, we validated these pairs in 2 steps using BlastP and TBlastN. 1) We performed BlastP searches with the protein sequences of each pair against the public databases GenBank/European Molecular Biology Laboratory (EMBL)/DNA Data Bank of Japan (DDBJ) and Swiss-Prot and compared the gene structures with those of the database entries. We excluded a pair from further analysis if 1–1) the query, as a composite gene, has separate entries as component genes from the same species in the databases or 1–2) hits located near each other on a chromosome have a composite entry from the same species in the databases. Here, nearby genes are defined as those separated by 3 or fewer other genes. 2) We performed TBlastN searches using each composite gene as a query against the noncoding regions around the split genes, with e value < 10⁻³. We then concatenated the matched "exon-like" sequences, translated them into amino acid sequences, and compared the Blast alignment and score with those of the split genes. We excluded the pairs in which such an exon-like sequence next to 1 split gene aligned with a higher Blast score than the other split gene.
Estimation of Gene Fusion or Fission
We performed BlastP comparisons of the protein sequences in each pair of a composite gene and split genes against the gene sets of the red alga Cyanidioschyzon merolae (Matsuzaki et al. 2004), the green alga Chlamydomonas reinhardtii (http://genome.jgi-psf.org/Chlre3/Chlre3.home.html), and entries in GenBank/EMBL/DDBJ and Swiss-Prot. We then compared the top hits back against the above-mentioned database of O. sativa and A. thaliana and accepted as orthologous outgroup genes those with reciprocal best matches. Using these outgroups, we inferred the ancestral state of each pair of a composite gene and split genes.
Assignment to Biological Function
We took the gene functions of O. sativa genes from the RAP annotation. For A. thaliana, we used the GenBank annotation. To investigate function at the domain level, we ran InterProScan (Zdobnov and Apweiler 2001) using the Pfam database (Finn et al. 2006) with e value < 10⁻³. Among the gene fusion/fission candidate pairs, we defined gene splits as occurring "within" domains by the following criterion: the domain region detected by InterProScan in a composite gene is aligned across the regions aligned to split genes by BlastP.
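One way to read the "within" criterion is that a single domain on the composite gene overlaps the regions aligned to at least 2 different split genes. The sketch below encodes that reading; it is an interpretation rather than the authors' implementation, and the function names and coordinates are hypothetical.

```python
def intervals_overlap(a, b):
    """True if two (start, end) inclusive intervals share at least 1 position."""
    return a[0] <= b[1] and b[0] <= a[1]


def split_within_domain(domain, split_regions):
    """True if one domain spans the regions aligned to 2 or more split genes.

    domain: (start, end) of a Pfam domain on the composite protein.
    split_regions: one (start, end) per split gene, giving the part of the
    composite protein aligned to that split gene by BlastP.
    """
    return sum(intervals_overlap(domain, region) for region in split_regions) >= 2


regions = [(1, 150), (160, 400)]                          # two split genes
spanning = split_within_domain((80, 220), regions)        # domain crosses the split point
contained = split_within_domain((10, 100), regions)       # domain lies in one split gene
```

A stricter version could require a minimum number of overlapping residues on each side of the split point before counting a domain as divided.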
Repetitive Element Sequences
We downloaded the Arabidopsis and rice repeat sequences from The Institute for Genomic Research Plant Repeat Database (http://www.tigr.org/tdb/e2k1/plant.repeats/) and constructed a BlastN database. We then performed sequence homology searches for the intergenic regions around the fusion/fission candidate genes with the threshold e value < 10⁻⁵.
Graphical Views of Gene Fusion/Fission Candidate Pairs
We developed the Perl program package, "FUFIA viewer (gene FUsion and FIssion Alignment viewer)" for drawing the fusion/fission candidate pairs detected by Blast.
| Results |
|---|
Detection of Gene Fusion and Fission Events
A total of 10,172 one-to-one orthologous gene pairs between O. sativa and A. thaliana genomes were determined by reciprocal BlastP searches (see Materials and Methods). Out of those, 277 pairs were defined as one-to-many orthologous pairs in which 1 query (a composite gene) in a genome has more than 1 orthologous hit (split genes) in the other genome, and these hits are not paralogous to each other (Enright et al. 1999
0.3), we checked the RAP annotations and excluded the pairs in which Oryza split genes are defined by a single locus in RAP. Although these genes might be genuine split genes, here we adopted the RAP annotations. We thus obtained 114 conservative pairs as the preliminary fusion/fission candidates. Then we validated those pairs using BlastP and TBlastN (see Materials and Methods). We first found that in 45 pairs, either the rice or the Arabidopsis gene prediction was inconsistent with the public database entry. Next, for each of the remaining 69 pairs, we detected 9 pairs in each of which an exon-like structure near 1 split gene is aligned to the composite gene with a higher Blast score than the other split gene. Due to the possibility that these 54 (45 + 9) genes are misannotated in the genome sequence, they were excluded from further analysis. This left a total of 60 candidate pairs encompassing a composite gene in a species and 2 or more split orthologues in the other species (table 1). Of these, 21 were composite in O. sativa and split in A. thaliana (Oryza-composite–Arabidopsis-split), whereas 39 pairs, nearly twice as many, were composite in A. thaliana and split in O. sativa (Arabidopsis-composite–Oryza-split).
Next, we investigated the locations and orientations of the genes in the 60 candidate pairs (figs. 1, 2, and 4; supplementary figs. 2 and 3, Supplementary Material online). Out of the 39 Arabidopsis-composite–Oryza-split pairs, 21 are termed "distal" pairs because the 2 split genes are distantly located on the same chromosome or dispersed on different chromosomes. In these pairs, recombination or translocation of components might have directly caused fusion or fission or occurred after insertion or deletion had generated fused or fissioned genes (fig. 1A). Seventeen pairs are termed "proximal" because the 2 split genes are separated by 3 or fewer other genes on the same chromosome (fig. 1B). In the majority of these pairs, the 2 split genes lie next to each other in the same orientation. In this case, insertion or deletion within or between genes probably caused fusion/fission.
We further found 3 special "proximal" subclasses (fig. 2A–C). In the first subclass, we detected a pair in which an unrelated gene lies between the split genes (fig. 2A), implying insertion or recombination. In the second subclass, we detected a pair in which the rice split genes are located nearby in inverted orientation (fig. 2B). The second subclass may also involve recombination, as in the case of distal split genes. In this pair, however, there is an unrelated gene between the split, inverted genes, implying insertion or deletion mechanisms. In the third special subclass, 1 split gene is nested within another split gene (fig. 2C). The remaining pair out of the 39 Arabidopsis-composite–Oryza-split pairs was a hybrid of "distal" and "proximal" that involved 3 genes (fig. 1C). In this pair, 1 of the split genes is located on a different chromosome, whereas the others lie next to each other.
For Oryza-composite–Arabidopsis-split pairs, we classified the 21 pairs into 7 distal and 14 proximal pairs (table 3 and supplementary fig. 3, Supplementary Material online). Of the proximal pairs, we found a pair of the first special subclass but none of the second or third subclass. In 1 of the proximal pairs (Os01g0388500 vs. At2g48060-40), 3 Arabidopsis genes were of the same orientation on chromosome 2 (supplementary fig. 3, Supplementary Material online).
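The distal/proximal classification can be expressed as a small function. This is a sketch under stated assumptions: the `Locus` record and its gene-order `index` are hypothetical stand-ins for real genome coordinates, and "proximal" follows the definition in Materials and Methods (3 or fewer other genes between 2 split genes on the same chromosome).

```python
from typing import NamedTuple


class Locus(NamedTuple):
    chrom: str
    index: int  # hypothetical: rank of the gene along its chromosome


MAX_INTERVENING = 3  # "3 or fewer other genes" between proximal split genes


def is_proximal(a, b):
    """True if both genes share a chromosome with at most 3 genes between them."""
    return a.chrom == b.chrom and abs(a.index - b.index) - 1 <= MAX_INTERVENING


def classify(split_genes):
    """Label a set of split genes as 'proximal', 'distal', or 'hybrid'."""
    ordered = sorted(split_genes)  # by chromosome, then gene order
    flags = [is_proximal(a, b) for a, b in zip(ordered, ordered[1:])]
    if all(flags):
        return "proximal"
    if not any(flags):
        return "distal"
    return "hybrid"  # some neighbours proximal, others not


adjacent = classify([Locus("2", 10), Locus("2", 11)])
three_between = classify([Locus("2", 10), Locus("2", 14)])
four_between = classify([Locus("2", 10), Locus("2", 15)])
other_chrom = classify([Locus("2", 10), Locus("5", 11)])
mixed = classify([Locus("1", 50), Locus("3", 10), Locus("3", 11)])
```

The special subclasses (an unrelated intervening gene, inverted orientation, nesting) would need strand and coordinate information that this sketch does not model.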
Frequent Gene Fissions in Rice
To determine the evolutionary polarity of gene fusion or fission, we inferred the ancestral states of the candidate pairs by outgroup comparison, using BlastP searches of composite or split translations against National Center for Biotechnology Information and Swiss-Prot entries and the available translations from the recently sequenced plants C. merolae (a red alga) and C. reinhardtii (a green alga). We then defined orthologous outgroup genes from these databases by reciprocal BlastP and inferred the ancestral gene structures by parsimony. This defined polarity in 14 cases (6 fusions and 8 fissions) out of the 60 pairs examined (table 1). Nine were Arabidopsis-composite–Oryza-split and 5 were Oryza-composite–Arabidopsis-split cases. Among the polarized cases, the Oryza lineage has undergone 3 fusions and 6 fissions, and the Arabidopsis lineage has undergone 3 fusions and 2 fissions. Hence, our result shows that gene fission is more common than gene fusion in the rice genome (6:3), whereas fissions and fusions are about equally common in A. thaliana (2:3). Moreover, many rice fission genes are located near each other on the chromosome (table 2).
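The parsimony rule behind these counts is simple enough to state as code. In this sketch, the function name and the encoding of the outgroup state are hypothetical, and the list of cases is arranged only to reproduce the totals reported above; it does not record which individual gene pairs had which outgroup state.

```python
from collections import Counter


def infer_event(composite_species, split_species, outgroup_state):
    """Polarize one candidate pair from the gene structure in the outgroup.

    outgroup_state is "composite", "split", or None (no orthologous outgroup).
    """
    if outgroup_state == "composite":  # ancestor composite, so the split lineage fissioned
        return ("fission", split_species)
    if outgroup_state == "split":      # ancestor split, so the composite lineage fused
        return ("fusion", composite_species)
    return ("unresolved", None)


# (composite species, split species, outgroup state), matching the reported counts
cases = (
    [("Arabidopsis", "Oryza", "composite")] * 6
    + [("Arabidopsis", "Oryza", "split")] * 3
    + [("Oryza", "Arabidopsis", "composite")] * 2
    + [("Oryza", "Arabidopsis", "split")] * 3
)
tally = Counter(infer_event(*case) for case in cases)
```

With a single outgroup state per pair, one inferred change always suffices, which is why the rule needs no explicit tree search at this depth.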
Biological Functions of Fused or Fission Genes
We investigated the functional annotations of fused or fissioned genes (tables 2 and 3). Although many of the candidate genes were hypothetical or unknown proteins, some were assigned to biological functions. In 22 pairs, 1 composite gene and 1 split gene were involved in the same or related function, and the other split gene(s) encoded different protein(s) or were unknown/hypothetical. In the other pairs, the Oryza and Arabidopsis genes were all unknown/hypothetical, were merely expression-confirmed genes found in cDNAs or ESTs, or were assigned to different functions. These gene pairs are interesting candidates for functional analysis.
We detected domain regions in 47 out of 60 gene fusion/fission candidate pairs using the Pfam database (Finn et al. 2006). We then found that in 5 pairs, 3 Arabidopsis-composite–Oryza-split pairs and 2 Oryza-composite–Arabidopsis-split pairs, the split positions are located within domains (table 4). For the 2 pairs At3g23510 versus Os07g0474400–Os12g0267200 and At3g56330 versus Os05g0324200-100, gene fissions are inferred by outgroup comparison; for the others, the directions are unknown.
| Discussion |
|---|
We identified 10,172 orthologous gene pairs, of which 60 confirmed pairs (0.6%) have undergone fusion or fission events after the divergence of O. sativa and A. thaliana (table 1). Even if we add to that number the 54 pairs excluded because of possible annotation errors, the percentage of differentially composite/split genes would still rise only to 1.1%. This paucity indicates that in these plant genomes, gene fusion or fission events are either mechanistically rare or often counterselected, or both. Of the 60 pairs, Arabidopsis-composite–Oryza-split cases (39) are nearly twice as common as Oryza-composite–Arabidopsis-split cases (21). This significant difference (P < 0.05) is compatible with 3 possible polarities of gene fusion and fission events in each species: 1) frequent gene fusions in Arabidopsis, 2) frequent gene fissions in rice, or 3) both. Among the polarized cases, gene fission is twice as common as gene fusion in the rice genome, although this difference is not statistically significant because of the small number of observations (table 1). Because most gene splits detected involve proximal genes on the same chromosome, the issue arises whether these are true fusions/fissions or artifacts of annotation error. Here it is important to note that almost all of the fission genes in the rice genome were supported by full-length rice cDNA records (table 2). Therefore, artifactual fissions due to frameshifts caused by sequencing errors can be excluded in the case of the rice genome. On the other hand, gene fusion and fission are about equally common in A. thaliana, implying a relative richness of gene fusion over fission as compared with rice (table 1).
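The two significance statements in this paragraph can be checked with a short calculation. The text does not say which test was used, so the exact two-sided binomial test against equal proportions shown here is an assumption, and the function name is hypothetical.

```python
from math import comb


def binomial_two_sided(k, n):
    """Exact two-sided binomial P value for k successes in n trials at p = 0.5.

    The upper tail from the larger of the two counts is doubled, which is
    valid because the null distribution is symmetric.
    """
    larger = max(k, n - k)
    tail = sum(comb(n, i) for i in range(larger, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


p_composite_split = binomial_two_sided(39, 60)  # 39 versus 21 candidate pairs
p_rice_polarity = binomial_two_sided(6, 9)      # 6 fissions versus 3 fusions in rice
```

Under this assumed test, the 39:21 split falls below 0.05, whereas the 6:3 ratio in rice does not come close, in line with the remark about the small number of polarized observations.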
The observed polarity trends are consistent with the length differences between orthologous regions in O. sativa and A. thaliana, intron lengths in particular. The distribution of intron lengths within orthologous genes was clearly bimodal, with one conservative class and one class shifted toward longer rice introns (fig. 3A). In the conservative class, the genes have few or no introns. The other component of the bimodal distribution indicates an insertion or deletion bias in introns. Moreover, the numbers of introns are not biased toward rice (fig. 3B), suggesting that the differences in length are due not to amplification or loss of introns but to nucleotide insertion or deletion within selectively neutral intron regions. Our observations reveal a "genome-wide" nucleotide insertion bias in the Oryza lineage and/or deletion bias in the Arabidopsis lineage after the divergence of these species.
It has been reported that transposable elements are abundant in rice, occupying more than one-third of the genome (International Rice Genome Sequencing Project 2005
We found that the points of gene splits are located within domains in only 5 out of the 47 gene fusion/fission candidate pairs in which domains were detected. This suggests that gene fusion or fission events are fixed more readily if they preserve domain structures and, in turn, gene functions to some extent. From this, it would appear that most of the observed gene fusion/fission events are not deleterious. Because it is less likely that fusion of nondomain or partial domain sequences results in the innovation of novel domain sequences, all of the 5 domain-splitting cases might be due to gene fission events. Consistent with that view, 2 of those cases, At3g23510 versus Os07g0474400–Os12g0267200 and At3g56330 versus Os05g0324200-100, were inferred to be gene fissions by outgroup comparison (table 2). Regarding the pair At3g23510 versus Os07g0474400–Os12g0267200, whose alignment is shown in figure 4, it has been reported that the Java olive, Sterculia foetida, has an intact and functional homolog of At3g23510 encoding cyclopropane fatty acid (CPA-FA) synthase (Bao et al. 2002, 2003). In that study, the N terminus of these genes was annotated as a flavin adenine dinucleotide (FAD)-containing oxidase related to "amino oxidase" by Pfam (table 4). Because the significance of the FAD-containing oxidase domain of the Arabidopsis and Sterculia composite genes in CPA-FA biosynthesis is poorly understood (Bao et al. 2002, 2003), it may be of interest to investigate the function of Os07g0474400 and Os12g0267200, where the oxidase domain appears to be inactivated.
Newly generated fissions may be deleterious, neutral, or advantageous. In the latter two cases, however, they entail the spontaneous origin of novel promoter sequences to afford transcription. These newly arisen promoters may be of interest for further study because they might provide insights into de novo promoter origins. From the comparative standpoint, the maize genome is known to be rich in transposable elements (SanMiguel and Bennetzen 1998) and may thus harbor even more gene fissions than rice. The polarity of gene fusion/fission in O. sativa might conceivably relate to rice domestication and breeding, with relaxed constraints during prolonged cultivation, consistent with the richness of transposable elements and the relatively recent occurrence of gene fissions by transposable element insertions in the rice genome (fig. 4).
Previous genome-wide investigations of fusion/fission frequencies have reported that gene fusion may be more common than fission (Snel et al. 2000; Yanai et al. 2001; Suhre and Claverie 2004; Kummerfeld and Teichmann 2005). However, we observe precisely the opposite in the heavily cDNA-supported rice annotations. Previous studies concerned mainly prokaryotic genomes (Snel et al. 2000; Yanai et al. 2001). We emphasize that the frequencies of gene fusion and fission may differ fundamentally between prokaryotic and eukaryotic genomes because the correlation between the functions and locations of genes is much stronger in prokaryotic genomes, which are organized into operons (Price et al. 2005), than in eukaryotic genomes and because translational fusion within operons can involve simple micromutational events, which is not the case in eukaryotes. For example, the trp operon has undergone many independent gene fusion and fission events (Xie et al. 2003). In the case of higher plant genomes, the earlier prokaryotic estimates clearly do not apply.
Another earlier investigation of fusion and fission concerned not only prokaryotic but also many eukaryotic genome sequences (Kummerfeld and Teichmann 2005) and reported a 4-fold predominance of gene fusions over fissions. That estimate is inconsistent with our results, where frequent gene fissions have occurred in rice. However, the observations from that earlier study carry 2 caveats. First, there is the possibility of annotation errors, particularly in the genes predicted by ab initio methods in eukaryotic genomes. In that regard, we found that the gene structures of more than 40% of the preliminary fusion/fission candidates are equivocal by database comparison and noncoding region check; they likely represent false positives, and hence we excluded them from our analysis, unlike the previous study (Kummerfeld and Teichmann 2005). Second, the earlier quantitative estimation of fusion and fission rates was contingent upon a particular phylogenetic tree linking all genomes considered. If either fusion or fission events had occurred anciently, the ancestral state so inferred will be heavily topology dependent. Furthermore, if any of the composite or split genes were subject to lateral gene transfer among prokaryotes, which does exist (Nakamura et al. 2004; Kunin et al. 2005), can also include transfer of operons (Lawrence 1997), and might bear upon the variability of operon structures (Itoh et al. 1999), the rates inferred will also be heavily affected. In particular, the earlier study (Kummerfeld and Teichmann 2005) treated the occurrence of fusion and fission on a much longer timescale (prokaryotes–eukaryotes) as compared with our study (monocot–dicot). Thus, the influence of a guide topology and horizontal gene transfer, as well as the frequency of gene fusions/fissions in operons, will be much larger in the more ancient comparison. In this study, we focused on the events after the divergence of a monocot and a dicot and used relatively close outgroups like C. merolae and C. reinhardtii, or closer where available. The phylogenetic relationships in this estimation are therefore clear and the polarity rather certain, given the rare nature of fusion and fission events in general.
In particular, the previous study estimated that about 30% of the genes examined have undergone multiple fusions or fissions (Kummerfeld and Teichmann 2005), but that might not be a good estimate for the aforementioned reasons (frequent gene fusions/fissions in operons, annotation errors, and horizontal gene transfer, including that of operons). Also, the previous estimate might include lineage-specific amplified genes, many of which may be subject to frequent structural changes by mutation and so affect the estimate of gene fusion and fission events. Here it should be noted that we defined gene fusion/fission candidates from one-to-one orthologous pairs between rice and Arabidopsis. Our results thus present an estimate on a conserved gene set that is unaffected by lineage-specific gene gain by duplication, suggesting that our estimate is comparable to those for other species pairs and applicable to extrapolating the number of gene fusion and fission events (Enright and Ouzounis 2001). In general, gene fusion or fission events may be very rare among conserved genes (Conant and Wagner 2005).
The presence or absence of a gene fusion or fission itself can, in principle, be useful for investigating the phylogenetic relationships among taxa (Enright and Ouzounis 2001; Stechmann and Cavalier-Smith 2002). Because only about 1% of orthologous gene pairs in the present genome comparison showed differential fusion or fission and because the divergence time of monocots and dicots is roughly 140–200 Myr (Wolfe et al. 1989; Chaw et al. 2004), the possibility of multiple fusions or fissions in each gene can virtually be neglected at this timescale. Treating each of the orthologous gene pairs examined as a gene "site" in computation, the average rate of fusion and fission events is approximately 1 × 10⁻¹¹ to 2 × 10⁻¹¹ per gene per year, about 100-fold slower than the average rate of nucleotide substitution (about 5 × 10⁻⁹ per nucleotide site per year). If we take the 54 unverified cases into account, the rate increases to 3 × 10⁻¹¹ to 4 × 10⁻¹¹ per gene per year. If we assume that an average gene has about 1,000 nucleotide sites, it is clear that gene fusions and fissions in these 2 angiosperms occur roughly 10⁵ times more slowly than nucleotide substitutions do.
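The rate estimate can be reproduced with a few lines of arithmetic. The division by 2 lineages (total branch length of twice the divergence time) is an assumption; it is used here because it recovers the ranges quoted in the text from the reported counts. The function and variable names are hypothetical.

```python
def events_per_gene_per_year(events, gene_pairs, divergence_years, lineages=2):
    """Average fusion/fission rate, treating each orthologous pair as one 'site'.

    ASSUMPTION: events accumulate along both lineages since divergence,
    so the elapsed time is lineages * divergence_years.
    """
    return events / (gene_pairs * lineages * divergence_years)


PAIRS = 10_172                 # orthologous gene pairs examined
OLD, YOUNG = 200e6, 140e6      # bounds of the monocot-dicot divergence time (years)

rate_low = events_per_gene_per_year(60, PAIRS, OLD)          # verified events only
rate_high = events_per_gene_per_year(60, PAIRS, YOUNG)
rate_low_all = events_per_gene_per_year(114, PAIRS, OLD)     # plus 54 unverified cases
rate_high_all = events_per_gene_per_year(114, PAIRS, YOUNG)

SUBSTITUTIONS_PER_SITE = 5e-9  # per nucleotide site per year, from the text
SITES_PER_GENE = 1_000         # assumed average gene length, from the text
fold_low = SUBSTITUTIONS_PER_SITE * SITES_PER_GENE / rate_high
fold_high = SUBSTITUTIONS_PER_SITE * SITES_PER_GENE / rate_low
```

With these inputs the verified events give roughly 1.5 × 10⁻¹¹ to 2.1 × 10⁻¹¹ per gene per year, the enlarged set gives roughly 2.8 × 10⁻¹¹ to 4.0 × 10⁻¹¹, and the per-gene comparison with nucleotide substitution lands at a few times 10⁵.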
With this slow rate, gene fusion and fission data should provide a means to address deeper evolutionary relationships among plants or other eukaryotes, where the information contained in sequence-based phylogenies is equivocal. As a prominent example, it was reported that dihydrofolate reductase and thymidylate synthase are encoded as a composite gene in protists and plants and as 2 split genes in fungi and metazoa, indicating a lineage-specific distribution (Stechmann and Cavalier-Smith 2002). In our present study, 46 out of 60 candidates remain to be resolved regarding polarity. Even so, we can extrapolate the numbers of gene fusions and fissions and estimate the total number of events during the evolution of O. sativa and A. thaliana (fig. 5). Although domestication might have affected the rate of fusion and fission events in O. sativa, the complete set of fusions and fissions for this pairwise genome comparison nonetheless provides a first benchmark for the plant rate. Determining the state of fusion or fission of the gene pairs identified here in the putatively basal angiosperm Amborella, for example, whose evolutionary position is debated because large sequence data sets give conflicting results with strong support (Goremykin et al. 2004; Lockhart and Penny 2005), may shed further light on this and other currently difficult phylogenetic issues.
| Supplementary Material |
|---|
Supplementary figures 1–3 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
| Acknowledgements |
|---|
We thank Tal Dagan for helpful comments. This study was supported by a grant from the Japan Society for the Promotion of Science.
Funding to pay the Open Access publication charges for this article was provided by Oxford Journals for the editor in chief.
| Footnotes |
|---|
Manolo Gouy, Associate Editor
| References |
|---|
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402.
Arabidopsis Genome Initiative. (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408:796–815.
Bao X, Katz S, Pollard M, Ohlrogge J. (2002) Carbocyclic fatty acids in plants: biochemical and molecular genetic characterization of cyclopropane fatty acid synthesis of Sterculia foetida. Proc Natl Acad Sci USA 99:7172–7177.
Bao X, Thelen JJ, Bonaventure G, Ohlrogge JB. (2003) Characterization of cyclopropane fatty-acid synthase from Sterculia foetida. J Biol Chem 278:12846–12853.
Chaw SM, Chang CC, Chen HL, Li W-H. (2004) Dating the monocot-dicot divergence and the origin of core eudicots using whole chloroplast genomes. J Mol Evol 58:424–441.
Conant GC and Wagner A. (2005) The rarity of gene shuffling in conserved genes. Genome Biol 6:R50.
Enright AJ, Iliopoulos I, Kyrpides NC, Ouzounis CA. (1999) Protein interaction maps for complete genomes based on gene fusion events. Nature 402:86–90.
Enright AJ and Ouzounis CA. (2001) Functional associations of proteins in entire genomes by means of exhaustive detection of gene fusions. Genome Biol 2:RESEARCH0034.
Finn RD, Mistry J, Schuster-Böckler B, et al. (13 co-authors). (2006) Pfam: clans, web tools and services. Nucleic Acids Res 34:D247–D251.
Goremykin VV, Holland B, Hirsch-Ernst KI, Hellwig FH. (2004) Analysis of Acorus calamus chloroplast genome and its phylogenetic implications. Mol Biol Evol 22:1813–1822.
International Rice Genome Sequencing Project. (2005) The map-based sequence of the rice genome. Nature 436:793–800.
Itoh T, Takemoto K, Mori H, Gojobori T. (1999) Evolutionary instability of operon structures disclosed by sequence comparisons of complete microbial genomes. Mol Biol Evol 16:332–346.
Kummerfeld SK and Teichmann SA. (2005) Relative rates of gene fusion and fission in multi-domain proteins. Trends Genet 21:25–30.
Kunin V, Goldovsky L, Darzentas N, Ouzounis CA. (2005) The net of life: reconstructing the microbial phylogenetic network. Genome Res 15:954–959.
Lawrence JG. (1997) Selfish operons and speciation by gene transfer. Trends Microbiol 5:355–359.
Lockhart PJ and Penny D. (2005) The place of Amborella within the radiation of angiosperms. Trends Plant Sci 10:201–202.
Marcotte EM, Pellegrini M, Ng HL, Rice DW, Yeates TO, Eisenberg D. (1999) Detecting protein function and protein-protein interactions from genome sequences. Science 285:751–753.
Matsuzaki M, Misumi O, Shin IT, et al. (42 co-authors). (2004) Genome sequence of the ultrasmall unicellular red alga Cyanidioschyzon merolae 10D. Nature 428:653–657.
Nakamura Y, Itoh T, Matsuda H, Gojobori T. (2004) Biased biological functions of horizontally transferred genes in prokaryotic genomes. Nat Genet 36:760–766.
Ohyanagi H, Tanaka T, Sakai H, et al. (14 co-authors). (2006) The Rice Annotation Project Database (RAP-DB): hub for Oryza sativa ssp. japonica genome information. Nucleic Acids Res 34:D741–D744.
Price MN, Huang KH, Arkin AP, Alm EJ. (2005) Operon formation is driven by co-regulation and not by horizontal gene transfer. Genome Res 15:809–819.
SanMiguel P and Bennetzen JL. (1998) Evidence that a recent increase in maize genome size was caused by the massive amplification of intergene retrotransposons. Ann Bot 82:37–44.
Snel B, Bork P, Huynen M. (2000) Genome evolution. Gene fusion versus gene fission. Trends Genet 16:9–11.
Stechmann A and Cavalier-Smith T. (2002) Rooting the eukaryote tree by using a derived gene fusion. Science 297:89–91.
Suhre K and Claverie JM. (2004) FusionDB: a database for in-depth analysis of prokaryotic gene fusion events. Nucleic Acids Res 32:D273–D276.
Wolfe KH, Gouy M, Yang YW, Sharp PM, Li W-H. (1989) Date of the monocot-dicot divergence estimated from chloroplast DNA sequence data. Proc Natl Acad Sci USA 86:6201–6205.
Xie G, Keyhani NO, Bonner CA, Jensen RA. (2003) Ancient origin of the tryptophan operon and the dynamics of evolutionary change. Microbiol Mol Biol Rev 67:303–342.
Yanai I, Derti A, DeLisi C. (2001) Genes linked by fusion events are generally of the same functional category: a systematic analysis of 30 microbial genomes. Proc Natl Acad Sci USA 98:7940–7945.
Zdobnov EM and Apweiler R. (2001) InterProScan—an integration platform for the signature-recognition methods in InterPro. Bioinformatics 17:847–848.




