MBE Advance Access originally published online on January 12, 2008
Molecular Biology and Evolution 2008 25(3):603-615; doi:10.1093/molbev/msn009
Research Articles
The Mitochondrial Genome of the Gymnosperm Cycas taitungensis Contains a Novel Family of Short Interspersed Elements, Bpu Sequences, and Abundant RNA Editing Sites

* Research Center for Biodiversity, Academia Sinica, Taipei, Taiwan
Institute of Information Science, Academia Sinica, Taipei, Taiwan
Department of Informatics, Indiana University
Institute of Medical Biotechnology, Central Taiwan University of Science and Technology, Taichung City, Taiwan
E-mail: smchaw{at}sinica.edu.tw.
| Abstract |
|---|
The mtDNA of Cycas taitungensis is a circular molecule of 414,903 bp, making it 2- to 6-fold larger than the known mtDNAs of charophytes and bryophytes, but similar to the average of 7 elucidated angiosperm mtDNAs. It is characterized by abundant RNA editing sites (1,084), more than twice the number found in the angiosperm mtDNAs. The A + T content of Cycas mtDNA is 53.1%, the lowest among known land plants. About 5% of the Cycas mtDNA is composed of a novel family of mobile elements, which we designated as "Bpu sequences." They share a consensus sequence of 36 bp with 2 terminal direct repeats (AAGG) and a recognition site for the Bpu10I restriction endonuclease (CCTGAAGC). Comparison of the Cycas mtDNA with other plant mtDNAs revealed many new insights into the biology and evolution of land plant mtDNAs. For example, the noncoding sequences in mtDNAs have drastically expanded as land plants have evolved, with abrupt increases appearing in the bryophytes, and then in the seed plants. As a result, the genomic organizations of seed plant mtDNAs are much less compact than in other plants. Also, the Cycas mtDNA appears to have been exempted from the frequent gene loss observed in angiosperm mtDNAs. Similar to the angiosperms, the 3 Cycas genes nad1, nad2, and nad5 are disrupted by 5 group II intron sequences, which have brought the genes into trans-splicing arrangements. The evolutionary origin and invasion/duplication mechanism of the Bpu sequences in Cycas mtDNA are hypothesized and discussed.
Key Words: mitochondrial genome; Cycas; RNA editing sites; repeats; mobile elements
| Introduction |
|---|
Presently, the complete mitochondrial genomes (mtDNAs) of land plants are only known for 1 liverwort (Marchantia polymorpha, Oda et al. 1992
Cycads (Cycadales) appeared in the Pennsylvanian era, approximately 300 million years ago (MYA) and dominated the Mesozoic forests along with conifers and ginkgos. The extant cycads include 2 or 3 families with some 300 species in 10 genera (Chaw et al. 2005
Here we report the complete mtDNA sequence of Cycas taitungensis and its surprising organization. Notably, it contains abundant short interspersed repetitive elements and RNA editing sites. Compared with the known mtDNAs from other land plants, that of Cycas has a particularly low A + T content, fewer gene losses, and no large repeats >2 kb. Moreover, we describe a novel family of mobile elements, herein termed Bpu sequences/elements, more than 500 variants of which are distributed across the Cycas mtDNA. Our evolutionary analysis further reveals that, within seed plants, the mtDNA of Cycas shows higher substitution rates for protein-coding genes than in other known plants. The evolutionary origin and invasion/duplication mechanisms of the Bpu sequences in Cycas mtDNA are hypothesized and discussed.
| Materials and Methods |
|---|
Determination of the Complete mtDNA Sequence of C. taitungensis
Young leaves (less than 10 days old) were collected from an 8-year-old C. taitungensis tree grown in the greenhouse of the Academia Sinica. Intact mitochondria were isolated using the method described by Kadowaki et al. (1996)
2–3 kb with a Hydroshear device (Genomic Solutions Inc., Ann Arbor, MI), and then directly cloned into the EcoRV site of the pBluescriptSK vector to generate a shotgun library. Shotgun clones were sequenced as previously described (Wang et al. 2007
Sequence Data Analysis
Annotation of Protein-Coding, rRNA, and tRNA Genes
A database search was carried out using the National Center for Biotechnology Information's (NCBI) Web-based Blast service (http://www.ncbi.nlm.nih.gov/BLAST/), and genes with e-values smaller than 0.001 were selected. The exact gene and exon boundaries were determined by alignment of homologous genes from available and annotated plant mtDNAs (table 1). Multiple sequence alignments were performed using the MAP2 Web service (http://deepc2.psi.iastate.edu/aat/map2/map2.html). Alignments of both nucleotide sequences and the translated amino acid sequences of the protein-coding genes were manually inspected. The tRNA genes were annotated using the tRNAscan-SE program (Lowe and Eddy 1997).
ORF Finding and Intron Identification
Identification of open reading frames (ORFs) was performed with the Web-based NCBI ORF Finder (http://www.ncbi.nlm.nih.gov/projects/gorf/). The standard genetic code was applied. The intron types were identified on the basis of their sequences and secondary structures (Michel and Ferat 1995). The predicted introns were also verified by alignment of their sequences with orthologous sequences previously elucidated from other species.
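As a rough illustration of what an ORF search under the standard genetic code involves, the following Python sketch scans both strands in all three reading frames for stretches that run from an ATG to the next in-frame stop codon. It is not the NCBI ORF Finder algorithm; the function names and the `min_codons` threshold are assumptions made for this example only.

```python
# Minimal six-frame ORF scan (illustrative sketch, not NCBI ORF Finder).
STOPS = {"TAA", "TAG", "TGA"}          # stop codons of the standard genetic code
_COMP = str.maketrans("ACGT", "TGCA")  # complement table for unambiguous bases


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMP)[::-1]


def find_orfs(seq, min_codons=2):
    """Return (strand, frame, start, end) for each ATG...stop ORF.

    Coordinates are 0-based and end-exclusive on the scanned strand, and the
    stop codon is included. min_codons is a hypothetical length filter.
    """
    orfs = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # open a candidate ORF
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None                   # close it at the stop codon
    return orfs
```

A real annotation pipeline would additionally handle ambiguous bases, alternative start codons, and ORFs that span the origin of a circular molecule, none of which is attempted here.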
RNA Editing Sites
Putative RNA editing sites in protein-coding genes were predicted using the PREP-mt Web-based program (http://www.prep-mt.net/) (Mower 2005). To achieve a balanced trade-off between the number of false positive and false negative sites, the cutoff score (C-value) was set to 0.6 as suggested by the author. In addition, the 4 genes with the highest numbers of editing sites were verified by reverse transcription–polymerase chain reaction (RT–PCR) with gene-specific primers (supplementary table S7, Supplementary Material online).
Analysis of Repeats
Identification of repeats was carried out using the REPuter Web-based interface (http://bibiserv.techfak.uni-bielefeld.de/reputer/) (Kurtz et al. 2001). Both sequence directions (forward and reverse complement) were searched. The number of maximum computed repeats was set to 5,000. Overlapping repeat sequences were manually removed from each result. Information on tandem repeats was obtained using the Tandem Repeats Finder (http://tandem.bu.edu/trf/trf.html) (Benson 1999). Searches for repeated regions and tandem repeats were carried out on all elucidated plant mtDNAs.
Phylogenetic Analysis
The 22 protein-coding genes common to the 11 sampled mtDNAs (atp1, atp4, atp6, atp8, atp9, ccmB, cob, cox1, cox2, cox3, nad1, nad2, nad3, nad4, nad4L, nad5, nad6, nad9, mttB, rps3, rps4, and rps12) were extracted for phylogenetic analysis. The sequences were separately aligned and then concatenated. Gaps and stop codons were removed manually. Divergence of nucleotide sequence between each pair of taxa was estimated in terms of the numbers of substitutions per synonymous (Ks) or nonsynonymous (Ka) site, based on the Pamilo–Bianchi–Li method implemented in the MEGA 3.0 program (Kumar et al. 2004). The Neighbor-Joining trees reconstructed with the Ka values and Ks values were rooted at Chara. The number of bootstrap replicates was set to 500. All phylogenetic analyses and tree reconstructions were performed using MEGA 3.0.
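The concatenation and cleaning step described above was done manually, but its logic can be sketched in a few lines of Python. The example below joins per-gene codon alignments taxon by taxon and then drops every codon column in which any taxon carries a gap or a stop codon. The function names and the dictionary-based alignment format are assumptions for illustration, and each per-gene alignment is assumed to be in frame with a length that is a multiple of 3.

```python
# Concatenate per-gene alignments and drop gapped or stop-codon columns.
# Illustrative sketch only; the study performed this step by hand.
STOPS = {"TAA", "TAG", "TGA"}


def concatenate(alignments, taxa):
    """Join per-gene alignments (each a dict taxon -> aligned sequence)."""
    return {t: "".join(aln[t] for aln in alignments) for t in taxa}


def strip_gap_and_stop_codons(aln):
    """Keep only codon columns free of gaps and stop codons in every taxon."""
    taxa = list(aln)
    length = len(aln[taxa[0]])
    if any(len(aln[t]) != length for t in taxa):
        raise ValueError("all aligned sequences must have the same length")
    keep = []
    for i in range(0, length - length % 3, 3):
        codons = [aln[t][i:i + 3].upper() for t in taxa]
        if any("-" in c or c in STOPS for c in codons):
            continue                     # discard this codon column
        keep.append(i)
    return {t: "".join(aln[t][i:i + 3] for i in keep) for t in taxa}
```

For instance, with two toy genes for taxa `A` and `B`, a column where `B` is gapped and the shared terminal TAA column are both removed, leaving only the fully informative codons for the Ka/Ks estimation.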
| Results and Discussion |
|---|
Evolution of mtDNA Organization in Land Plants
Characteristics of Cycas mtDNA and Insights into the Evolution of Land Plant mtDNAs
The complete mtDNA of C. taitungensis is a circular molecule of 414,903 bp (fig. 1). Table 2, which compares the main features of mtDNAs from a charophyte (Chara vulgaris), 2 bryophytes (Marchantia and Physcomitrella), and 7 angiosperms, shows that the Cycas mtDNA is about 6-, 2.2-, and 4.0-fold larger than those of Chara, Marchantia, and Physcomitrella, respectively, but does not significantly differ from the average (414 ± 102 kb) of the previously elucidated angiosperm mtDNAs (P = 0.502). The A + T content of the Cycas mtDNA is 53.1%, the lowest among known algae and land plants. As further shown in table 2, the total numbers of protein- and tRNA-coding genes decrease from charophytes (39 and 26, respectively) to seed plants (29–40 and 17–27, respectively) (detailed in supplementary table S2, Supplementary Material online). In contrast, the numbers of rRNA gene species remain the same in all lineages, with the exception of obvious gene duplications in the Poaceae (grass family) and Beta lineages, in which some tRNA genes are also duplicated.
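For readers who wish to reproduce a base-composition figure such as the A + T content quoted above from a genome sequence, a minimal Python sketch follows. The function name is hypothetical, and the choice to ignore ambiguous bases (for example N) in the denominator is an assumption of this example rather than a statement of how the reported 53.1% was computed.

```python
def at_content(seq):
    """Return the A + T percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted; ambiguity codes
    such as N are ignored (an assumption of this sketch).
    """
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * (seq.count("A") + seq.count("T")) / counted
```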
Table 2 also shows that noncoding sequences (spacers, introns, and pseudogenes) account for 89.9% of the Cycas mtDNA sequence, consistent with the proportions found in other angiosperm mtDNAs (89.4 ± 3.1%). As land plants evolved from charophycean green algae (Chara, 9.3%), the closest living relatives of land plants (Karol et al. 2001
Repeated sequences that are present in the genome as multiple copies comprise approximately 15.1% of the Cycas mtDNA (table 2). The repeats, very few of which are over 2 kb long, are evenly distributed across the genome and mainly occur in the noncoding regions, including the intergenic spacers and introns (fig. 1B). As shown in table 2, most mtDNAs of land plants (except for rice) have 2–5 times more repeated sequences than the mtDNA of Chara, whereas more than a quarter of the rice mtDNA consists of repeated sequences. Among the sampled plant mtDNAs, that of Cycas contains the highest percentage of tandem repeats (4.97%; fig. 2A, detailed in supplementary table S3 [Supplementary Material online]); these include a novel family of mobile elements, the Bpu sequences/elements (see below), which are mainly found in 3- or 4-copy arrays (fig. 2B). Ogihara et al. (2005) noted the presence of many repeats in wheat mtDNA and hypothesized that alternative physical structures may be adopted by wheat mtDNA. Therefore, it seems logical to hypothesize that alternative circular mtDNAs in various recombinant forms might coexist in Cycas cells in vivo.
No group I intron was detected in the mtDNA of Cycas or the other 7 previously elucidated angiosperm mtDNAs (table 2). This observation is consistent with Knoop's (2004)
In contrast, we found 20–25 group II introns in the mtDNAs of the examined land plants (table 2), nearly double the number found in those of their sister group, the charophytes (13). Similar to the angiosperms, the 3 genes nad1, nad2, and nad5 in Cycas are disrupted by 5 group II intron sequences, which have brought the genes into trans-splicing arrangements. Studies by Malek et al. (1997) and Malek and Knoop (1998) concluded that trans-spliced group II introns had evolved from formerly cis-spliced introns before the emergence of hornworts. Most recently, this transition was placed even earlier, before the emergence of mosses (Groth-Malonek et al. 2005).
Gene Content and Evolution of Gene Loss in the mtDNAs of Land Plants
Our phylogenetic analysis using either nonsynonymous (Ka) or synonymous (Ks) substitutions of 22 mitochondrial protein-coding genes shared by the 11 studied plants generated identical tree topologies. Figure 3A shows the Neighbor-Joining trees reconstructed using the Ka and Ks values, respectively, with Chara being designated as the outgroup. The topologies of these 2 trees strongly indicate that, excluding the bryophytes Physcomitrella and Marchantia, the seed plants (including Cycas and the 7 angiosperms) and the angiosperms form nested monophyletic clades. Within the angiosperms, the monocots and eudicots constitute 2 distinct subclades. Note that the sister relationship of Physcomitrella and Marchantia must be treated with caution because the sampled seed plants share a long branch. In addition, a recent multigene analysis (Qiu et al. 2006) based on dense taxon sampling suggested that liverworts (including Marchantia) diverged before mosses (including Physcomitrella). The Ka- and Ks-derived branch lengths leading to Cycas are nearly equal (fig. 3A), whereas the Ka-based branch lengths for the other species are strikingly shorter than the corresponding Ks-based branches. Statistical analysis using a Z-test further indicated that the Ka value for the Cycas branch is higher than the average Ka of the 7 studied angiosperms (P < 0.05, Z-test). The elevated Ka in Cycas suggests that rapid evolution/divergence may have occurred in some of the protein-coding genes of the Cycas mtDNA. This result is consistent with the observation that some genes in the Cycas mtDNA contain abundant RNA editing sites (see below).
A total of 39 protein-coding genes were identified in the Cycas mtDNA, which is the highest gene number identified to date among the studied seed plant mtDNAs (fig. 3B). When the distributions of conserved genes in the mtDNA from the 11 sampled species (supplementary table S2, Supplementary Material online) are mapped to their respective branches in a maximum parsimony tree (fig. 3B; based on the data used in fig. 3A), it is possible then to estimate the time of loss of a particular gene. As shown in figure 3B, our analysis indicates that there have been at least 31 independent events of gene loss from all the land plant mtDNAs elucidated to date.
When a gene is missing from the mtDNA of a given species, it is generally believed that the original copy has been transferred to the nucleus, where it functions through cytosolic protein synthesis followed by transit peptide–assisted import back to the mitochondria (Adams and Palmer 2003; Knoop 2004). Frequent gene losses, especially of the ribosomal protein genes (30 of 34), appear to have occurred after the divergence of the angiosperm lineages approximately 150 MYA (Chaw et al. 2004). However, only 1 gene loss was observed in the Cycas lineage after it branched off from the common ancestor shared with angiosperms, approximately 300 MYA. Adams and Palmer (2003) suggested that angiosperm mtDNAs have experienced a recent evolutionary surge of loss and/or transfer of genes (primarily those encoding ribosomal proteins) to the nucleus. Our data give additional support to their contention but suggest that the mtDNA of Cycas appears to have been excluded from this surge. Our results suggest that the Cycas mtDNA tends to evolutionarily maintain its gene diversity and/or undergoes less gene transfer than angiosperm mtDNAs (Selosse et al. 2001; Adams et al. 2002; Adams and Palmer 2003). Coincidentally, among the approximately 40 published cpDNAs of seed plants, that of Cycas has also undergone the least gene loss (Wu et al. 2007, supplementary fig. 1 [Supplementary Material online]). Future work will be required to examine why the genomes of these 2 Cycas organelles appear to have frozen after the cycads' divergence from angiosperms, especially in comparison to mtDNAs from other major gymnosperm clades, such as Ginkgo, gnetophytes, and pines.
A Novel Family of Short Interspersed Mitochondrial Elements
Numerous Short Interspersed Mitochondrial Elements, Termed Bpu Sequences, Are Present in the Cycas mtDNA
Sequence analysis revealed that numerous copies of a 36-nt repeat, herein designated as a "Bpu sequence/element," are interspersed throughout the Cycas mtDNA (fig. 1B). Figure 4A shows the characteristic sequences of 500 Bpu elements having 0 to 4 mismatches. If up to a 7-nt mismatch to the dominant type is allowed, the total copy number of Bpu sequences increases to 512. The Bpu sequences feature 2 conserved terminal direct repeats (AAGG) and the recognition sequence for the restriction endonuclease, Bpu10I (CCTGAAGC; nt 15–21). These repeat elements/sequences do not appear to have coding potential.
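Counting copies "with up to k mismatches to the dominant type", as described above, amounts to a sliding-window Hamming-distance scan on both strands. The Python sketch below shows one way this could be done. It is not the procedure used in this study (repeat detection here relied on REPuter), the function names are hypothetical, and the 36-nt Bpu consensus itself is not reproduced in this text, so the consensus must be supplied by the caller.

```python
# Sliding-window search for near-matches to a consensus on both strands.
# Illustrative sketch; the consensus argument is supplied by the caller.
_COMP = str.maketrans("ACGT", "TGCA")


def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def scan_with_mismatches(genome, consensus, max_mismatches):
    """Return (position, strand, mismatches) for every window within the
    allowed Hamming distance of the consensus or its reverse complement.
    Positions are 0-based on the forward strand."""
    genome = genome.upper()
    consensus = consensus.upper()
    rc = consensus.translate(_COMP)[::-1]
    n = len(consensus)
    hits = []
    for i in range(len(genome) - n + 1):
        window = genome[i:i + n]
        for strand, motif in (("+", consensus), ("-", rc)):
            d = hamming(window, motif)
            if d <= max_mismatches:
                hits.append((i, strand, d))
    return hits
```

Raising `max_mismatches` (for example from 4 to 7 for a 36-nt motif) can only add hits, which mirrors the increase in copy number reported when the mismatch allowance is relaxed. The scan is quadratic in the worst case and does not handle indels, so it is suitable only as a conceptual aid.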
Because Bpu elements are extremely short in their lengths and terminal repeats and contain 2 terminal direct repeats rather than the inverted repeats found in plant miniature inverted-repeat transposable elements (MITEs), they are here classified as short interspersed mitochondrial elements (SIMEs). This distinguishes them from the MITEs and short interspersed nuclear elements (SINEs, e.g., Alu DNA repeats in the genomes of primates) that have been extensively reported from the genomes of plants and animals (Feschotte et al. 2002
The Bpu10I recognition site characteristic of the Bpu elements is highly conserved except at the very last base pair, that is, the 21st bp of the Bpu sequence (fig. 4A, lower chart). In contrast, the 5' terminal direct repeat and its downstream 6 nt (nt 5–9) tend to be more variable than the 3' terminal direct repeat. Enigmatically, the elements can form a secondary structure (fig. 4B) with a predicted free energy of –12.5 kcal/mol, as calculated by the MFOLD program (http://frontend.bioinfo.rpi.edu/zukerm/comments/FAQs.html). The significance of this secondary structure remains to be elucidated. Comparison of the Bpu element insertion sites in the Cycas mtDNA with corresponding sites in the mtDNAs of other species reveals that the target sites for transposition of Bpu elements are identical (within 1–2 mismatched base pairs) to the 5' terminal direct repeat (fig. 4C–F).
Bpu Sequences Distinguish Cycas from Other Cycads and Seed Plants
Bpu sequences are present exclusively in the noncoding regions of the Cycas mtDNA, that is, within the introns and intergenic spacers, with 1 exception: a Bpu sequence is present within the coding region of rrn18, which encodes the 18S rRNA (fig. 4C). Among the many mitochondrial rrn18 (mt-rrn18) genes available in GenBank, only those of Cycas (including Cycas revoluta, GenBank accession number: AB029356) and Ginkgo biloba (the only living species of the order Ginkgoales) have Bpu elements (fig. 4C). The Bpu insertion sites of the 2 Cycas taxa are orthologous (data not shown), whereas that of Ginkgo is different, indicating that the insertions of Bpu elements in rrn18 did not occur in the common ancestor of cycads and ginkgo. In sequenced mtDNAs, the oldest cpDNA-derived sequences (also termed mtpt) cluster, trnV(uac)-trnM(cau)-atpE-atpB-rbcL, is reported to have existed since the common ancestor of seed plants (Wang et al. 2007; fig. 3). Bpu elements are present in the Cycas mtpt-atpB (fig. 4D) and mtpt-rbcL (fig. 4E) sequences but not in the corresponding sequences of the 7 angiosperm mtDNAs elucidated to date (Wang et al. 2007), suggesting that these Bpu insertions took place after the split of the Cycas lineage from the other seed plants.
We further examined the occurrence of Bpu elements in 1 of the other 2 cycad families, Zamiaceae (including 1 species from each of Dioon, Macrozamia, and Zamia) by sequencing the exons 1 and 2 of their nad2 genes. Whereas the nad2.i1 gene of Cycas mtDNA contains 5 Bpu elements, such elements are absent from the nad2 genes of the 3 sampled Zamiaceae genera. However, we found 1, 2, and 6 Bpu elements in mitochondrial nad5.i1 of Cycas panzhihuaensis (GenBank accession number: AF43425) and mitochondrial nad1.i1 and cp rps19-rpl16 spacers/intron of C. revoluta (GenBank accession number: AY354955, AY345867), respectively. These data seem to suggest that no Bpu sequence has successfully invaded the mtDNAs of Cycadales genera other than Cycas.
A Bpu-like sequence with a 1-bp insertion and 89% similarity to the dominant type in Cycas mtDNA was retrieved from the coding sequence of the mitochondrial rps11 gene from the core eudicot, Weigela hortensis. Alignment of this Bpu-like sequence reveals that its predicted Bpu10I endonuclease recognition site differs from that of Cycas by 2 bp (gCTGAGt). Because this Bpu element-like sequence does not interrupt the reading frame of rps11 and because the mitochondrial rps11 gene of Weigela lacks a target duplication site, we do not consider that this Bpu-like element shares a common origin with the Bpu sequences of Cycas. We further postulate that Bpu elements are likely absent from or very rare in angiosperms.
Surprisingly, the cpDNA of Cycas also contains 2 Bpu elements, located in the petN-psbM and psbA-trnK spacers, respectively (see Wu et al. 2007). Additionally, we identified 1 Bpu sequence in the atpB-rbcL spacer of the cpDNA from Pinus luchuensis (GenBank accession number: DQ196799). However, no Bpu sequence has been detected in the cpDNAs of the 3 other Pinaceae genera and 3 gnetophyte orders we have sequenced to date (Chaw SM, Wu CS, Lai YT, Wang YN, Lin CP, Liu SM, unpublished data). Collectively, the available evidence inclines us to believe that the Bpu sequences have proliferated specifically in the mtDNA of Cycas (or the Cycadaceae). The sporadic occurrence of Bpu elements in the cpDNAs of Cycas and Pinus suggests that they are likely derived from nonhomologous recombination with DNA fragments that leaked out of mitochondria in the former case and via lateral transfer in the latter case.
The Tai Sequence and Its Association with Bpu Sequences
The second intron of the rps3 gene (rps3.i2) is only found in the mtDNAs of Cycas (Regina et al. 2005), including those studied in the present work. Regina et al. (2005) suggested that rps3.i2 is a group II intron that was independently gained in the gymnosperms likely at or just after the divergence of the angiosperms. Moreover, the authors reported a high similarity between a partial segment of the Cycas rps3.i2 and orf760 of the Chara mtDNA (Turmel et al. 2003). Orf760 harbors functional domains for a maturase and a reverse transcriptase, prompting Regina et al. (2005) to propose that the Cycas rps3.i2 gene originally encoded a maturase and a reverse transcriptase but evolved over time into a partially degenerated ORF. Here, we further report that rps3.i2 of the Cycas mtDNA contains a 900-bp fragment comprising an array of 4 Bpu elements with the Bpu elements on each end lacking 1 terminal repeat, followed by a 440-bp fragment (designated as "Tai" sequence), an orf760 sequence, and 1 perfect Bpu sequence (fig. 5A). Most intriguingly, additional Tai sequences are scattered throughout the Cycas mtDNA; they occur more densely in longer spacers than in shorter ones (fig. 5B) and are generally found in close association with repeated Bpu sequences. These findings lend additional, robust support to the proposal of Regina et al. (2005).
In conclusion, we hypothesize that a Tai sequence and an orf760 flanked by a Bpu sequence at each end most likely constitute an ancestral retrotransposon. This hypothesis is founded on 2 observations: 1) Tai sequences are highly associated with Bpu elements and 2) small segments of Tai sequences are vigorously and diversely rearranged, deleted, or truncated as illustrated in the 3 examples shown in figure 5B. These observations also suggest that an ancestral retrotransposon in Cycas mtDNA spread very actively, presumably after Cycas branched off from the 10 other extant cycad genera. Afterward, the offspring or duplicates of the ancestral transposon gradually may have lost their mobility through varying degrees of deletions/truncations from the 3' region, which encoded both maturase and reverse transcriptase functions.
We further speculate that even after the ancestral retrotransposon lost its transposon function, the remaining Bpu elements might have retained the ability to proliferate or amplify via unequal crossing-over or slipped-strand mispairing because their 2 direct terminal repeats can pair with each other's complementary strands. Future studies and sequencing of additional mtDNAs from basal Cycas will be required to confirm this hypothesis and may provide additional insight into the evolutionary origins of Bpu sequences and the molecular mechanisms underlying their amplification.
The Fates of CpDNA-Derived Sequences (mtpts) in Cycas mtDNA
Table 2 shows a comparative analysis of mtpts among the 11 studied plant mtDNAs. The total percentage of mtpts in Cycas (4.4%) is relatively high compared with that in dicots (1.1–3.6%) but falls within the range seen among monocots (3.0–6.3%). We previously discovered that the frequency of mtpt transfer is positively correlated with variations in mtDNA size (coefficient value r² = 0.47) (Wang et al. 2007). Here, we report that the Cycas mtDNA contains 8 protein-coding genes, as well as 2 rRNA and 5 tRNA gene sequences originating from the cpDNA (fig. 1). However, frameshifts and indels within the protein-coding genes suggest that these Cycas mtpts have degenerated and are nonfunctional. In contrast, the 5 cpDNA-derived tRNAs in the Cycas mtDNA are able to fold into standard cloverleaf structures, as shown by tRNAscan-SE analysis (Lowe and Eddy 1997), and are thus likely to be functional. Previously, Sugiyama et al. (2005) used tRNAscan-SE to scan the tobacco mtDNA and concluded that the 6 cp-derived tRNAs shared by angiosperms are functional. Because no mtpt was observed in Chara, Marchantia, or Physcomitrella, Wang et al. (2007) concluded that frequent DNA transfer from cpDNA to mtDNA has taken place no later than in the common ancestor of seed plants, approximately 300 MYA.
Abundant RNA Editing Sites in Cycas mtDNA
Among the land plant mtDNAs, that of Cycas has the most predicted RNA editing sites (1,084 sites; supplementary table S4 [Supplementary Material online]). It is commonly believed that RNA editing arose together with the first terrestrial plants (Steinhauser et al. 1999). By using the PREP-mt software (Mower 2005) with the cutoff score set to 0.6, 1,084 sites within the protein-coding genes of the Cycas mtDNA were predicted to be C-to-U RNA editing sites. This is more than double the number of predicted sites within the elucidated mtDNAs of other land plants (table 2). If the cutoff score, which indicates the degree of conservation of each editing site compared with those found in the other published plant mtDNAs, is set to the most stringent criterion in PREP-mt (i.e., = 1), the number of editing sites decreases to 738, which is still the largest number found among the land plant mtDNAs elucidated to date.
It is believed that RNA editing is essential for functional protein expression as it is required to modify amino acids or generate new start or stop codons (Hoch et al. 1991; Wintz and Hanson 1991; Kotera et al. 2005; Shikanai 2006). For this reason, the large number of RNA editing sites in the Cycas mtDNA may indicate higher complexity at the DNA level and formation of various transcripts through RNA editing, thus potentially reflecting rapid divergence in Cycas.
Although RNA editing sites are known to be sporadically distributed in the genomes of plant organelles/mitochondria from seed plants, the mechanisms underlying this distribution pattern are not yet known (Shikanai 2006; Mulligan et al. 2007). Supplementary table S4 (Supplementary Material online) shows that within the Cycas mtDNA, complex I (which includes 9 nad genes) is the most extensively edited gene category. The nad4 and nad5 genes have the highest and second-highest numbers of editing sites, followed by the cox1 gene of complex IV. The least edited gene is rps10, which contains only 6 predicted sites even when the cutoff score is set to 0.6. Mulligan et al. (2007) interpreted the sporadic pattern as a result of lineage-specific loss of editing sites through retroconversion, which could remove adjacent editing sites by replacement with the edited sequences. The abundant RNA editing sites in the Cycas mtDNA, however, are mysterious and completely deviate from the sporadic patterns reported in other seed plant mtDNAs. Therefore, the mechanism of RNA editing in Cycas mtDNA may have a different evolutionary history from those of other seed plants.
Previous studies (Steinhauser et al. 1999; Lurin et al. 2004; Shikanai 2006) led to the proposal that variation or multiplication of trans-acting factors (e.g., members of the pentatricopeptide repeat family) may allow rapid increases in the number of editing sites. This could possibly be correlated to the lineage-specific explosion of RNA editing sites observed in Cycas mtDNA. Furthermore, Handa's (2003) conclusion that the evolutionary speed in RNA editing is higher at the level of gene regulation than at the primary gene sequence level appears to provide additional support for our speculation. To improve understanding of the mechanisms involved in abundant RNA editing sites in Cycas mtDNA, critical analysis of mtDNAs from ferns, conifers, and other cycads will be required.
In all plant lineages, RNA editing often alters the amino acid sequence (Adams and Palmer 2003). Supplementary table S5 (Supplementary Material online) summarizes the number of predicted preediting codons and potential edited codons in the Cycas mtDNA. The 2 most frequently edited codons are TCA (Ser) and TCT (Ser), which are predicted to be edited into TTA (Leu) and TTT (Phe), respectively. The least frequently edited codon is CAA (Gln), which would putatively be edited in only 2 cases, wherein the codon's first position would be altered to yield the stop codon, TAA. Supplementary table S6 (Supplementary Material online) further depicts the formation of 4 start (from ACG to ATG) and 7 stop codons (from CGA to TGA) through RNA editing. To verify the accuracy of the editing sites predicted using PREP-mt, partial transcripts of the cox1, atp1, atp6, and ccmB genes were experimentally assayed using RT-PCR. Comparison of our RT-PCR sequences with the editing sites predicted by PREP-mt indicated accuracy values of 97.66%, 99.13%, 97.91%, and 99.25% for cox1, atp1, atp6, and ccmB, respectively.
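The codon changes listed above can be checked mechanically. The following Python sketch applies a single C-to-U edit (written as C to T at the DNA level) to a codon and translates the codon before and after editing with the standard genetic code. The function name and return format are choices made for this illustration and are unrelated to how PREP-mt works internally.

```python
# Effect of a single C-to-U edit on a codon under the standard genetic code.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def edit_codon(codon, pos):
    """Apply a C-to-U edit at 0-based position pos of a DNA-alphabet codon.

    Returns (edited_codon, amino_acid_before, amino_acid_after), with '*'
    denoting a stop codon. Raises ValueError if the base is not a C.
    """
    codon = codon.upper()
    if codon[pos] != "C":
        raise ValueError("C-to-U editing requires a C at the edited position")
    edited = codon[:pos] + "T" + codon[pos + 1:]
    return edited, CODON_TABLE[codon], CODON_TABLE[edited]
```

Under this sketch, `edit_codon("TCA", 1)` gives the Ser-to-Leu change, `edit_codon("ACG", 1)` creates the ATG start codon, and `edit_codon("CGA", 0)` and `edit_codon("CAA", 0)` create the TGA and TAA stop codons described in the text.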
Our RT-PCR analysis additionally identified a U-to-C editing type in atp1; this was missed in the prediction because detection of the U-to-C editing type is limited when using PREP-mt (Mower 2005). Unfortunately, programs for predicting RNA editing sites in nonprotein-coding genes, such as ribosomal RNA and tRNA genes, and for predicting silent editing sites (which alter the mRNA but do not change the translated amino acid) are not yet available. We are currently examining these additional editing types and the locations of such sites in the Cycas mtDNA.
No Large Repeats in Gymnosperms but Some in Angiosperms
In the mtDNAs of angiosperms, long repeated sequences vary in size from 2,427 bp in rapeseed to 127,600 bp in rice, and they show no homology to each other, implying that repeated sequences were independently acquired during the evolution from Marchantia to angiosperms (Sugiyama et al. 2005). In Cycas mtDNA, we only detected a few long repeated sequences, all less than 1.5 kb in length and mostly composed of Bpu sequences. It has been demonstrated that genes are continuously transferred from the nuclear genome and cpDNA to mtDNA (Knoop 2004). However, the mechanisms underlying the emergence of large repeats are not yet fully understood.
| Conclusions |
|---|
The complete Cycas mtDNA shows a number of unprecedented features that are atypical of the mtDNAs previously elucidated for other land plants, including the lowest A + T content, the highest proportion of tandem repeat sequences (mainly Bpu sequences) and gene number found so far, abundant RNA editing sites, and an exceptionally elevated Ka value for the protein-coding genes. The latter 2 features might be correlated. In comparison with the other known angiosperm organelle genomes, the cpDNA and mtDNA of Cycas have experienced the least gene loss. Peculiarly, the cpDNA of Cycas has the fewest RNA editing sites among land plants (Wu et al. 2007
A novel family of SIMEs, designated as Bpu elements/sequences, represents another unique feature of the Cycadaceae lineage. Bpu elements are widely distributed throughout the noncoding regions of Cycas mtDNA (over 500 copies), with only 1 occurrence in a coding region (rrn18). In the Cycas mtDNA, the highly conserved nature of Bpu sequences and their association with Tai sequences as well as orf760 (located in rps3.i2) suggest that these sequences may have collectively originated from an ancestral retrotransposon. Further investigation into the origin, proliferation, and evolution of Bpu sequences will be desirable to help clarify the peculiar features of Cycas mtDNA.
| Supplementary Material |
|---|
|
|
|---|
Supplementary tables S1–S7 and figure 1 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
| Acknowledgements |
|---|
|
|
|---|
This work was supported by joint research grants from the Research Center for Biodiversity, Academia Sinica, to S.-M.C. and D.W., and partially by a National Science Council grant to S.-M.C. (94-2311-B001-059) and by a grant from the Institute of Information Science, Academia Sinica, to A.C.-C.S. We thank Chiao-Lei Cheng for sequencing the RT-PCR products. We are grateful to the 3 anonymous reviewers, who provided critical comments and valuable suggestions.
| Footnotes |
|---|
William Martin, Associate Editor
| References |
|---|
|
|
|---|
Adams KL, Palmer JD. Evolution of mitochondrial gene content: gene loss and transfer to the nucleus. Mol Phylogenet Evol (2003) 29:380–395.
Adams KL, Qiu YL, Stoutemyer M, Palmer JD. Punctuated evolution of mitochondrial gene content: high and variable rates of mitochondrial gene loss and transfer to the nucleus during angiosperm evolution. Proc Natl Acad Sci USA (2002) 99:9905–9912.
Benson G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res (1999) 27:573–580.
Bergthorsson U, Richardson AO, Young GJ, Goertzen LR, Palmer JD. Massive horizontal transfer of mitochondrial genes from diverse land plant donors to the basal angiosperm Amborella. Proc Natl Acad Sci USA (2004) 101:17747–17752.
Chaw SM, Chang CC, Chen HL, Li WH. Dating the monocot-dicot divergence and the origin of core eudicots using whole chloroplast genomes. J Mol Evol (2004) 58:424–441.
Chaw SM, Walters TW, Chang CC, Hu SH, Chen SH. A phylogeny of cycads (Cycadales) inferred from chloroplast matK gene, trnK intron, and nuclear rDNA ITS region. Mol Phylogenet Evol (2005) 37:214–234.
Chaw SM, Zharkikh A, Sung HM, Lau TC, Li WH. Molecular phylogeny of extant gymnosperms and seed plant evolution: analysis of nuclear 18S rRNA sequences. Mol Biol Evol (1997) 14:56–68.
Cho Y, Qiu YL, Kuhlman P, Palmer JD. Explosive invasion of plant mitochondria by a group I intron. Proc Natl Acad Sci USA (1998) 95:14244–14249.
Clifton SW, Minx P, Fauron CM, et al, (13 co-authors). Sequence and comparative analysis of the maize NB mitochondrial genome. Plant Physiol (2004) 136:3486–3503.
Feschotte C, Zhang X, Wessler SR. Miniature inverted-repeat transposable elements and their relationship to established DNA transposons (2002) Washington (DC): ASM Press.
Groth-Malonek M, Pruchner D, Grewe F, Knoop V. Ancestors of trans-splicing mitochondrial introns support serial sister group relationships of hornworts and mosses with vascular plants. Mol Biol Evol (2005) 22:117–125.
Handa H. The complete nucleotide sequence and RNA editing content of the mitochondrial genome of rapeseed (Brassica napus L.): comparative analysis of the mitochondrial genomes of rapeseed and Arabidopsis thaliana. Nucleic Acids Res (2003) 31:5907–5916.
Hoch B, Maier RM, Appel K, Igloi GL, Kossel H. Editing of a chloroplast mRNA by creation of an initiation codon. Nature (1991) 353:178–180.
Jones DL. Cycads of the World—ancient plant in today's landscape (2002) 2nd ed. Sydney (Australia): Reed New Holland.
Kadowaki K, Kubo N, Ozawa K, Hirai A. Targeting presequence acquisition after mitochondrial gene transfer to the nucleus occurs by duplication of existing targeting signals. EMBO J (1996) 15:6652–6661.
Karol KG, McCourt RM, Cimino MT, Delwiche CF. The closest living relatives of land plants. Science (2001) 294:2351–2353.
Knoop V. The mitochondrial DNA of land plants: peculiarities in phylogenetic perspective. Curr Genet (2004) 46:123–139.
Kotera E, Tasaka M, Shikanai T. A pentatricopeptide repeat protein is essential for RNA editing in chloroplasts. Nature (2005) 433:326–330.
Kubo T, Nishizawa S, Sugawara A, Itchoda N, Estiati A, Mikami T. The complete nucleotide sequence of the mitochondrial genome of sugar beet (Beta vulgaris L.) reveals a novel gene for tRNA(Cys)(GCA). Nucleic Acids Res (2000) 28:2571–2576.
Kumar S, Tamura K, Nei M. MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief Bioinform (2004) 5:150–163.
Kurtz S, Choudhuri JV, Ohlebusch E, Schleiermacher C, Stoye J, Giegerich R. REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res (2001) 29:4633–4642.
Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res (1997) 25:955–964.
Lurin C, Andres C, Aubourg S, et al, (19 co-authors). Genome-wide analysis of Arabidopsis pentatricopeptide repeat proteins reveals their essential role in organelle biogenesis. Plant Cell (2004) 16:2089–2103.
Malek O, Brennicke A, Knoop V. Evolution of trans-splicing plant mitochondrial introns in pre-Permian times. Proc Natl Acad Sci USA (1997) 94:553–558.
Malek O, Knoop V. Trans-splicing group II introns in plant mitochondria: the complete set of cis-arranged homologs in ferns, fern allies, and a hornwort. RNA (1998) 4:1599–1609.
Michel F, Ferat JL. Structure and activities of group II introns. Annu Rev Biochem (1995) 64:435–461.
Mower JP. PREP-Mt: predictive RNA editor for plant mitochondrial genes. BMC Bioinformatics (2005) 6:96.
Mulligan RM, Chang KL, Chou CC. Computational analysis of RNA editing sites in plant mitochondrial genomes reveals similar information content and a sporadic distribution of editing sites. Mol Biol Evol (2007) 24:1971–1981.
Notsu Y, Masood S, Nishikawa T, Kubo N, Akiduki G, Nakazono M, Hirai A, Kadowaki K. The complete sequence of the rice (Oryza sativa L.) mitochondrial genome: frequent DNA sequence acquisition and loss during the evolution of flowering plants. Mol Genet Genomics (2002) 268:434–445.
Oda K, Kohchi T, Ohyama K. Mitochondrial DNA of Marchantia polymorpha as a single circular form with no incorporation of foreign DNA. Biosci Biotechnol Biochem (1992) 56:132–135.
Ogihara Y, Yamazaki Y, Murai K, et al, (14 co-authors). Structural dynamics of cereal mitochondrial genomes as revealed by complete nucleotide sequencing of the wheat mitochondrial genome. Nucleic Acids Res (2005) 33:6235–6250.
Qiu YL, Li L, Wang B, et al, (13 co-authors). The deepest divergences in land plants inferred from phylogenomic evidence. Proc Natl Acad Sci USA (2006) 103:15511–15516.
Regina TM, Picardi E, Lopez L, Pesole G, Quagliariello C. A novel additional group II intron distinguishes the mitochondrial rps3 gene in gymnosperms. J Mol Evol (2005) 60:196–206.
Selosse M, Albert B, Godelle B. Reducing the genome size of organelles favours gene transfer to the nucleus. Trends Ecol Evol (2001) 16:135–141.
Shikanai T. RNA editing in plant organelles: machinery, physiological function and evolution. Cell Mol Life Sci (2006) 63:698–708.
Steinhauser S, Beckert S, Capesius I, Malek O, Knoop V. Plant mitochondrial RNA editing. J Mol Evol (1999) 48:303–312.
Stevenson DW. Morphology and systematics of the Cycadales. Mem N Y Bot Gard (1990) 57:8–15.
Stewart CN Jr, Via LE. A rapid CTAB DNA isolation technique useful for RAPD fingerprinting and other PCR applications. Biotechniques (1993) 14:748–750.
Sugiyama Y, Watase Y, Nagase M, Makita N, Yagura S, Hirai A, Sugiura M. The complete nucleotide sequence and multipartite organization of the tobacco mitochondrial genome: comparative analysis of mitochondrial genomes in higher plants. Mol Genet Genomics (2005) 272:603–615.
Terasawa K, Odahara M, Kabeya Y, Kikugawa T, Sekine Y, Fujiwara M, Sato N. The mitochondrial genome of the moss Physcomitrella patens sheds new light on mitochondrial evolution in land plants. Mol Biol Evol (2007) 24:699–709.
Turmel M, Otis C, Lemieux C. The mitochondrial genome of Chara vulgaris: insights into the mitochondrial DNA architecture of the last common ancestor of green algae and land plants. Plant Cell (2003) 15:1888–1903.
Unseld M, Marienfeld JR, Brandt P, Brennicke A. The mitochondrial genome of Arabidopsis thaliana contains 57 genes in 366,924 nucleotides. Nat Genet (1997) 15:57–61.
Wang D, Wu YW, Chun-Chieh Shih A, Wu CS, Wang YN, Chaw SM. Transfer of chloroplast genomic DNA to mitochondrial genome occurred at least 300 MYA. Mol Biol Evol (2007) 24:2040–2048.
Wasserman WW, Sandelin A. Applied bioinformatics for the identification of regulatory elements. Nat Rev Genet (2004) 5:276–287.
Wintz H, Hanson MR. A termination codon is created by RNA editing in the petunia atp9 transcript. Curr Genet (1991) 19:61–64.
Wu CS, Wang YN, Liu SM, Chaw SM. Chloroplast genome (cpDNA) of Cycas taitungensis and 56 cp protein-coding genes of Gnetum parvifolium: insights into cpDNA evolution and phylogeny of extant seed plants. Mol Biol Evol (2007) 24:1366–1379.
orf760 and the 4 tandem Bpu sequences is designated as a Tai sequence (gray). Arrowheads indicate a Bpu sequence lacking the 2 terminal repeats (banded bars) and its complementary sequence. (B) Upper: association of Tai sequences with Bpu tandem repeats across the genome (abscissa) and their respective lengths (ordinate). Lower: examples of 3 Tai variants (boxed) showing that the sequences are variously degenerate and truncated as compared with the typical Tai shown in (A) (broken line indicates mismatching sequences). Thin arrows indicate the orientations of homologous segments between Tai variants and the typical Tai.