

MBE Advance Access originally published online on September 13, 2006
Molecular Biology and Evolution 2006 23(12):2405-2412; doi:10.1093/molbev/msl112

© The Author 2006. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org

Research Articles

Evidence for Transit Peptide Acquisition through Duplication and Subsequent Frameshift Mutation of a Preexisting Protein Gene in Rice

Minoru Ueda*,†, Masaru Fujimoto†, Shin-ichi Arimura†, Nobuhiro Tsutsumi† and Koh-ichi Kadowaki*

* Genetic Diversity Department, National Institute of Agrobiological Sciences, Tsukuba, Ibaraki, Japan
† Laboratory of Plant Molecular Genetics, Graduate School of Agricultural and Life Sciences, University of Tokyo, Bunkyo-ku, Tokyo, Japan

E-mail: kadowaki@affrc.go.jp.


    Abstract
Many proteins synthesized in the cytosol are delivered to their appropriate compartments in the cell by specific targeting signals. Here, we provide new insight into the generation of the chloroplast-targeting signal (called the transit peptide) in rice. First, we identified the mitochondrial ribosomal protein L13 (mt rpl13) gene on chromosome 5. Downstream of the gene, we identified a DNA fragment of 266 bp: a segment within a duplication of mt rpl13. The duplicated region was transcribed and found to encode an open reading frame (ORF) of 160 amino acids (aa) (orf160). The orf160 gene comprises C-terminal 60 aa derived from the mt rpl13 gene and N-terminal 100 aa derived from another duplicated fragment of a pentatricopeptide repeat (ppr)564 gene that encodes 564 aa with ppr motifs on chromosome 1. Examination of the localization of the ORF160 protein tagged with green fluorescent protein (GFP) showed that it is targeted to the chloroplasts. As such, ORF160 clearly contains a transit peptide. Interestingly, this was translated from the alternative reading frame of the duplicated fragment of ppr564. To confirm this, the reading frame of the ppr564 gene was shifted according to that of the orf160 gene, and the frameshifted ppr564 sequence was fused to the gene for GFP. The expressed GFP-fused protein was also located in the chloroplasts. These results provide clear evidence for the generation of the transit peptide through duplication and subsequent frameshifting of a reading frame of a preexisting protein gene. We also demonstrate the importance of sequence redundancy and frameshift mutation in this evolutionary process.

Key Words: frameshift mutation • transit peptide • plastid • mitochondria • ribosomal protein • rice


    Introduction
It is generally accepted that mitochondria and chloroplasts are descendants of α-proteobacteria and cyanobacteria, respectively. Most of the genes in the ancestral endosymbiont have either been translocated to the nuclear genome of the host cell or have been lost during evolution after the initial endosymbiotic event (Gray 1992; Martin 2003). Although such gene transfers are important genetic events in evolution, little is known about the process.

The complete mitochondrial genome has been identified for about 70 animal species including such phyla as Chordata, Arthropoda, Mollusca, and Nematoda. With few exceptions, a typical animal mitochondrial genome contains 13 protein genes (Wolstenholme 1992; Boore and Brown 1998). On the other hand, only 7 higher plant mitochondrial genomes have been completely sequenced to date (Unseld et al. 1997; Kubo, Nishizawa, et al. 2000; Notsu et al. 2002; Handa 2003; Clifton et al. 2004; Ogihara et al. 2005; Sugiyama et al. 2005). They are Arabidopsis thaliana, sugar beet (Beta vulgaris), rice (Oryza sativa), rapeseed (Brassica napus), maize (Zea mays), tobacco (Nicotiana tabacum), and wheat (Triticum aestivum). Despite the limited number of completely sequenced mitochondrial genomes, higher plants show great divergence regarding protein gene content among species. These variations in gene content between plant species strongly suggest that gene transfer from the mitochondrial to the nuclear genome is an ongoing process in higher plants (Brennicke et al. 1993), whereas gene transfer is largely complete in animal mitochondrial genomes. Several gene transfer events from the mitochondrial to the nuclear genome have been discovered recently in higher plants, particularly in angiosperms (Adams and Palmer 2003).

During the process of gene transfer from the organelle to the nucleus, transferred genes must undergo several steps. These include gene translocation, acquisition of a promoter and a targeting signal, and elimination of the original sequence from the organelle genome (Brennicke et al. 1993). In those steps, how did the transferred genes acquire the targeting signal? Until now, several mechanisms for the acquisition of the mitochondrial targeting signal (called the presequence) have been revealed by the analysis of the newly transferred mitochondrial genes in angiosperms (Adams and Palmer 2003). The first is the duplication of an existing presequence, as in genes for rice rps11 (Kadowaki et al. 1996), maize rpl5 (Sandoval et al. 2004), and Arabidopsis sdh3 (Adams et al. 2001). The second is the acquisition of a presequence derived from a fragment unrelated to any targeting signal, for example, the accumulation of point mutations in the coding sequence of the transferred gene without the acquisition of an obvious N-terminal extension as a presequence, such as in the rice rps10 gene (Kubo, Jordana, et al. 2000). Likewise, in the potato, coxI gene presequences were generated by the duplication and subsequent accumulation of point mutations in the sequence resembling the protein structure of the presequence (Long et al. 1996). Experimental precedent that sequences having no relationship with targeting signals can function as targeting signals has also been reported (Baker and Schatz 1987; Lucattini et al. 2004). The third mechanism is the acquisition of the presequence by alternative splicing in rice and maize rps14 genes (Figueroa et al. 1999; Kubo et al. 1999).

Compared with plant mitochondrial genomes, the number and order of genes encoded in chloroplast genomes are well conserved among species in higher plants. It is therefore assumed that gene transfer from the chloroplast genome is less active than that from the mitochondrial genome, and examples of recent gene transfers during higher plant evolution are very limited. The mechanisms of chloroplast-targeting signal (called the transit peptide) acquisition are rarely reported in comparison with mitochondria. To the best of our knowledge, a mechanism analogous to the duplication of an existing presequence, as mentioned above, has been reported for the acquisition of transit peptides (McFadden 1999; Timmis et al. 2004), as in the genes for rice rps9 (Arimura et al. 1999) and maize glyceraldehyde-3-phosphate dehydrogenase (Quigley et al. 1988).

It is widely accepted that both duplications and point mutations in a gene may provide it with new functions. This idea is in good agreement with the above examples for the acquisition of targeting signals. Are there any other possible mechanisms for the acquisition of the targeting signal? Ohno first suggested that frameshift mutations, which are less frequent than duplications and point mutations, may be involved in alterations of function during evolution (Ohno 1970). Because whole-genome sequences and many expressed sequence tags (ESTs) in various species are now known, this question can now be addressed through comprehensive whole-genome analysis.

Frameshift mutations may provide a gene with altered functions. Until now, most reported functional divergences resulting from frameshift mutations were caused by alternative splicing or premature stop codons and were biased toward the C-terminal regions of genes (Raes and Van de Peer 2005). Here, we show that a sequence duplication followed by a frameshift mutation has allowed a gene to acquire a new function via the generation of a transit peptide at the N-terminal region. This provides new insight into the importance of gene duplication and frameshift mutations in evolution.


    Materials and Methods
Plant Material and Growth Conditions
Etiolated seedlings of rice (O. sativa L. var. nipponbare), bright yellow-2 (BY-2) tobacco (N. tabacum) suspension cells, and A. thaliana L. cv. Columbia were used as plant material.

BY-2 tobacco cell suspension cultures were grown in modified Murashige and Skoog medium enriched with 0.2 mg/l 2,4-dichlorophenoxyacetic acid and were maintained as described by Nagata et al. (1992). Arabidopsis thaliana plants were grown in a growth chamber at 22 °C with a short-day photoperiod (10 h/14 h light/dark cycle). Oryza sativa plants were grown in a growth chamber at 28 °C in the dark.

Database Analysis
We used the available rice genome database (http://rgp.dna.affrc.go.jp/) and the Blast programs (http://rapdb.lab.nig.ac.jp/blast/index.html) in the Rice Annotation Project Database to conduct our analysis. The version we used was the International Rice Genome Sequencing Project (IRGSP) genome sequence (Build 4.0). The nucleotide sequences of the IRGSP genome sequence (Build 4.0) were obtained from the Web site (http://rapdownload.lab.nig.ac.jp/index.html). Subcellular localization of the protein was predicted by TargetP (Emanuelsson et al. 2000) and Predotar version 0.5 (http://urgi.infobiogen.fr/predotar/predotar.html). Protein motif analyses were carried out using Pfam (http://www.sanger.ac.uk/Software/Pfam/index.shtml).

Construction of Green Fluorescent Protein Fusion Proteins
The DNA sequence of the rice mitochondrial rpl13 (mt rpl13) gene was amplified by polymerase chain reaction (PCR) using the full-length cDNA clone AK108395 as a template and primers P1 (5'-GGCGTCGACCATGAAAGCACTTG-3') and P2 (5'-GCCCATGGATGCAGTGATCTCGG-3'), which contain a SalI and an NcoI site, respectively. The DNA sequence of the open reading frame (orf)160 gene was amplified using the full-length cDNA clone AK061317 and primers P3 (5'-GAGCCCCTCGAGCATGCTGCCG-3') and P4 (5'-CATTTCAGTCCATGGATCCATTAAGCAC-3'), which were designed to contain an XhoI and an NcoI site, respectively. In the case of pentatricopeptide repeat (ppr)564 and frameshifted ppr564, each fragment was amplified from genomic DNA (Nipponbare) with primer pairs P5 (5'-CGAGGTCGACATGTGGCGGCGTTC-3') and P6 (5'-GGAGCCATGGGCGCTCGGGTTAAG-3') for ppr564, and P7 (5'-CCCCCTCGAGATGCCGGCGGC-3') and P8 (5'-CTCGCCATGGTCCACGACCATGTC-3') for frameshifted ppr564, which were engineered to contain restriction enzyme recognition sites. P5 and P7 include a SalI and an XhoI site, respectively, and P6 and P8 each contain an NcoI site. All restriction enzyme recognition sites in the primers are shown by underlines. All PCR products were cloned in-frame upstream of the gfp-coding sequence, which was regulated by the cauliflower mosaic virus (CaMV) 35S promoter and a 3' nopaline synthase transcription terminator (Chiu et al. 1996). The nucleotide sequence of the resultant plasmid was confirmed by DNA sequencing.
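As a quick consistency check on the primer design described above, the following sketch searches each primer for the recognition sequence of the enzyme it is stated to carry. The recognition sequences used (SalI, GTCGAC; XhoI, CTCGAG; NcoI, CCATGG) are the standard ones; the dictionary layout and the function names find_site and check_primers are illustrative assumptions and are not part of the original protocol.

```python
# Standard recognition sequences of the three enzymes named in the text.
SITES = {"SalI": "GTCGAC", "XhoI": "CTCGAG", "NcoI": "CCATGG"}

# Primer sequences (5'->3') as listed above, with the site each is said to contain.
PRIMERS = {
    "P1": ("GGCGTCGACCATGAAAGCACTTG", "SalI"),
    "P2": ("GCCCATGGATGCAGTGATCTCGG", "NcoI"),
    "P3": ("GAGCCCCTCGAGCATGCTGCCG", "XhoI"),
    "P4": ("CATTTCAGTCCATGGATCCATTAAGCAC", "NcoI"),
    "P5": ("CGAGGTCGACATGTGGCGGCGTTC", "SalI"),
    "P6": ("GGAGCCATGGGCGCTCGGGTTAAG", "NcoI"),
    "P7": ("CCCCCTCGAGATGCCGGCGGC", "XhoI"),
    "P8": ("CTCGCCATGGTCCACGACCATGTC", "NcoI"),
}


def find_site(primer, enzyme):
    """Return the 0-based position of the enzyme's site in the primer, or -1."""
    return primer.find(SITES[enzyme])


def check_primers():
    """Map each primer name to the position of its stated site (-1 if absent)."""
    return {name: find_site(seq, enz) for name, (seq, enz) in PRIMERS.items()}


if __name__ == "__main__":
    for name, pos in check_primers().items():
        enzyme = PRIMERS[name][1]
        status = "found at index %d" % pos if pos >= 0 else "NOT FOUND"
        print(name, enzyme, status)
```

Each of the eight primers contains the stated site, which stands in for the underlining that marked the sites in the printed article.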

Transient Expression Analysis of GFP
Plasmid DNA (10 µg) was precipitated onto 1.0-µm diameter spherical gold beads (Bio-Rad, Hercules, CA), and the beads were bombarded into suspension-cultured tobacco BY-2 cells or Arabidopsis rosette leaves using a PDS-1000 particle delivery system (Bio-Rad). After bombardment, the BY-2 cells and Arabidopsis rosette leaves were incubated for 6 h in the dark at 24 °C and in the light at 22 °C, respectively. Transformed BY-2 cells were treated with 500 nM MitoTracker Orange CMTMRos (Invitrogen, Carlsbad, CA) according to the manufacturer's manual. Green fluorescent protein (GFP) and MitoTracker fluorescence, as well as chlorophyll autofluorescence, were visualized with a confocal laser scanning microscope, as described by Arimura and Tsutsumi (2002). In the case of Arabidopsis rosette leaves, under our bombardment conditions, the plasmids were expressed only in epidermal cells because they could not penetrate palisade parenchymal cells (see fig. 3D). Using a confocal laser scanning microscope, GFP fluorescence was observed only in epidermal cells, whereas the autofluorescence of chlorophyll in the chloroplasts was observed in both epidermal and palisade parenchymal cells. Images of the autofluorescence of chlorophyll from both types of cells overlapped when examined.


FIG. 3.— Evaluation of protein localizations using GFP proteins. (A) Schematic representation of the chimeric gene structure. The colored boxes show the fused sequence upstream of the gene for GFP. Red, blue, and orange boxes correspond to the aa in each color in figure 2D. Exons, introns, and intergenic regions are indicated by boxes, thick horizontal lines, and thin horizontal lines, respectively. (B) Transient expression analysis of the GFP fusion protein in tobacco BY-2 suspension cells and Arabidopsis rosette leaf epidermal cells. Each recombinant plasmid shown in A was transformed into Arabidopsis rosette leaf (i) and (ii) and tobacco BY-2 cell suspensions (iii). Key: MitoTracker, MitoTracker Orange fluorescence; Chlorophyll, autofluorescence of chlorophyll; Merge, merging of GFP and MitoTracker Orange fluorescence or autofluorescence of chlorophyll; GFP, GFP fluorescence (scale bars, 10 µm). (C) Alignment of the deduced aa sequences from ORF160 (i, aa 1–82) and frameshifted PPR564 (ii, aa 1–90). Black backgrounds indicate aa that are identical between the 2. (D) Schematic representations of general leaf structure. Plasmids introduced to Arabidopsis rosette leaves by bombardment penetrated only the epidermal cells and did not penetrate the palisade parenchymal cells.

 

    Results
Identification of a Gene Encoding for mt rpl13 in the Nuclear Genome
The genome of the bacterium Rickettsia prowazekii is more closely related to that of mitochondria than to that of any other organism studied so far (Andersson et al. 1998; Muller and Martin 1999). After the original endosymbiosis, the ancestor of mitochondria must have transferred genes from its own genome to the host genome, or else some of its genes may have been lost. The gene for rpl13 is one such missing sequence. Instead, mt rpl13 is now found in the nuclear genomes of mammals (Koc et al. 2001) and yeasts (Grohmann et al. 1994).

Because no mt rpl13 gene has been identified in rice to date, we tried to identify it in the nuclear genome. A TBlastN search against the full-length rice database was conducted using the RPL13 amino acid (aa) sequence from R. prowazekii as a query. Two full-length cDNA clones (GenBank accession numbers AK108395 and AK099041) were obtained. The deduced aa from AK108395 and AK099041 showed 52% and 51% conserved aa similarity against R. prowazekii RPL13, respectively. On the other hand, the deduced aa from AK108395 and AK099041 showed 57% and 69% aa similarity against Synechocystis sp. PCC 6803 RPL13 (Kaneko et al. 1996), respectively. It is assumed that the chloroplast is the evolutionary descendant of a cyanobacterium and that Synechocystis sp. PCC 6803 is one such cyanobacterium. AK099041 showed higher deduced aa similarity against the cyanobacterium than did AK108395, although AK108395 and AK099041 showed almost the same aa similarity against R. prowazekii RPL13. These results suggest that AK099041 and AK108395 encode chloroplast RPL13 (cp RPL13) and mitochondrial RPL13 (mt RPL13), respectively.

The localization of each protein encoded by the 2 clones was predicted using 2 computer programs, TargetP (Emanuelsson et al. 2000) and Predotar version 0.5 (http://urgi.infobiogen.fr/predotar/predotar.html). Both programs predicted that AK108395 is a mt RPL13 protein and AK099041 is a cp RPL13 protein.

To confirm that the predicted mt RPL13 (AK108395) is a mitochondrial protein, the localization of RPL13 was monitored in tobacco BY-2 cells. The mt RPL13 was fused to the N terminus of the gene for GFP, and the fused protein (fig. 1A) was expressed under the control of the CaMV 35S promoter. The chimeric gene was bombarded into BY-2 cells cultured in suspension. Expression of the gene was monitored 6 h after its introduction by using confocal laser scanning microscopy. The locations of the chimeric GFP protein and the mitochondria stained by MitoTracker were visualized as green and red signals, respectively. If the GFP proteins were properly targeted into the mitochondrion, colocalization of the green and the red signals should appear as composite yellow signals. After the fluorescent signals from GFP and MitoTracker were merged, almost all GFP spots appeared yellow (fig. 1C), strongly suggesting that the chimeric GFP proteins were indeed localized in the mitochondria. Thus, it is likely that the full-length cDNA clone AK108395 carries a gene encoding for mt RPL13 protein. In the case of the full-length cDNA clone AK099041, GFP fused with AK099041 localized in the chloroplasts (data not shown). Therefore, the full-length cDNA clone AK099041 is predicted to encode cp RPL13.


FIG. 1.— Identification of the mitochondrial rpl13 gene. (A) Schematic representations of gene structures for mt rpl13 and orf160 on chromosome 5. Exons, introns, and intergenic regions are indicated by boxes, thick horizontal lines, and thin horizontal lines, respectively. Numbered arrows show tandemly duplicated regions of mt rpl13. The 4 pairs of arrows labeled by the same number show duplicated fragments. (B) Alignment of the deduced aa sequences from the mt RPL13 and ORF160 proteins corresponding to the second numbered arrows shown in (A). Black backgrounds indicate identical aa. (C) Transient expression analysis of mt RPL13. The gene for GFP was fused to a sequence downstream of mt rpl13 and introduced into tobacco BY-2 cells. The constructed plasmid is shown in (A). Localizations of marker molecules are shown as follows: MitoTracker, MitoTracker Orange fluorescence; Merge, merged images of GFP and MitoTracker Orange fluorescence; and GFP, GFP fluorescence (scale bars, 10 µm).

 
A Chimeric Gene Arose from Fusion of Parts of the rpl13 and ppr564 Genes
Examination of the rice genome database showed that the mt rpl13 (GenBank accession number AK108395) gene is on chromosome 5. Part of the mt RPL13 protein-coding sequence was duplicated independently, and the duplicated sequences were inserted into 5 sites as follows. Three duplicated fragments (266, 590, and 803 bp) were found 4.4, 6.8, and 5.2 kb downstream of mt rpl13, respectively. Part of the intronic sequence of mt rpl13 was also duplicated (fig. 1A). Part of the RPL13 protein-coding sequence was also duplicated and inserted into chromosomes 8 and 1. All exons excluding the first exon of mt rpl13, positions 8711353–8708626 on chromosome 5 (GenBank accession number AP008211), were identified at positions 3635621–3636814 on chromosome 8 (GenBank accession number AP008214). Part of the RPL13 protein-coding sequence, positions 8708626–8708978 on chromosome 5, was found at positions 40755626–40755992 on chromosome 1 (GenBank accession number AP008207).

Interestingly, a partial sequence (266 bp) of the mt rpl13 gene has been duplicated, and the duplicated segment is present 4.4 kb downstream of the original sequence. This duplicated region contains a transcriptional unit because a full-length cDNA (GenBank accession number AK061317) was identified (fig. 1A). This encodes an ORF for a 160 aa protein, which we named AK061317 orf160. Alternatively spliced transcripts were found in many transcriptional units by a BlastN search of the National Center for Biotechnology Information EST database, as well as the rice full-length cDNA database in the region encoding ORF160. It is possible that alternatively spliced transcripts from this orf160 region contain other ORFs that encode different polypeptides.

The aa from positions 101–160 of ORF160 show high identity with RPL13 (fig. 1B), whereas the N-terminal portion (positions 1–100) of ORF160 shows high identity with a sequence on chromosome 1. Around the above sequence on chromosome 1, 1 gene was predicted to encode 564 aa. We named this gene ppr564 (GenBank accession number AB244817). The Pfam program predicted that this gene contains 6 PPR motifs (Small and Peeters 2000) and 2 incomplete PPR motifs. Harr plot analysis was conducted to understand the evolutionary relationship between the orf160 gene on chromosome 5 and the ppr564 gene on chromosome 1 (fig. 2B and C). They showed 89% DNA sequence identity over 700 bp (fig. 2D). Also, the intronic DNA sequence and the flanking sequence at the 3' end of mt rpl13 and orf160 showed sequence similarity to the upstream sequences of ppr564 (fig. 2B). These results indicate that segmental duplication between chromosome 5 and chromosome 1 around ppr564 has happened in the past. On the other hand, there was no aa identity between the deduced aa sequences from the 2 genes because the translational reading frames were different (fig. 2D). In short, orf160 is a chimeric gene composed of part of ppr564 and part of mt rpl13. The reading frame of rpl13 is the same as that of orf160, but that of ppr564 has been altered.


FIG. 2.— Evolutional relationship of orf160, mt rpl13, and ppr564. (A) The locus of mitochondrial rpl13, chloroplast rpl13, orf160, and ppr564 on the rice chromosome. The arrows represent the positions for each gene. "//" is used to show a region skipped from this illustration. Mb shows Megabase. (B) Harr plot analysis of the sequence containing orf160 from chromosome 5 and its related sequences from chromosome 1. The horizontal axis represents 13 kb in the bacterial artificial chromosome (BAC) clone AC137606 from chromosome 5. The vertical axis represents 6 kb in the BAC clone AP003418 from chromosome 1. AC137606 and AP003418 are GenBank accession numbers. Dots are placed at locations in which more than 6 nt are continuously identical. The black and red boxes on the horizontal sequence show mt rpl13 and orf160, respectively. The orange box on the vertical sequence shows ppr564. Dotted lines denote homologous sequence regions between the chromosomes. (C) Schematic representations of homologous regions between orf160 and ppr564. Dotted lines correspond to (B). Exons, introns, and intergenic regions are indicated by boxes and by thick and thin horizontal lines, respectively. (D) Alignment of genomic DNA sequences around the translational initiation sites of orf160 and ppr564. Identical nucleotides are indicated by asterisks. The aa sequences from the ORF160, frameshifted PPR564, and PPR564 proteins are colored in red, blue, and orange, respectively, and were fused to GFP. The background of the aa at the junction of GFP is indicated as green-colored circles. In the case of ORF160, only 86 of the 160 aa for GFP fusion protein are shown. A solid open box in the sequence indicates a 21-bp deletion found in orf160.
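The dot rule stated in this legend (a dot wherever more than 6 nt are continuously identical) can be sketched as a word-matching procedure. The code below is a minimal illustration, not the program used for the figure: the function name harr_plot_dots, the default run length of 7, and the two short test sequences are assumptions made for the example.

```python
def harr_plot_dots(seq_x, seq_y, min_run=7):
    """Return (x, y) start positions (0-based) at which seq_x and seq_y
    share an identical run of at least min_run nucleotides.

    "More than 6 nt continuously identical" is taken here to mean an
    exact match of 7 nt starting at the reported position.
    """
    # Index every min_run-length word of seq_y by its start positions.
    index = {}
    for j in range(len(seq_y) - min_run + 1):
        index.setdefault(seq_y[j:j + min_run], []).append(j)

    # Look up every min_run-length word of seq_x in that index.
    dots = []
    for i in range(len(seq_x) - min_run + 1):
        for j in index.get(seq_x[i:i + min_run], []):
            dots.append((i, j))
    return dots


if __name__ == "__main__":
    # Made-up sequences that share only the 7-nt word CCGCCGA.
    horizontal = "AAACCGCCGATTT"
    vertical = "GGCCGCCGAGG"
    print(harr_plot_dots(horizontal, vertical))
```

Plotting the returned coordinates with the first sequence on the horizontal axis and the second on the vertical axis gives diagonal runs of dots wherever the two regions are homologous.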

 
A 21-bp deletion was found in orf160 when compared with ppr564 (shown by a solid open box in fig. 2D). A direct repeat of 5'-CCGCCGA-3' was found in ppr564, but this was not present in orf160 around the 21-bp deletion. This strongly suggests that the deletion event happened via recombination through the 5'-CCGCCGA-3' sequence after segmental duplication between chromosome 1 and chromosome 5. The partial duplication of the mt rpl13 gene before or after the partial duplication of ppr564 was also involved in the generation of a chimeric gene, orf160.

The Subcellular Localization of ORF160 and PPR564 Proteins
The above results strongly suggest that the first exon of orf160 on chromosome 5 was derived from ppr564 on chromosome 1. Homologues of ppr564 have been found in other species, such as A. thaliana and Z. mays. It is well known that PPR proteins in higher plants are usually located in the mitochondria or chloroplasts (Lurin et al. 2004). Both the TargetP and Predotar programs predicted that PPR564 is localized in the mitochondria and is equipped with a presequence within its N-terminal 35 aa. For this reason, the N-terminal 35 aa of PPR564 were fused to GFP (PPR564N–GFP) (fig. 3A(iii)).

On the other hand, because homologues of orf160 have not been found in other species, the function of ORF160 is unknown. Using TargetP, the subcellular localization of the ORF160 protein was predicted to be in the chloroplasts. However, the Predotar program could not predict the localization of the ORF160 protein, so we could not confirm from these predictions whether ORF160 has a targeting signal. Therefore, the full-length protein of ORF160 was used for the GFP-fused protein (ORF160-GFP) (fig. 3A(i)), and the localization of the protein was observed.

PPR564N–GFP was introduced into tobacco BY-2 cells cultured in suspension. This GFP-fused protein colocalized with mitochondria (fig. 3B(iii)), strongly suggesting that native PPR564 is targeted to the mitochondria. ORF160-GFP was introduced into Arabidopsis leaf epidermal cells. Using merged images of the autofluorescence of chlorophyll and of GFP fluorescence, the spots of GFP were found to colocalize with those of chlorophyll (fig. 3B(i)). This strongly suggests that ORF160 is located in the chloroplasts and contains a transit peptide.

These results suggest that, after the partial duplication of ppr564, translation occurred in a different reading frame of the duplicated sequence, resulting in a novel transit peptide in ORF160. The presequence of PPR564 is within 35 aa (basepair positions 1–105). This indicates that the transit peptide of ORF160 has no relationship with the presequence of PPR564 because the translational initiation codon of ORF160 started at the nucleotide corresponding to nucleotide position 161 of ppr564 (fig. 2D).
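The reading-frame argument in this paragraph can be checked with simple arithmetic. The sketch below assumes that nucleotide positions are 1-based and counted from the first nucleotide of the ppr564 initiation codon, as the text implies; the function and constant names are illustrative.

```python
def frame_offset(start_a, start_b):
    """Reading-frame offset (0, 1, or 2) between two 1-based start positions."""
    return (start_b - start_a) % 3


PPR564_START = 1       # first nucleotide of the ppr564 coding sequence
ORF160_START = 161     # ppr564 position matching the orf160 initiation codon
PRESEQUENCE_AA = 35    # length of the PPR564 presequence in amino acids

# Last nucleotide of the region encoding the presequence (35 aa x 3 nt).
presequence_end = PRESEQUENCE_AA * 3

if __name__ == "__main__":
    print("presequence spans nt 1-%d" % presequence_end)
    print("orf160 start lies downstream of it:", ORF160_START > presequence_end)
    print("frame offset relative to ppr564:", frame_offset(PPR564_START, ORF160_START))
```

A nonzero offset means that the two proteins are read from the shared DNA in different frames, and a start position beyond nucleotide 105 means that none of the presequence-coding region is included in ORF160.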

Generation of a Transit Peptide by Shifting a Protein Reading Frame
These results raise the possibility that a frameshifted reading frame of the current ppr564 gene may be able to encode a transit peptide. To verify this working hypothesis, part of ppr564 was frameshifted to correspond with that of orf160 (fig. 3A(ii)). We then analyzed the localization of the GFP-fused protein (frameshifted PPR564–GFP). The protein fused with GFP is shown by blue-colored aa in figure 2D. The alignment of the aa sequences of the frameshifted PPR564 (90 aa) and ORF160 (82 aa) is shown in figure 3C; the aa were 80% identical. The frameshifted PPR564–GFP in Arabidopsis leaves was clearly located in the chloroplasts (fig. 3B(ii)). This demonstrates directly that a peptide translated from the shifted reading frame of the ppr564 gene could play a role as a transit peptide. In summary, this result gives direct evidence that the transit peptide is translated from a sequence that has undergone duplication and a subsequent frameshift mutation of the partial sequence of a preexisting gene.
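To make concrete how the same DNA can encode unrelated peptides in different frames, the sketch below translates one sequence in two reading frames using the standard genetic code. The 15-nt sequence is made up for illustration and is not taken from ppr564 or orf160; the function name translate and the choice to stop at the first stop codon are assumptions of this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate dna starting at offset frame (0, 1, or 2).

    Translation stops at the first stop codon; codons containing
    characters other than A, C, G, T are rendered as 'X'.
    """
    dna = dna.upper()
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    toy = "ATGGAAGATGAGCTC"      # hypothetical sequence, illustration only
    print(translate(toy, 0))     # original frame
    print(translate(toy, 1))     # frame shifted by one nucleotide
```

In this toy case the original frame yields a peptide rich in acidic residues, whereas the shifted frame yields one with none, which mirrors in miniature the kind of change the experiment tests.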

The ORF160 protein is located in the chloroplasts, but it contains only a portion of the mt RPL13 protein (31%) and probably has no biological function, because a functional cp RPL13 protein is already encoded (AK099041) in the rice genome.


    Discussion
The Properties of Proteins Produced from Frameshifted Genes
Frameshift mutations may provide the resulting proteins with various functions. Here, we found evidence that orf160 acquired the sequence for a transit peptide from the frameshift mutation of a duplicated gene. A number of aa sequences for the N-terminal extensions of nuclear-encoded chloroplast proteins have been analyzed. The transit peptide found in higher plants contains a stretch of aa rich in hydroxylated aa such as serine and threonine. It also contains some basic aa but only a few acidic aa (von Heijne et al. 1989). The proportion of hydroxylated aa of the frameshifted protein (90 aa) fused to the GFP was deduced to be 21%, whereas that of the original PPR564 was deduced to be 12%. Furthermore, the PPR564 protein from aa 1–82 contains 23 acidic aa (Asp and Glu), but the frameshifted protein of the same region has no such acidic aa. Such alterations of the mutated sequence must change the inherent characteristics of the protein.
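The composition measures quoted above (proportion of hydroxylated aa and number of acidic aa) are simple residue counts. The sketch below shows one way to compute them, assuming that "hydroxylated" means serine and threonine only, as the examples in the text suggest; the peptides used are made up for illustration and are not the PPR564 or ORF160 sequences.

```python
HYDROXYLATED = set("ST")   # serine and threonine (assumption: Tyr not counted)
ACIDIC = set("DE")         # aspartate and glutamate


def hydroxylated_fraction(peptide):
    """Fraction of residues that are Ser or Thr (0.0 for an empty peptide)."""
    if not peptide:
        return 0.0
    return sum(aa in HYDROXYLATED for aa in peptide.upper()) / len(peptide)


def acidic_count(peptide):
    """Number of Asp and Glu residues in the peptide."""
    return sum(aa in ACIDIC for aa in peptide.upper())


if __name__ == "__main__":
    # Hypothetical peptides, for illustration only.
    for name, peptide in [("acidic-rich", "MEDEL"), ("hydroxyl-rich", "MSTSRK")]:
        print(name,
              "hydroxylated: %.0f%%" % (100 * hydroxylated_fraction(peptide)),
              "acidic:", acidic_count(peptide))
```

Applied to the 90-aa frameshifted sequence and to the corresponding PPR564 region, the same counts would reproduce the kind of comparison reported in this paragraph.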

The Role of Frameshift Mutations in the Evolutionary Divergence of Genes
Frameshift mutations are generally considered detrimental through the appearance of premature stop codons. However, the involvement of frameshift mutations in the divergence of some gene families in vertebrates and higher plants has been uncovered by an in silico analysis; among vertebrates, 16 examples (in 15 families) have led to functional diversity by frameshifting (Raes and Van de Peer 2005). It is often the case that frameshift mutations by in/del and alternative splicing are positioned in a particular region, especially the C terminus, resulting in the production of protein isoforms. In theory, although the genes influenced by frameshift mutations may gain a new function, they mostly possess a potential or "protofunction." In higher plants, the DEF/AP3 subfamily influences the formation of petals and stamens via B-function genes for MCM1, AGAMOUS, DEFICIENS and SRF (MADS) box proteins. The 2 lineages derived from the same origin can clearly be distinguished from their completely different C-terminal motifs. One motif is referred to as paleoAP3 and is found in lower eudicots, magnoliid dicots, monocots, and basal angiosperms, and the other is termed the euAP3 motif and is found in higher eudicots (Kramer et al. 1998). The paleoAP3 motif can still promote stamen formation, but, as a result of divergence, it has lost the ability to induce petal formation in higher eudicots (Lamb and Irish 2003). Interestingly, the functional difference between the 2 motifs has derived from a frameshift mutation resulting in an 8-bp in/del (Vandenbussche et al. 2003). Clearly, frameshift mutations have played key roles in the evolutionary divergence of genes.

Evidence for the Involvement of a Frameshift Mutation in the Generation of Transit Peptides
Gene families are generated by repeated duplications. For instance, the Arabidopsis genome includes at least 780 gene families, each with more than 5 members, and the rice genome has 824 such gene families (Horan et al. 2005). In addition to such gene families, segmental duplications are widespread in plants. For example, in the Arabidopsis and rice genomes, up to 90% and 62% of loci are duplicated, respectively. However, the importance of such segmental duplication is still under investigation.

The origin of the transit peptide of orf160 was a partially duplicated sequence from ppr564, which belongs to the PPR gene family. This family comprises 470 genes in Arabidopsis and 655 genes in rice. Although this gene family comprises only a few genes in vertebrates and yeast, there has been extraordinary expansion in plants. This indicates that the PPR gene family expanded dramatically after the divergence of plants from other life forms (Lurin et al. 2004). Genes encoding functional proteins must be under positive selection pressure. Functionally redundant genes are not protected against the accumulation of deleterious mutations: they become pseudogenes or evolve into other functional genes during evolution (Nowak et al. 1997). Pseudogenes have also been proposed to serve as a sequence pool for generating genetic diversity (Balakirev and Ayala 2003). In our study, a partial duplication of ppr564 became the transit peptide through a frameshift mutation. Thus, intriguingly, frameshift mutations do not necessarily have to be deleterious. We provide a novel mechanism of targeting signal acquisition, in which a frameshift mutation converts a fragment unrelated to any targeting signal into a transit peptide, and our results indicate very liberal requirements for transit peptide composition.


    Supplementary Material
 
The nucleotide sequence data reported in this paper have been deposited in the DNA Data Bank of Japan/European Molecular Biology Laboratory/GenBank database under accession number AB244817 (ppr564).


    Acknowledgements
 
We thank Dr Y. Niwa for providing the S65TGFP construct. This work was partly supported by a Grant-in-Aid for Scientific Research from the Ministry of Education, Science, Sports and Culture of Japan (grant 15208001) to N.T. and K.K.


    Footnotes
 
Geoffrey McFadden, Associate Editor


    References
 

    Adams KL and Palmer JD. (2003) Evolution of mitochondrial gene content: gene loss and transfer to the nucleus. Mol Phylogenet Evol 29:380–395.

    Adams KL, Rosenblueth M, Qiu YL, Palmer JD. (2001) Multiple losses and transfers to the nucleus of two mitochondrial succinate dehydrogenase genes during angiosperm evolution. Genetics 158:1289–1300.

    Andersson SG, Zomorodipour A, Andersson JO, Sicheritz-Ponten T, Alsmark UC, Podowski RM, Naslund AK, Eriksson AS, Winkler HH, Kurland CG. (1998) The genome sequence of Rickettsia prowazekii and the origin of mitochondria. Nature 396:133–140.

    Arimura S, Takusagawa S, Hatano S, Nakazono M, Hirai A, Tsutsumi N. (1999) A novel plant nuclear gene encoding chloroplast ribosomal protein S9 has a transit peptide related to that of rice chloroplast ribosomal protein L12. FEBS Lett 450:231–234.

    Arimura S and Tsutsumi N. (2002) A dynamin-like protein (ADL2b), rather than FtsZ, is involved in Arabidopsis mitochondrial division. Proc Natl Acad Sci USA 99:5727–5731.

    Baker A and Schatz G. (1987) Sequences from a prokaryotic genome or the mouse dihydrofolate reductase gene can restore the import of a truncated precursor protein into yeast mitochondria. Proc Natl Acad Sci USA 84:3117–3121.

    Balakirev ES and Ayala FJ. (2003) Pseudogenes: are they "junk" or functional DNA? Annu Rev Genet 37:123–151.

    Boore JL and Brown WM. (1998) Big trees from little genomes: mitochondrial gene order as a phylogenetic tool. Curr Opin Genet Dev 8:668–674.

    Brennicke A, Grohmann L, Hiesel R, Knoop V, Schuster W. (1993) The mitochondrial genome on its way to the nucleus: different stages of gene transfer in higher plants. FEBS Lett 325:140–145.

    Chiu W, Niwa Y, Zeng W, Hirano T, Kobayashi H, Sheen J. (1996) Engineered GFP as a vital reporter in plants. Curr Biol 6:325–330.

    Clifton SW, Minx P, Fauron CM, et al. (13 co-authors). (2004) Sequence and comparative analysis of the maize NB mitochondrial genome. Plant Physiol 136:3486–3503.

    Emanuelsson O, Nielsen H, Brunak S, von Heijne G. (2000) Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J Mol Biol 300:1005–1016.

    Figueroa P, Gomez I, Holuigue L, Araya A, Jordana X. (1999) Transfer of rps14 from the mitochondrion to the nucleus in maize implied integration within a gene encoding the iron-sulphur subunit of succinate dehydrogenase and expression by alternative splicing. Plant J 18:601–609.

    Gray MW. (1992) The endosymbiont hypothesis revisited. Int Rev Cytol 141:233–357.

    Grohmann L, Kitakawa M, Isono K, Goldschmidt-Reisin S, Graack HR. (1994) The yeast nuclear gene MRP-L13 codes for a protein of the large subunit of the mitochondrial ribosome. Curr Genet 26:8–14.

    Handa H. (2003) The complete nucleotide sequence and RNA editing content of the mitochondrial genome of rapeseed (Brassica napus L.): comparative analysis of the mitochondrial genomes of rapeseed and Arabidopsis thaliana. Nucleic Acids Res 31:5907–5916.

    Horan K, Lauricha J, Bailey-Serres J, Raikhel N, Girke T. (2005) Genome cluster database. A sequence family analysis platform for Arabidopsis and rice. Plant Physiol 138:47–54.

    Kadowaki K, Kubo N, Ozawa K, Hirai A. (1996) Targeting presequence acquisition after mitochondrial gene transfer to the nucleus occurs by duplication of existing targeting signals. EMBO J 15:6652–6661.

    Kaneko T, Sato S, Kotani H, et al. (24 co-authors). (1996) Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC6803. II. Sequence determination of the entire genome and assignment of potential protein-coding regions. DNA Res 3:109–136.

    Koc EC, Burkhart W, Blackburn K, Moyer MB, Schlatzer DM, Moseley A, Spremulli LL. (2001) The large subunit of the mammalian mitochondrial ribosome. Analysis of the complement of ribosomal proteins present. J Biol Chem 276:43958–43969.

    Kramer EM, Dorit RL, Irish VF. (1998) Molecular evolution of genes controlling petal and stamen development: duplication and divergence within the APETALA3 and PISTILLATA MADS-box gene lineages. Genetics 149:765–783.

    Kubo N, Harada K, Hirai A, Kadowaki K. (1999) A single nuclear transcript encoding mitochondrial RPS14 and SDHB of rice is processed by alternative splicing: common use of the same mitochondrial targeting signal for different proteins. Proc Natl Acad Sci USA 96:9207–9211.

    Kubo N, Jordana X, Ozawa K, Zanlungo S, Harada K, Sasaki T, Kadowaki K. (2000) Transfer of the mitochondrial rps10 gene to the nucleus in rice: acquisition of the 5' untranslated region followed by gene duplication. Mol Gen Genet 263:733–739.

    Kubo T, Nishizawa S, Sugawara A, Itchoda N, Estiati A, Mikami T. (2000) The complete nucleotide sequence of the mitochondrial genome of sugar beet (Beta vulgaris L.) reveals a novel gene for tRNA(Cys)(GCA). Nucleic Acids Res 28:2571–2576.

    Lamb RS and Irish VF. (2003) Functional divergence within the APETALA3/PISTILLATA floral homeotic gene lineages. Proc Natl Acad Sci USA 100:6558–6563.

    Long M, de Souza SJ, Rosenberg C, Gilbert W. (1996) Exon shuffling and the origin of the mitochondrial targeting function in plant cytochrome c1 precursor. Proc Natl Acad Sci USA 93:7727–7731.

    Lucattini R, Likic VA, Lithgow T. (2004) Bacterial proteins predisposed for targeting to mitochondria. Mol Biol Evol 21:652–658.

    Lurin C, Andres C, Aubourg S, et al. (19 co-authors). (2004) Genome-wide analysis of Arabidopsis pentatricopeptide repeat proteins reveals their essential role in organelle biogenesis. Plant Cell 16:2089–2103.

    Martin W. (2003) Gene transfer from organelles to the nucleus: frequent and in big chunks. Proc Natl Acad Sci USA 100:8612–8614.

    McFadden GI. (1999) Endosymbiosis and evolution of the plant cell. Curr Opin Plant Biol 2:513–519.

    Muller M and Martin W. (1999) The genome of Rickettsia prowazekii and some thoughts on the origin of mitochondria and hydrogenosomes. Bioessays 21:377–381.

    Nagata T, Nemoto Y, Hasezawa S. (1992) Tobacco BY-2 cells as the "HeLa" cells in the cell biology of higher plants. Int Rev Cytol 132:1–30.

    Notsu Y, Masood S, Nishikawa T, Kubo N, Akiduki G, Nakazono M, Hirai A, Kadowaki K. (2002) The complete sequence of the rice (Oryza sativa L.) mitochondrial genome: frequent DNA sequence acquisition and loss during the evolution of flowering plants. Mol Genet Genomics 268:434–445.

    Nowak MA, Boerlijst MC, Cooke J, Smith JM. (1997) Evolution of genetic redundancy. Nature 388:167–171.

    Ogihara Y, Yamazaki Y, Murai K, et al. (14 co-authors). (2005) Structural dynamics of cereal mitochondrial genomes as revealed by complete nucleotide sequencing of the wheat mitochondrial genome. Nucleic Acids Res 33:6235–6250.

    Ohno S. (1970) Evolution by gene duplication (Springer-Verlag, Berlin, Germany).

    Quigley F, Martin WF, Cerff R. (1988) Intron conservation across the prokaryote-eukaryote boundary: structure of the nuclear gene for chloroplast glyceraldehyde-3-phosphate dehydrogenase from maize. Proc Natl Acad Sci USA 85:2672–2676.

    Raes J and Van de Peer Y. (2005) Functional divergence of proteins through frameshift mutations. Trends Genet 21:428–431.

    Sandoval P, Leon G, Gomez I, Carmona R, Figueroa P, Holuigue L, Araya A, Jordana X. (2004) Transfer of RPS14 and RPL5 from the mitochondrion to the nucleus in grasses. Gene 324:139–147.

    Small ID and Peeters N. (2000) The PPR motif—a TPR-related motif prevalent in plant organellar proteins. Trends Biochem Sci 25:46–47.

    Sugiyama Y, Watase Y, Nagase M, Makita N, Yagura S, Hirai A, Sugiura M. (2005) The complete nucleotide sequence and multipartite organization of the tobacco mitochondrial genome: comparative analysis of mitochondrial genomes in higher plants. Mol Genet Genomics 272:603–615.

    Timmis JN, Ayliffe MA, Huang CY, Martin W. (2004) Endosymbiotic gene transfer: organelle genomes forge eukaryotic chromosomes. Nat Rev Genet 5:123–135.

    Unseld M, Marienfeld JR, Brandt P, Brennicke A. (1997) The mitochondrial genome of Arabidopsis thaliana contains 57 genes in 366,924 nucleotides. Nat Genet 15:57–61.

    Vandenbussche M, Theissen G, Van de Peer Y, Gerats T. (2003) Structural diversification and neo-functionalization during floral MADS-box gene evolution by C-terminal frameshift mutations. Nucleic Acids Res 31:4401–4409.

    von Heijne G, Steppuhn J, Herrmann RG. (1989) Domain structure of mitochondrial and chloroplast targeting peptides. Eur J Biochem 180:535–545.

    Wolstenholme DR. (1992) Animal mitochondrial DNA: structure and evolution. Int Rev Cytol 141:173–216.

Accepted for publication September 7, 2006.





