As one of the largest families of angiosperms, Orchidaceae displays great biodiversity resulting from adaptation to diverse habitats. Despite the unique and interesting biological features of orchids, genomic information on them remains limited, which impedes advanced molecular research. Here the authors report a strategy for integrating sequence outputs for the moth orchid, Phalaenopsis aphrodite, from two high-throughput sequencing platforms, Roche 454 and Illumina/Solexa, in order to maximize assembly efficiency. Tissues collected for cDNA library preparation included a wide range of vegetative and reproductive tissues.
After assembly and trimming, 233,823 unique sequences were obtained. Among them, 42,590 contigs averaging 875 base pairs in length were annotated as protein-coding genes, of which 7,263 were found to be near full length. The sequence accuracy of the assembled contigs was validated to be as high as 99.9%. Genes with tissue-specific expression were also categorized by expression profiling with RNA-Seq. Gene products targeted to specific subcellular localizations were identified from their annotations. The authors concluded that, with proper assembly to combine the outputs of next-generation sequencing platforms, transcriptome information can be enriched for gene discovery, functional annotation and expression profiling of a non-model organism.
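To illustrate how a summary figure such as the average contig length might be derived, here is a minimal Python sketch. It is not the authors' pipeline: the function names `read_fasta` and `length_summary`, the contig names, and the toy sequences are all hypothetical, and the input is assumed to be plain FASTA text.

```python
from statistics import mean


def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {name: sequence}.

    Assumes every header line has a non-empty name after '>'.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the record name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            # Sequence lines may be wrapped, so collect the pieces.
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def length_summary(records):
    """Return the count, mean length and maximum length of the sequences."""
    lengths = [len(seq) for seq in records.values()]
    return {
        "count": len(lengths),
        "mean_length": mean(lengths),
        "max_length": max(lengths),
    }


# Toy input (hypothetical contigs, not data from the study).
example = """>contig_1
ATGGCGTACG
ATTA
>contig_2
ATGCCC
"""

summary = length_summary(read_fasta(example))
print(summary)
```

Applied to a real assembly, the same summary would be computed over the whole contig file rather than an inline string.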
Su CL, Chao YT, Alex Chang YC, Chen WC, Chen CY, Lee AY, Hwa KT, Shih MC. (2011) De novo Assembly of Expressed Transcripts and Global Analysis of Phalaenopsis aphrodite Transcriptome. Plant Cell Physiol [Epub ahead of print]. [abstract]