Iso-Seq analysis of SMRT sequencing data produces most detailed maize transcriptome yet

Improvements in maize breeding are required to feed a growing population and will depend on a more complete understanding of global gene expression in the plant. In a new publication from Cold Spring Harbor Laboratory (CSHL), scientists produced the largest collection of full-length transcripts for the maize genome and a significantly improved genome annotation using Single Molecule, Real-Time (SMRT®) Sequencing and the Iso-Seq™ analysis method from Pacific Biosciences. The work appears online today in Nature Communications.

Led by CSHL scientists Doreen Ware, who is also affiliated with the USDA Agricultural Research Service, and Bo Wang, the team embarked on the project to leverage the advantages that SMRT Sequencing and the Iso-Seq method offer for transcriptome analysis in plants. The Iso-Seq protocol allows scientists to generate long reads covering full-length gene transcripts, providing a more accurate view of gene structure, gene expression, and important mechanisms such as alternative gene splicing.

Iso-Seq analysis of SMRT Sequencing data more than doubled the number of isoforms, corrected numerous previously mis-annotated gene models, and identified many novel genes and long non-coding RNAs. Additionally, the team showed that long reads are even more important than expected for transcriptome studies. The average transcript length in this project, almost 3 kb, is much longer than that in the previous maize annotation, highlighting a strong bias toward shorter transcripts in previous approaches.

CIRCOS visualization of different data at the genome-wide level

(a) Karyotype of the maize genome. (b) Comparison of gene density between genes covered by RefGen_v3 and the PacBio data set. Gene density was calculated in a 1-Mb sliding window at 20-kb intervals. (c) Comparison of isoform density between RefGen_v3 and PacBio sequences; isoform density was calculated in a 1-Mb sliding window at 20-kb intervals. (d) CG methylation level. (e) CHG methylation level. (f) CHH methylation level. Each methylation level was calculated in 1-Mb bins on each chromosome. (g) Repeat density in the genome. (h) lncRNA density, in 1-Mb bins on each chromosome. (i) Linkage of fusion transcripts: purple, intra-chromosomal; dark yellow, inter-chromosomal.
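
For readers unfamiliar with the density tracks in panels (b) and (c), the following is a minimal Python sketch of a sliding-window count using a 1-Mb window advanced in 20-kb steps. It is an illustration only, not the authors' pipeline: the function name `sliding_window_density`, the choice to count a feature by its start coordinate, and the half-open window convention are all assumptions made here.

```python
from bisect import bisect_left


def sliding_window_density(starts, chrom_length, window=1_000_000, step=20_000):
    """Count features per sliding window along one chromosome.

    Assumption (not from the paper): a feature belongs to a window when its
    start coordinate lies in the half-open interval [w, w + window).
    Returns a list of (window_start, feature_count) tuples.
    """
    starts = sorted(starts)
    densities = []
    # Slide the window start from 0 up to the last position where a full
    # window still fits on the chromosome.
    last_start = max(chrom_length - window, 0)
    for w in range(0, last_start + 1, step):
        lo = bisect_left(starts, w)           # first feature at or after w
        hi = bisect_left(starts, w + window)  # first feature at or after window end
        densities.append((w, hi - lo))
    return densities


# Hypothetical gene start coordinates on a 2-Mb chromosome.
example = sliding_window_density([10, 30_000, 1_500_000], 2_000_000)
print(example[:3])
```

Sorting once and using binary search keeps each window lookup cheap, which matters when a window is evaluated every 20 kb across a genome the size of maize.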

“Although data from short-read sequencing have accumulated over recent years, they do not provide full-length sequence for each RNA, limiting their utility for defining alternatively spliced forms,” Wang et al. write. “In some cases, short-read sequencing generates low-quality transcripts, leading to incorrect annotations.”

“We congratulate the scientists at Cold Spring Harbor Laboratory for this very impressive work,” said Jonas Korlach, Chief Scientific Officer of Pacific Biosciences. “Maize is an incredibly difficult genome to work with, and therefore really benefits from the advantages of SMRT Sequencing and the Iso-Seq method through their ability to produce highly accurate full-length transcripts without the need for post-sequence assembly.”

About Pacific Biosciences

Pacific Biosciences of California, Inc. (NASDAQ:PACB) offers sequencing systems to help scientists resolve genetically complex problems. Based on its novel Single Molecule, Real-Time (SMRT®) technology, Pacific Biosciences’ products enable: de novo genome assembly to finish genomes in order to more fully identify, annotate and decipher genomic structures; full-length transcript analysis to improve annotations in reference genomes, characterize alternatively spliced isoforms in important gene families, and find novel genes; targeted sequencing to more comprehensively characterize genetic variations; and real-time kinetic information for epigenome characterization. Pacific Biosciences’ technology provides high accuracy, ultra-long reads, uniform coverage, and is the only DNA sequencing technology that provides the ability to simultaneously detect epigenetic changes. PacBio® sequencing systems, including consumables and software, provide a simple, fast, end-to-end workflow for SMRT Sequencing. More information is available at www.pacb.com.

Source – Globe Newswire

Wang B, Tseng E, Regulski M, Clark TA, Hon T, Jiao Y, Lu Z, Olson A, Stein JC, Ware D. (2016) Unveiling the complexity of the maize transcriptome by single-molecule long-read sequencing. Nat Commun 7:11708.
