Improvements in maize breeding are required to feed a growing population and will depend on a more complete understanding of global gene expression in the plant. In a new publication from Cold Spring Harbor Laboratory (CSHL), scientists produced the largest collection of full-length transcripts for the maize genome and a significantly improved genome annotation using Single Molecule, Real-Time (SMRT®) Sequencing and the Iso-Seq™ analysis method from Pacific Biosciences. The work appears online today in Nature Communications.

Led by CSHL scientists Doreen Ware, who is also affiliated with the USDA Agricultural Research Service, and Bo Wang, the team embarked on the project to leverage the advantages that SMRT Sequencing and the Iso-Seq method offer for transcriptome analysis in plants. The Iso-Seq protocol allows scientists to generate long reads covering full-length gene transcripts, providing a more accurate view of gene structure, gene expression, and important mechanisms such as alternative gene splicing.

Iso-Seq analysis of SMRT Sequencing data more than doubled the number of isoforms, corrected numerous previously mis-annotated gene models, and identified many novel genes and long non-coding RNAs. Additionally, the team showed that long reads are even more important than expected for transcriptome studies. The average transcript length in this project, almost 3 kb, is much longer than that in the previous maize annotation, highlighting a strong bias toward shorter transcripts in previous approaches.

“Although data from short-read sequencing have accumulated over recent years, they do not provide full-length sequence for each RNA, limiting their utility for defining alternatively spliced forms,” Wang et al. write. “In some cases, short-read sequencing generates low-quality transcripts, leading to incorrect annotations.”

“We congratulate the scientists at Cold Spring Harbor Laboratory for this very impressive work,” said Jonas Korlach, Chief Scientific Officer of Pacific Biosciences. “Maize is an incredibly difficult genome to work with, and therefore really benefits from the advantages of SMRT Sequencing and the Iso-Seq method through their ability to produce highly accurate full-length transcripts without the need for post-sequence assembly.”

Wang B, Tseng E, Regulski M, Clark TA, Hon T, Jiao Y, Lu Z, Olson A, Stein JC, Ware D. (2016) Unveiling the complexity of the maize transcriptome by single-molecule long-read sequencing. Nat Commun 7:11708.

