RNA-sequencing (RNA-Seq) technologies hold enormous promise for novel discoveries in genomics and transcriptomics. In the past year, a surge of reports has analyzed RNA-Seq data to gain a global view of the RNA editome. Opposing results have been presented, giving rise to extensive debate surrounding one of the first such studies, in which a daunting list of all 12 types of RNA-DNA differences (RDDs) was identified. Although a consensus is forming that some of the initial “paradigm-shifting” results of this study may be questionable, recent reports on this topic have differed in the number and relative abundance of each type of RDD. Many outstanding issues remain, most importantly the choice of bioinformatic approaches. Here the authors discuss the critical data analysis and experimental design issues of such studies to enable improved systematic investigation of the largely unexplored frontier of single-nucleotide variants in RNA.
Recommended variables for consideration in the design of RNA-Seq experiments for identifying RNA-editing events:
Number of RDDs and % of A-to-G events increase with sequencing depth; accuracy of estimated editing levels increases with read coverage of putative RDDs.
Recommended in order to ensure high total coverage of candidate RDD sites after removal of duplicate reads.
Paired or single-end sequencing
Paired-end sequencing and read pairing during data analysis can significantly improve RDD accuracy.
Quality of sequencing library
High-fidelity enzymes for RT and PCR should be adopted. The rate of duplicate reads should be evaluated and minimized. Base quality of reads should be inspected and optimized by sequencing chemistry.
Type of sequencing library
Strand-specific libraries are advantageous for pinpointing specific types of RDDs.
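
The link between read coverage and the accuracy of estimated editing levels can be illustrated with a minimal sketch. It is not the authors' method: it assumes that the editing level at a site is estimated as the fraction of covering reads that carry the edited base, that reads are independent after duplicate removal, and it uses the Wilson score interval as a stand-in measure of uncertainty. The function names and read counts are hypothetical.

```python
import math


def editing_level(edited_reads, total_reads):
    """Fraction of reads covering a site that carry the edited base."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return edited_reads / total_reads


def wilson_interval(edited_reads, total_reads, z=1.96):
    """Approximate 95% Wilson score interval for the editing level.

    Assumption: reads are independent binomial draws, which is only
    reasonable after duplicate reads have been removed.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    p = edited_reads / total_reads
    denom = 1 + z * z / total_reads
    center = (p + z * z / (2 * total_reads)) / denom
    half = z * math.sqrt(
        p * (1 - p) / total_reads + z * z / (4 * total_reads ** 2)
    ) / denom
    return max(0.0, center - half), min(1.0, center + half)


def interval_width(edited_reads, total_reads):
    """Width of the Wilson interval; smaller means a tighter estimate."""
    low, high = wilson_interval(edited_reads, total_reads)
    return high - low


if __name__ == "__main__":
    # Same observed editing level (20%) at three hypothetical coverages.
    for edited, total in [(2, 10), (20, 100), (200, 1000)]:
        low, high = wilson_interval(edited, total)
        print(f"coverage={total:5d} level={editing_level(edited, total):.2f} "
              f"interval=({low:.3f}, {high:.3f})")
```

With the observed editing level held fixed, the interval narrows as coverage grows, which is one way to see why high total coverage at candidate RDD sites matters.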
- Lee JH, Ang JK, Xiao X. (2013) Analysis and design of RNA sequencing experiments for identifying RNA editing and other single-nucleotide variants. RNA [Epub ahead of print].