RNA-editing is an important post-transcriptional modification of RNA sequence carried out by two catalytic enzymes, ADAR (A-to-I) and APOBEC (C-to-U). With high-throughput sequencing technologies, the biological function of RNA-editing has been actively investigated. RNA-editing is currently considered a key regulator of various cellular functions, affecting protein activity, the alternative splicing pattern of mRNA, and miRNA target sites. DARNED, a public database of RNA-DNA differences (RDDs), reports more than 300,000 RNA-editing sites in the human genome (hg19). Moreover, multiple studies have suggested that RNA-editing events occur under highly specific conditions. According to DARNED, 97.62 % of registered editing sites were detected in a single tissue or in a specific condition, which also supports the view that RNA-editing is condition-specific. Because RNA-seq captures the whole landscape of the transcriptome, it is widely used for RDD prediction. However, a significant number of false positives or artefacts can be generated when detecting RNA-editing from RNA-seq. Since experimental validation is difficult at the whole-transcriptome scale, a powerful computational tool is needed to distinguish true RNA-editing events from artefacts.
Substantial amounts of systematic artefacts: In simulation tests, artefacts were detected at 0.29 % of all mapped sites. These artefacts could not be distinguished from true events even by a method that combined two state-of-the-art variant callers with extensive a priori knowledge of genomic repeats.
Researchers at Seoul National University have developed RDDpred, a Random Forest RDD classifier. RDDpred reports potentially true RNA-editing events from RNA-seq data. RDDpred was tested on two publicly available RNA-editing datasets and reproduced the RDDs reported in the two studies (90 %, 95 %) while rejecting false discoveries (NPV: 75 %, 84 %).
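The two kinds of figures quoted above can be illustrated with a minimal Python sketch. It assumes that the reproduction rate is the fraction of previously reported sites that the classifier also reports, and that NPV is the usual negative predictive value, TN / (TN + FN). The function names and the toy numbers are hypothetical and are not taken from the RDDpred evaluation.

```python
def reproduction_rate(reported_sites, predicted_sites):
    """Fraction of previously reported sites that were also predicted."""
    reported = set(reported_sites)
    if not reported:
        raise ValueError("reported_sites must not be empty")
    return len(reported & set(predicted_sites)) / len(reported)


def negative_predictive_value(true_negatives, false_negatives):
    """NPV = TN / (TN + FN): share of rejected calls that are real artefacts."""
    rejected = true_negatives + false_negatives
    if rejected == 0:
        raise ValueError("no rejected calls to evaluate")
    return true_negatives / rejected


# Toy data (hypothetical): sites are (chromosome, position) pairs.
reported = [("chr1", p) for p in range(10)]
predicted = [("chr1", p) for p in range(9)] + [("chr2", 5)]

rate = reproduction_rate(reported, predicted)
npv = negative_predictive_value(true_negatives=75, false_negatives=25)
```

In this toy case nine of the ten reported sites are recovered, and 75 of the 100 rejected calls are real artefacts.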
1) RDDpred takes raw alignments or raw RDDs as input.
2) RDDpred arranges condition-specific training data by mapping the raw RDDs to public databases (RADAR, DARNED) to extract positive examples and to MES-predicted sites to extract negative examples.
3) RDDpred trains a condition-specific classifier on the arranged training data and uses it to separate true editing events from artefacts.
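The training-data arrangement in step 2 can be sketched as a simple set lookup. This is an illustrative outline only, not RDDpred's actual code: the function name `arrange_training_data`, the representation of a site as a `(chromosome, position)` pair, and the choice to leave sites that match both sources unlabeled are all assumptions.

```python
def arrange_training_data(raw_rdds, known_editing_sites, mes_sites):
    """Split raw RDD candidates into positive, negative and unlabeled sites.

    raw_rdds            -- candidate sites called from the sample
    known_editing_sites -- sites listed in public databases (e.g. RADAR, DARNED)
    mes_sites           -- MES-predicted sites, treated as likely artefacts
    """
    known = set(known_editing_sites)
    mes = set(mes_sites)
    positives, negatives, unlabeled = [], [], []
    for site in raw_rdds:
        in_known = site in known
        in_mes = site in mes
        if in_known and not in_mes:
            positives.append(site)    # matches a database entry only
        elif in_mes and not in_known:
            negatives.append(site)    # matches an MES-predicted site only
        else:
            unlabeled.append(site)    # no match, or conflicting evidence (assumption)
    return positives, negatives, unlabeled


# Toy input (hypothetical coordinates).
raw = [("chr1", 100), ("chr1", 200), ("chr2", 300), ("chr3", 400)]
database = [("chr1", 100), ("chr3", 400)]
mes = [("chr1", 200), ("chr3", 400)]

pos, neg, rest = arrange_training_data(raw, database, mes)
```

The positive and negative lists would then be converted into feature vectors to fit a Random Forest model in step 3 (for example with scikit-learn's `RandomForestClassifier`), and the fitted model would score the unlabeled sites. The features themselves are not described in this summary, so they are left out here.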
Availability – RDDpred is available at http://biohealth.snu.ac.kr/software/RDDpred