In eukaryotic cells, alternative cleavage of 3′ untranslated regions (UTRs) can affect transcript stability, transport and translation. For polyadenylated (poly(A)) transcripts, cleavage sites can be characterized with short-read sequencing using specialized library construction methods. However, for large-scale cohort studies and clinical sequencing applications, it is desirable to characterize such events from RNA-seq data, which are already widely used to identify other relevant information, such as mutations, alternative splicing and chimeric transcripts.
Here, researchers from the Michael Smith Genome Sciences Centre describe KLEAT, an analysis tool that uses de novo assembly of RNA-seq data to characterize cleavage sites on 3′ UTRs. They demonstrate the performance of KLEAT on three cell line RNA-seq libraries constructed and sequenced by the ENCODE project, and assembled using Trans-ABySS. Validating the KLEAT predictions against matched ENCODE RNA-seq and RNA-PET libraries, the researchers show that the tool achieves a positive predictive value of over 90% when a predicted site has at least three RNA-seq reads supporting a poly(A) tail and validation requires at least three RNA-PET reads mapping within 100 nucleotides of the site. They also compare the performance of KLEAT with other popular RNA-seq analysis pipelines that reconstruct 3′ UTR ends, and show that it performs favourably, based on an ROC-like curve.
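To make the validation criterion concrete, the following Python sketch shows one way a positive predictive value could be computed under the thresholds quoted above. It is an illustration only, not KLEAT's code: the function names, the input layout (a list of `(position, supporting_reads)` pairs and a list of RNA-PET read coordinates), the restriction to a single chromosome and strand, and the choice to treat the 100-nucleotide window as inclusive are all assumptions, and the coordinates are invented toy values.

```python
from bisect import bisect_left, bisect_right

def is_validated(site, pet_sorted, window=100, min_pet_reads=3):
    """Return True if enough RNA-PET reads fall within `window` nt of `site`.

    `pet_sorted` is a sorted list of RNA-PET read coordinates on the same
    chromosome and strand as the site (a simplifying assumption).
    The window is treated as inclusive on both sides (also an assumption).
    """
    lo = bisect_left(pet_sorted, site - window)
    hi = bisect_right(pet_sorted, site + window)
    return (hi - lo) >= min_pet_reads

def positive_predictive_value(predictions, pet_positions,
                              min_tail_reads=3, window=100, min_pet_reads=3):
    """PPV = validated calls / all calls that pass the read-support filter.

    `predictions` is a list of (position, supporting_reads) pairs, where
    supporting_reads counts RNA-seq reads supporting a poly(A) tail.
    Returns None when no prediction passes the filter.
    """
    pet_sorted = sorted(pet_positions)
    called = [pos for pos, reads in predictions if reads >= min_tail_reads]
    if not called:
        return None
    validated = sum(
        is_validated(pos, pet_sorted, window, min_pet_reads) for pos in called
    )
    return validated / len(called)

# Toy data (hypothetical coordinates, not taken from ENCODE).
predictions = [(1000, 5), (5000, 3), (9000, 1), (20000, 4)]
pet_positions = [950, 1010, 1090, 5200, 5210, 19950, 20000, 20100, 20101]

ppv = positive_predictive_value(predictions, pet_positions)
print(f"PPV = {ppv:.3f}")
```

In this toy input, the site at 9000 is dropped because it has only one supporting read, the sites at 1000 and 20000 each have three RNA-PET reads inside the window, and the site at 5000 has none, so two of the three retained calls count as validated.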
Availability – KLEAT is available at www.bcgsc.ca/platform/bioinfo/software, and is offered free for academic use.