Gene annotations, such as those in GENCODE, are derived primarily from alignments of spliced cDNA sequences and protein sequences. The impact of RNA-seq data on annotation has been confined to major projects like ENCODE and Illumina Body Map 2.0. Researchers ...
Long-read sequencing of the transcriptome reveals novel spliced genes that are not annotated in GENCODE and are missed by short-read RNA-Seq
Short-read sequencing has enabled the de novo assembly of several individual human genomes, but with inherent limitations in characterizing repeat elements. Now, an international team led by researchers at USC has sequenced a Chinese individual, HX1, by single-molecule real-time (SMRT) ...
Comparison of GENCODE and RefSeq gene annotation and the impact of reference geneset on variant effect prediction
A vast amount of DNA variation is being identified by increasingly large-scale exome and genome sequencing projects. To be useful, variants require accurate functional annotation, and a wide range of tools is available to this end. McCarthy et al. recently ...
51% of non-canonical splice sites are not annotated in GENCODE
Scientists at the Pontifical Catholic University of Chile have uncovered the diversity of non-canonical splice sites in the human transcriptome using deep transcriptome profiling. They mapped a total of 3.7 billion human RNA-seq reads and developed a set of stringent ...