Quality control (QC) is a critical step in any high-throughput sequencing workflow; if overlooked, it can compromise an experiment and the conclusions drawn from it. A number of methods exist to identify biases introduced during sequencing or alignment, yet few tools help interpret biases due to outliers.
Researchers at the Sidney Kimmel Cancer Center have developed iSeqQC, an expression-based QC tool that detects outliers arising either from variable laboratory conditions or from dissimilarity within a phenotypic group. iSeqQC implements several statistical approaches, including unsupervised clustering, agglomerative hierarchical clustering, and correlation coefficients, to provide insight into outliers.
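To make the correlation-based idea concrete, here is a minimal sketch in Python. It is not iSeqQC's implementation; it only illustrates how pairwise Pearson correlation between samples can point to a candidate outlier. The sample names, the toy counts, and the helper functions `pearson` and `mean_correlation` are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_correlation(samples):
    """Map each sample name to its mean correlation with all other samples."""
    result = {}
    for name, values in samples.items():
        others = [
            pearson(values, other_values)
            for other_name, other_values in samples.items()
            if other_name != name
        ]
        result[name] = sum(others) / len(others)
    return result


# Hypothetical toy counts: one list of per-gene counts for each sample.
samples = {
    "control1": [120, 30, 560, 75, 310],
    "control2": [115, 28, 590, 80, 300],
    "control3": [130, 35, 540, 70, 320],
    "control4": [40, 200, 90, 310, 15],
}

scores = mean_correlation(samples)
# The sample that agrees least with the others is the outlier candidate.
candidate = min(scores, key=scores.get)
print(candidate, round(scores[candidate], 3))
```

A low mean correlation only marks a sample for closer inspection; whether it reflects a laboratory artifact or genuine biological dissimilarity still has to be judged from the other metrics.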
Quality control metrics produced by iSeqQC
a) Unsupervised PCA clustering (z-score normalized) showing tight clustering of samples within each phenotype; b) Hierarchical relationship assigning each sample to its own phenotypic cluster; c) Unsupervised PCA clustering (un-normalized) showing control4 to be phenotypically different; d) Pearson correlation showing relationships between samples among biological replicates; e) Normalized expression of housekeeping genes (GAPDH and beta-actin) across samples, showing low expression in the control4 sample; f) GC bias plot showing control4 with lower gene counts relative to GC content.
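Panels a and c differ only in whether the data were z-score normalized. As a general point about z-scoring, and not a description of iSeqQC's internals, the sketch below shows how a sample that differs from another only in overall scale looks far away in raw counts yet identical after per-sample z-scoring. The values and the helpers `zscore` and `euclidean` are hypothetical, and normalizing per sample is an assumption made for illustration.

```python
from statistics import mean, pstdev


def zscore(values):
    """Rescale values to mean 0 and population standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def euclidean(x, y):
    """Euclidean distance between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


# Hypothetical toy counts: "low_depth" has the same profile at one tenth the scale.
control = [120.0, 30.0, 560.0, 75.0, 310.0]
low_depth = [12.0, 3.0, 56.0, 7.5, 31.0]

raw_distance = euclidean(control, low_depth)
z_distance = euclidean(zscore(control), zscore(low_depth))
print(raw_distance, z_distance)
```

This is why inspecting both views can be useful: the un-normalized view exposes differences in overall count level, while the normalized view compares expression profiles once that difference is removed.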
Availability – iSeqQC can be used from the command line (GitHub: https://github.com/gkumar09/iSeqQC) or through a web interface (http://cancerwebpa.jefferson.edu/iSeqQC). A local Shiny installation can also be obtained from GitHub (https://github.com/gkumar09/iSeqQC).