SeqAssist – a novel toolkit for preliminary analysis of next-generation sequencing data

While next-generation sequencing (NGS) technologies are advancing rapidly, the development of efficient, user-friendly tools for preliminary analysis of massive NGS datasets has lagged behind. To help fill this gap and accelerate data-to-results turnaround, researchers at the University of Southern Mississippi developed a software package named SeqAssist (“Sequencing Assistant” or SA).

SeqAssist takes NGS-generated FASTQ files as the input, employs the BWA-MEM aligner for sequence alignment, and aims to provide a quick overview and basic statistics of NGS data.

It consists of three separate workflows:

  1. the SA_RunStats workflow generates basic statistics about an NGS dataset, including numbers of raw, cleaned, redundant and unique reads, redundancy rate, and a list of unique sequences with length and read count;
  2. the SA_Run2Ref workflow estimates the breadth, depth and evenness of genome-wide coverage of the NGS dataset at a nucleotide resolution; and
  3. the SA_Run2Run workflow compares two NGS datasets to determine the redundancy (overlapping rate) between the two NGS runs.
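To make the run-level statistics concrete, here is a minimal Python sketch of the kind of numbers SA_RunStats and SA_Run2Run report. It is not SeqAssist's code: the function names `run_stats` and `run_overlap` are hypothetical, reads are assumed to be already cleaned and held in memory as plain strings, "redundant" is assumed to mean every read beyond the first copy of a sequence, and the overlapping rate is assumed to be the share of unique sequences in one run that also appear in the other. The tool's own definitions may differ.

```python
from collections import Counter


def run_stats(reads):
    """Summarize one run: total, unique and redundant reads (assumed definitions)."""
    counts = Counter(reads)        # unique sequence -> read count
    total = len(reads)
    unique = len(counts)
    redundant = total - unique     # assumption: copies beyond the first of each sequence
    rate = redundant / total if total else 0.0
    return {
        "total": total,
        "unique": unique,
        "redundant": redundant,
        "redundancy_rate": rate,
        # list of unique sequences with length and read count
        "unique_table": [(seq, len(seq), n) for seq, n in counts.items()],
    }


def run_overlap(reads_a, reads_b):
    """Fraction of run A's unique sequences that also occur in run B (assumed definition)."""
    set_a, set_b = set(reads_a), set(reads_b)
    if not set_a:
        return 0.0
    return len(set_a & set_b) / len(set_a)


if __name__ == "__main__":
    run1 = ["ACGT", "ACGT", "TTGA", "GGCC"]
    run2 = ["ACGT", "CCCC"]
    print(run_stats(run1)["redundancy_rate"])
    print(run_overlap(run1, run2))
```

Exact string matching is the simplest possible notion of redundancy; SeqAssist itself relies on BWA-MEM alignment, so treat this only as an illustration of what the reported quantities mean.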

 

Statistics produced by SeqAssist or derived from its output files are designed to tell the user whether a genomic locus (i.e., gene, scaffold, chromosome or genome) is covered by sequencing reads, what percentage of it is covered, how many times and how evenly it is covered, and how redundant the sequencing reads are within a single run or between two runs. These statistics can guide the user in evaluating the quality of a DNA library prepared for RNA-Seq or genome (re-)sequencing and in deciding the number of sequencing runs required for the library. The developers have tested SeqAssist using a synthetic dataset and demonstrated its main features using multiple NGS datasets generated from genome re-sequencing experiments.
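As a rough illustration of breadth, depth and evenness of coverage, the sketch below summarizes a list of per-nucleotide read depths for one locus. The function name `coverage_summary` is hypothetical, the input is assumed to be a depth value for every position of the locus (zeros included), and evenness is expressed here as the coefficient of variation of depth, which is one common choice and not necessarily the measure SeqAssist uses.

```python
from statistics import mean, pstdev


def coverage_summary(depths):
    """Breadth, mean depth and evenness (CV) from per-base depths of one locus."""
    if not depths:
        raise ValueError("depths must contain at least one position")
    covered = sum(1 for d in depths if d > 0)   # positions hit by at least one read
    avg = mean(depths)
    # Assumption: evenness as coefficient of variation; lower means more even coverage.
    cv = pstdev(depths) / avg if avg else float("nan")
    return {
        "breadth": covered / len(depths),       # fraction of the locus covered
        "mean_depth": avg,                      # average number of times a base is covered
        "cv": cv,
    }


if __name__ == "__main__":
    print(coverage_summary([0, 2, 4, 2]))
```

A locus with high breadth but a large coefficient of variation is covered almost everywhere yet unevenly, which may suggest that additional sequencing runs would mostly pile reads onto regions that are already deep.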

Availability – SeqAssist is available at: http://orca.st.usm.edu/cbbl/seqassist/

Peng Y, Maxwell AS, Barker ND, Laird JG, Kennedy AJ, Wang N, Zhang C, Gong P. (2014) SeqAssist: a novel toolkit for preliminary analysis of next-generation sequencing data. BMC Bioinformatics 15 Suppl 11:S10. [article]
