Transcriptome sequencing (RNA-Seq) has become a key technology in transcriptome studies because it can simultaneously quantify the overall expression level and the degree of alternative splicing of each gene. Many methods and tools, including quite a few R / Bioconductor packages, have been developed to analyze RNA-Seq data for differential expression and subsequent functional analysis, aiming at novel biological and biomedical discoveries. However, those tools mainly focus on each gene’s overall expression and may miss discoveries that involve alternative splicing or the combination of the two.

SeqGSEA is a novel R / Bioconductor package that derives biological insight by integrating differential expression (DE) and differential splicing (DS) from RNA-Seq data with functional gene set analysis. Because RNA-Seq count data are discrete, the package uses negative binomial distributions for statistical modeling, first scoring differential expression and differential splicing separately for each gene.
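To make the negative binomial idea concrete, here is a minimal Python sketch of why it suits count data: the variance grows faster than the mean, controlled by a dispersion parameter. The function names and the method-of-moments estimator are assumptions chosen for illustration; this is not SeqGSEA's own estimation procedure.

```python
from statistics import mean, variance


def nb_variance(mu, phi):
    """Variance of a negative binomial with mean mu and dispersion phi.

    With phi = 0 this reduces to the Poisson case (variance == mean).
    """
    return mu + phi * mu ** 2


def moment_dispersion(counts):
    """Method-of-moments dispersion estimate from replicate counts.

    Illustrative only: real packages use more careful estimators.
    Returns 0.0 when the counts show no overdispersion beyond Poisson.
    """
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    if m == 0:
        return 0.0
    return max(0.0, (v - m) / m ** 2)


# Hypothetical read counts for one gene across four replicates.
replicate_counts = [10, 20, 30, 40]
phi_hat = moment_dispersion(replicate_counts)
```

A dispersion estimate well above zero suggests that a Poisson model would understate the variability between replicates.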

Then, integration strategies are applied to combine the two scores for integrated gene set enrichment analysis. See the publication Wang and Cairns (2013) for more details. The SeqGSEA package can also give detection results of differentially expressed genes and differentially spliced genes based on sample label permutation.
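The two steps described above (combining per-gene DE and DS scores, then testing gene sets on the combined score) can be sketched in Python as follows. This is a hedged illustration, not the package's code: the weighted linear combination, the `alpha` weight, the max-based normalization and the function names are all assumptions, and the enrichment score is a generic GSEA-style running sum.

```python
def normalize(scores):
    """Scale non-negative scores by their maximum so DE and DS are comparable."""
    top = max(scores.values())
    return {g: (s / top if top > 0 else 0.0) for g, s in scores.items()}


def integrate(de, ds, alpha=0.5):
    """Weighted linear combination of normalized DE and DS scores.

    alpha is a hypothetical weight: 1.0 uses DE only, 0.0 uses DS only.
    """
    de_n, ds_n = normalize(de), normalize(ds)
    return {g: alpha * de_n[g] + (1 - alpha) * ds_n[g] for g in de}


def enrichment_score(gene_scores, gene_set):
    """GSEA-style running-sum statistic over genes ranked by score.

    Walks down the ranked list, stepping up (weighted by score) at genes in
    the set and down at genes outside it; returns the largest deviation.
    """
    ranked = sorted(gene_scores, key=gene_scores.get, reverse=True)
    hits = [g for g in ranked if g in gene_set]
    hit_total = sum(gene_scores[g] for g in hits)
    n_miss = len(ranked) - len(hits)
    if not hits or n_miss == 0 or hit_total == 0:
        return 0.0
    running, best = 0.0, 0.0
    for g in ranked:
        if g in gene_set:
            running += gene_scores[g] / hit_total
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical per-gene scores.
de_scores = {"A": 4.0, "B": 1.0, "C": 0.5, "D": 2.0}
ds_scores = {"A": 0.2, "B": 3.0, "C": 0.1, "D": 0.3}
combined = integrate(de_scores, ds_scores, alpha=0.5)
es = enrichment_score(combined, {"A", "B"})
```

In practice, significance would be assessed by recomputing the score after permuting the sample labels, as the post describes; that step is omitted here for brevity.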

Availability – The package can be accessed at the URL: http://bioconductor.org/packages/rel…l/SeqGSEA.html

Documentation – http://bioconductor.org/packages/release/bioc/vignettes/SeqGSEA/inst/doc/SeqGSEA.pdf

Questions, comments, or suggestions -  xi.wang@newcastle.edu.au

The RegulatoryGenomics website posts and updates a comprehensive list of tools for RNA-Seq analysis.

This is their current version.

Spliced-mappers

| Method | Reference | Web-site |
|---|---|---|
| TopHat | (Trapnell et al. 2009) | http://tophat.cbcb.umd.edu/ |
| MapSplice | (Wang et al. 2010) | http://www.netlab.uky.edu/p/bioinfo/MapSplice |
| SpliceMap | (Au et al. 2010) | http://www.stanford.edu/group/wonglab/SpliceMap/ |
| HMMSplicer | (Dimon et al. 2010) | http://derisilab.ucsf.edu/index.php?software=105 |
| TrueSight | (Li et al. 2012b) | http://bioen-compbio.bioen.illinois.edu/TrueSight/ |
| SOAPsplice | (Huang et al. 2011) | http://soap.genomics.org.cn/soapsplice.html |
| PASSion | (Zhang et al. 2012) | https://trac.nbic.nl/passion |
| PALMapper | (Jean et al. 2010) | http://galaxy.raetschlab.org/ |
| SplitSeek | (Ameur et al. 2010) | http://solidsoftwaretools.com/gf/project/splitseek |
| Supersplat | (Bryant et al. 2010) | http://mocklerlab-tools.cgrb.oregonstate.edu/ |
| SeqSaw | (Wang et al. 2011) | http://bioinfo.au.tsinghua.edu.cn/software/seqsaw |
| MapNext | (Bao et al. 2009) | http://evolution.sysu.edu.cn/english/software/mapnext.htm |
| STAR | (Dobin et al. 2012) | http://gingeraslab.cshl.edu/STAR/ |
| GSNAP | (Wu et al. 2010) | http://research-pub.gene.com/gmap/ |
| QPALMA | (De Bona et al. 2008) | http://www.raetschlab.org/suppl/qpalma |
| OSA | (Hu et al. 2012) | http://omicsoft.com/osa/ |

Gene set analysis (GSA) is used to interpret genome-wide data, in particular transcriptome data. A multitude of methods have been proposed for this step of the analysis, and many of them have been compared and evaluated. Unfortunately, there is no consensus on which methods should be preferred, and the variety of available GSA software and implementations poses a difficulty for the end-user who wants to try out different methods.

To address this, researchers at Chalmers University of Technology, Sweden, have developed the R package Piano, which collects a range of GSA methods into a single system for the benefit of the end-user. They further refine the GSA workflow by using modifications of the gene-level statistics. This enables them to divide the resulting gene set P-values into three classes that describe different aspects of gene expression directionality at the gene set level.
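One way to picture how modified gene-level statistics can expose directionality is the Python sketch below. It summarizes a gene set three ways: ignoring the sign, keeping the sign, and treating up- and down-regulated members separately. The labels (non-directional, distinct-directional, mixed-directional), the use of a plain mean and the function name are assumptions made for illustration, not Piano's implementation.

```python
def directional_summaries(gene_stats, gene_set):
    """Summarize signed gene-level statistics for one gene set.

    gene_stats maps gene -> signed statistic (e.g. a t-value).
    Returns simple mean-based summaries for each directionality view.
    """
    members = [gene_stats[g] for g in gene_set if g in gene_stats]
    up = [s for s in members if s > 0]
    down = [-s for s in members if s < 0]

    def avg(values):
        return sum(values) / len(values) if values else 0.0

    return {
        # Sign ignored: is the set changed at all?
        "non_directional": avg([abs(s) for s in members]),
        # Sign kept: is the set changed coherently in one direction?
        "distinct_directional": avg(members),
        # Up and down members scored separately.
        "mixed_up": avg(up),
        "mixed_down": avg(down),
    }


# Hypothetical statistics: g1 is up-regulated, g2 equally down-regulated.
stats = {"g1": 2.0, "g2": -2.0, "g3": 1.0}
summary = directional_summaries(stats, {"g1", "g2"})
```

A set whose members move strongly in opposite directions scores high on the sign-free view but near zero on the signed view, which is the kind of distinction the directionality classes are meant to capture.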

The researchers demonstrate their fully implemented workflow by investigating the impact of the individual components of GSA using microarray and RNA-seq data. The results show that the evaluated methods are globally similar and that the major separation correlates well with their defined directionality classes. As a consequence, they suggest using a consensus scoring approach based on multiple GSA runs. In combination with the directionality classes, this constitutes a more thorough basis for an enriched biological interpretation.
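A consensus score over multiple GSA runs can be sketched as a simple rank aggregation: rank the gene sets within each run, then average the ranks across runs. The Python below is a minimal illustration under that assumption; the mean-rank rule and the function names are hypothetical and are not taken from Piano itself.

```python
def ranks(pvalues):
    """Rank gene sets by p-value (1 = most significant).

    Tied p-values share the average of the ranks they span.
    """
    ordered = sorted(pvalues, key=pvalues.get)
    result = {}
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and pvalues[ordered[j + 1]] == pvalues[ordered[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[ordered[k]] = avg_rank
        i = j + 1
    return result


def consensus_rank(runs):
    """Average per-run ranks; runs is a list of {gene_set: p-value} dicts.

    Assumes every run reports the same gene sets.
    """
    per_run = [ranks(r) for r in runs]
    return {name: sum(r[name] for r in per_run) / len(per_run) for name in per_run[0]}


# Hypothetical p-values from two different GSA methods.
run_a = {"s1": 0.01, "s2": 0.20, "s3": 0.05}
run_b = {"s1": 0.03, "s2": 0.50, "s3": 0.01}
consensus = consensus_rank([run_a, run_b])
```

Gene sets with a low mean rank are those that several methods agree on, which makes the result less dependent on any single method's assumptions.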

Availability – Piano is available, together with a user manual, for download at www.sysbio.se/piano.

  • Väremo L, Nielsen J, Nookaew I. (2013) Enriching the gene set analysis of genome-wide data by incorporating directionality of gene expression and combining statistical hypotheses and methods. Nucleic Acids Res [Epub ahead of print].

Gene set enrichment (GSE) analysis is a popular framework for condensing information from gene expression profiles into a pathway or signature summary. The strengths of this approach over single gene analysis include noise and dimension reduction, as well as greater biological interpretability. As molecular profiling experiments move beyond simple case-control studies, robust and flexible GSE methodologies are needed that can model pathway activity within highly heterogeneous data sets.

To address this challenge, researchers at Hospital del Mar Medical Research Institute (IMIM), Spain, have developed Gene Set Variation Analysis (GSVA), a GSE method that estimates variation of pathway activity over a sample population in an unsupervised manner. They demonstrate the robustness of GSVA in a comparison with current state-of-the-art sample-wise enrichment methods. Further, they provide examples of its utility in differential pathway activity and survival analysis. Lastly, the researchers show how GSVA works analogously with data from both microarray and RNA-seq experiments.

GSVA provides increased power to detect subtle pathway activity changes over a sample population in comparison to corresponding methods. While GSE methods are generally regarded as end points of a bioinformatic analysis, GSVA constitutes a starting point for building pathway-centric models of biology. Moreover, GSVA helps meet the current need for GSE methods for RNA-seq data.
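To illustrate what a sample-wise, unsupervised gene set score looks like, the Python sketch below uses a much simpler stand-in than GSVA: it standardizes each gene across samples and averages the z-scores of the set's members within each sample. This is explicitly not GSVA's statistic, and the function name and data layout are assumptions for illustration only.

```python
from statistics import mean, pstdev


def sample_scores(expr, gene_set):
    """Return one gene set score per sample, without using sample labels.

    expr maps gene -> list of expression values, one per sample.
    Simplified stand-in (mean z-score), not the GSVA algorithm.
    """
    n_samples = len(next(iter(expr.values())))
    z = {}
    for gene, values in expr.items():
        if gene not in gene_set:
            continue
        mu, sd = mean(values), pstdev(values)
        # Genes with no variation carry no information about samples.
        z[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    if not z:
        return [0.0] * n_samples
    return [mean([zg[i] for zg in z.values()]) for i in range(n_samples)]


# Hypothetical expression matrix: three genes measured in three samples.
expression = {
    "g1": [1.0, 2.0, 3.0],
    "g2": [2.0, 4.0, 6.0],
    "g3": [5.0, 5.0, 5.0],
}
scores = sample_scores(expression, {"g1", "g2"})
```

The result is a gene-set-by-sample value that can feed downstream analyses such as differential pathway activity or survival models, which is the role the post describes for GSVA's output.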

Availability – GSVA is an open source software package for R which forms part of the Bioconductor project and can be downloaded at http://www.bioconductor.org/packages/release/bioc/html/GSVA.html.

  • Hänzelmann S, Castelo R, Guinney J. (2013) GSVA: gene set variation analysis for microarray and RNA-Seq data. BMC Bioinformatics 14(1), 7.
