Given the results of the last poll (most folks are using RNA-Seq for gene expression analysis), I thought we could dive a little deeper on the subject and see just what y’all think about some of the specifics of using RNA-Seq for gene expression analysis.

Based on what I’ve seen, there seems to be quite a range of opinion on the appropriate sequencing depth for gene expression analysis; for example, the instrument manufacturers say one thing and the genomics core people recommend quite another. Let us know what you think!

The poll is located in the right side-bar. I’ve had to move it down below the fold due to the addition of some advertising space. Sorry about that but got to pay the bills.

The authors have developed an RNA-Seq analysis workflow for single-ended Illumina reads, termed RseqFlow. This workflow includes a set of analytic functions, such as quality control for sequencing data, signal tracks of mapped reads, calculation of expression levels, identification of differentially expressed genes, and coding SNP calling. The workflow is formalized and managed by the Pegasus Workflow Management System, which maps the analysis modules onto available computational resources, automatically executes the steps in the appropriate order, and supervises the whole run. RseqFlow is available as a Virtual Machine (VM) with all the necessary software, which eliminates any complex configuration and installation steps.
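
To make the idea of a workflow manager running steps "in the appropriate order" a bit more concrete, here is a minimal Python sketch. It is not how RseqFlow or Pegasus is implemented. The step names and the dependencies between them are our own assumptions based on the functions listed above, and the ordering uses Python's standard `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical step names; each step maps to the set of steps it depends on.
# These dependencies are assumed for illustration, not taken from RseqFlow.
DEPENDENCIES = {
    "quality_control": set(),
    "map_reads": {"quality_control"},
    "signal_tracks": {"map_reads"},
    "expression_levels": {"map_reads"},
    "differential_expression": {"expression_levels"},
    "coding_snp_calling": {"map_reads"},
}


def execution_order(dependencies):
    """Return one valid order in which every step runs after its prerequisites."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(DEPENDENCIES)
for position, step in enumerate(order, start=1):
    print(position, step)
```

Any order that respects the dependencies is valid, so independent steps (signal tracks and SNP calling, for instance) may come out in either order.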

Availability and implementation: http://genomics.isi.edu/rnaseq

Wang Y et al. (2011) RseqFlow: Workflows for RNA-Seq data analysis. Bioinformatics [Epub ahead of print].

As one of the largest angiosperm families, Orchidaceae displays great biodiversity resulting from adaptation to diverse habitats. Genomic information on orchids is rather limited despite their unique and interesting biological features, which impedes advanced molecular research. Here the authors report a strategy to integrate sequence outputs of the moth orchid, Phalaenopsis aphrodite, from two high-throughput sequencing platforms, Roche 454 and Illumina/Solexa, in order to maximize assembly efficiency. Tissues collected for cDNA library preparation included a wide range of vegetative and reproductive tissues.

Comparative Analysis of RNA-Seq Alignment Algorithms and the RNA-Seq Unified Mapper (RUM).

A critical task in high throughput sequencing is aligning millions of short reads to a reference genome. Alignment is especially complicated for RNA sequencing (RNA-Seq) because of RNA splicing. A number of RNA-Seq algorithms are available and claim to align reads with high accuracy and efficiency while detecting splice junctions. RNA-Seq data is discrete in nature; therefore, with reasonable gene models and comparative metrics, RNA-Seq data can be simulated to sufficient accuracy to enable meaningful benchmarking of alignment algorithms. A rigorous comparison of all viable published RNA-Seq algorithms has not previously been performed.
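
To see why splicing complicates alignment, consider a toy sketch in Python. This is not the RUM algorithm or any published aligner, just an exact-match illustration on a made-up genome: a read that spans an exon-exon junction has no contiguous match in the genome and has to be split into two blocks separated by the intron. The sequences, the `align_read` function, and the `min_anchor` parameter are all hypothetical.

```python
def align_read(read, genome, min_anchor=3):
    """Return a list of (genome_start, length) blocks for an exact match.

    One block means the read aligns contiguously, two blocks mean it was split
    across a gap (a candidate intron), and an empty list means no alignment.
    """
    # Easy case first: the whole read matches the genome as-is.
    start = genome.find(read)
    if start != -1:
        return [(start, len(read))]

    # Otherwise try every split that leaves min_anchor bases on each side.
    for split in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:split], read[split:]
        left_start = genome.find(left)  # simplification: first occurrence only
        if left_start == -1:
            continue
        # The right part must start downstream of the left part,
        # after a gap of at least one base.
        right_start = genome.find(right, left_start + split + 1)
        if right_start != -1:
            return [(left_start, split), (right_start, len(right))]
    return []


# Made-up toy genome: two exons separated by one intron.
exon1, intron, exon2 = "ACGTACGGTC", "GTAAGTTTTTCAG", "TTGCAAGCTA"
genome = exon1 + intron + exon2

# A read from the spliced transcript: last 5 bases of exon 1
# followed by the first 5 bases of exon 2.
junction_read = exon1[-5:] + exon2[:5]

print(align_read("TACGG", genome))        # read lying inside exon 1
print(align_read(junction_read, genome))  # read spanning the junction
```

Real aligners also have to cope with mismatches, repeated sequence, and ambiguous split points, which is part of why benchmarking them on simulated data is worthwhile.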

My thanks to a reader who pointed out this publication covering a recent topic of discussion on the RNA-Seq Blog (see The Magic of RNA-Seq).

This study found that for experiments performed with a small number of biological replicates, significant results may be due to biological variation and may not be reproducible, and it is impossible to know whether expression patterns are specific to the individuals in the study or are characteristic of the study populations.

These ideas are now widely accepted for DNA microarray experiments, where a large number of biological replicates are now required to justify scientific conclusions.

Their analysis suggests that as biological variability is a fundamental characteristic of gene expression, sequencing experiments should be subject to similar requirements.
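
Here is a back-of-the-envelope way to see this, as a minimal Python sketch. We assume (our assumption, not a model taken from the paper) that read counts are Poisson-distributed around an expression level that itself varies between individuals with some biological coefficient of variation (CV). Under that assumption the squared CV of the observed counts is 1/mean + (biological CV)^2, so sequencing deeper shrinks only the first term. The function name and the example numbers are hypothetical.

```python
import math


def expected_cv(mean_count, biological_cv):
    """CV of observed counts: Poisson sampling noise plus biological variation.

    Var(count) = mean + (biological_cv * mean)^2, so CV^2 = 1/mean + biological_cv^2.
    """
    if mean_count <= 0:
        raise ValueError("mean_count must be positive")
    return math.sqrt(1.0 / mean_count + biological_cv ** 2)


# Hypothetical gene with a 30% biological CV, sequenced at increasing depth.
for mean_count in (10, 100, 1000, 100000):
    print(mean_count, round(expected_cv(mean_count, 0.30), 3))
```

However deep the sequencing, the CV levels off at the biological CV rather than falling to zero, which is why more reads cannot stand in for more biological replicates.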

  • Hansen, K.D., Wu, Z., Irizarry, R.A. & Leek, J.T. Sequencing technology does not eliminate biological variability. Nat Biotechnol 29, 572–573 (2011).

High throughput sequencing technology provides us with unprecedented opportunities to study transcriptome dynamics. Compared to microarray-based gene expression profiling, RNA-Seq has many advantages, such as high resolution, low background, and the ability to identify novel transcripts. Moreover, for genes with multiple isoforms, expression of each isoform may be estimated from RNA-Seq data. Despite these advantages, recent work revealed that base-level read counts from RNA-Seq data may not be randomly distributed and can be affected by local nucleotide composition. It was not clear, though, how this base-level read count bias may affect gene-level expression estimates.

Slides of the talk on “Quantitatively deconvolving alternative RNA secondary structures” by Regina Bohnert at the HiTSeq-SIG of ISMB/ECCB 2011 in Vienna, Austria, on July 15, 2011 can be found at:

http://www.fml.tuebingen.mpg.de/raetsch/lectures/HiTSeq-SIG-sQuant.pdf/at_download/file

The DEXSeq package is focused on finding differential exon usage using RNA-seq exon counts from samples with different experimental designs. It provides functions that allow the user to perform the necessary statistical tests based on a model that uses the negative binomial distribution to estimate the variance between biological replicates. The package also provides functions for the visualization and exploration of the results.
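
As a rough illustration of the negative binomial idea, here is a minimal Python sketch. It is not DEXSeq's estimator (DEXSeq is an R/Bioconductor package with its own, more careful procedure). It is a simple method-of-moments calculation under the common parameterization variance = mean + dispersion × mean², and it assumes the counts are already comparable across samples (no library-size normalization). The function names and example counts are hypothetical.

```python
from statistics import mean, variance


def nb_variance(mu, dispersion):
    """Negative binomial variance: Poisson part (mu) plus extra-Poisson part."""
    return mu + dispersion * mu ** 2


def moment_dispersion(counts):
    """Method-of-moments dispersion estimate from replicate counts.

    Solves sample_variance = mean + dispersion * mean^2 for the dispersion,
    clipping at zero when the replicates vary no more than Poisson noise would.
    """
    if len(counts) < 2:
        raise ValueError("need at least two replicates")
    m = mean(counts)
    if m == 0:
        return 0.0
    return max(0.0, (variance(counts) - m) / m ** 2)


# Hypothetical counts for one exon across three biological replicates.
replicate_counts = [80, 100, 120]
dispersion = moment_dispersion(replicate_counts)
print(dispersion, nb_variance(mean(replicate_counts), dispersion))
```

With only a handful of replicates an estimate like this is very noisy for any single exon, which is one reason dedicated packages share information across features rather than estimating each dispersion in isolation.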

The software is available at: http://watson.nci.nih.gov/bioc_mirror/packages/2.9/bioc/html/DEXSeq.html

  • 71 Total Respondents
  • More than twice as many of you are performing RNA-Seq for gene expression as for the next highest application, novel transcript detection.

Given all that RNA-Seq is capable of providing, it is interesting to me that most of you are using RNA-Seq for gene expression analysis, something for which microarrays and PCR have proven more than adequate for the past two decades. However, since this is the case, we thought it would be beneficial to provide some info. (See post below – http://rna-seqblog.com/information/the-magic-of-rna-seq/)

There certainly is a lot of excitement and buzz surrounding RNA-Seq’s forecasted replacement of microarrays for gene expression analysis. (I wonder… who could be generating this hype?) From speaking to those interested in RNA-Seq for gene expression profiling, it seems there is something of a frenzy, and a notion that RNA-Seq has some kind of magical power, so that the normal rules of good experimental design and practice (e.g., the need for replicate samples) don’t seem to apply. This is possibly due in part to the fact that it is still so new and still very expensive. We did a quick scan of some recent reviews to put together the following points for discussion.

The legume pod borer, Maruca vitrata (Lepidoptera: Crambidae), is an insect pest species of crops grown by subsistence farmers in tropical regions of Africa. The authors present the de novo assembly of 3729 contigs from 454- and Sanger-derived sequencing reads for midgut, salivary, and whole adult tissues of this non-model species.

RNA-Seq Gene Expression

Alternative RNA-Seq application schemas. (a) In an iterative approach, high-abundance transcripts can be identified in low-read sequencing runs, followed by iterative subtraction of the sequences dominating each sample. A profile from the combined runs promises higher measurement precision of expression levels for weakly to moderately expressed transcripts. (b) After normalization of an aliquot (top row), the strength of RNA-Seq in de novo sequence discovery can be exploited for the compilation of a comprehensive target library, against which a custom microarray can then be designed easily (Leparc et al., 2009). The remaining aliquot can then be quantitatively profiled on this optimized array (bottom row). The performance of both approaches, of course, depends on the quality of the subtraction or normalization step, respectively.

  • Labaj PP, Leparc GG, Linggi BE, Markillie LM, Wiley HS, Kreil DP. (2011) Characterization and improvement of RNA-Seq precision in quantitative transcript expression profiling. Bioinformatics 27(13), i383-91.
