by Jeffrey M. Perkel

When you get right down to it, the difference between, say, a skin cell and a kidney cell is a matter of gene expression. Nearly all cells in an organism carry the same DNA; it’s the proteins they produce that define their behavior. The instructions to build those proteins are carried by RNA, and researchers have long recognized the value of probing RNAs to gain insight into the expression differences that define tissues, developmental stages and disease.

RNA-Seq vs microarray

Just a few years ago, researchers who wanted to get a 30,000-foot overview of the transcriptional state of a cell—the so-called “transcriptome,” or cellular RNA content—had one option: DNA microarrays. But the rise of next-generation DNA sequencing (NGS) technologies, coupled with plummeting prices, has shifted the technology landscape.

Today, transcriptome analysis is performed most commonly using an NGS application called RNA-seq, in which some RNA pool—total RNA, messenger RNA or noncoding RNA, for instance—is reverse-transcribed into cDNA, converted into a sequencing library, sequenced and analyzed.

The technique offers several advantages over DNA microarrays, says John Marioni, research group leader at the European Bioinformatics Institute on the Wellcome Trust Genome Sciences Campus in Cambridge, UK. Most obviously, RNA-seq works even for species for which no reference genome or DNA microarray exists. Microarrays cannot be built without at least a partial genome sequence and some understanding of what sequences the researcher is looking for. And microarray manufacturers produce chips mostly for the classic laboratory models: Drosophila and C. elegans, mouse and rat.

“If you want to look at organisms way down the evolutionary ladder, like sponges or marine mollusks, there’s no way to do that with arrays,” Marioni says.

In contrast, RNA-seq is unbiased in the sense that it needs no predefined probes. It reads whatever cDNA is in the sample, regardless of whether researchers have seen that sequence before or not.

Marioni, a statistician and computational biologist who develops tools for analyzing RNA-seq data, has been using the technique since 2008. This year he co-authored a paper in which he applied it to genetic differences and variation among 16 mammalian species, including 11 non-human primates, of which seven had “little or no genomic data . . . previously available.” [1]

His goal, he says, is to create tools that can turn raw data into biological insights. “The idea is you get counts of transcripts from primate livers, and you want to develop models to take in count data and get biological inferences out, so you can know these are not chance events and there is meaningful data in the numbers you’ve analyzed,” Marioni explains.

RNA-seq offers other advantages as well. It has a wider dynamic range than microarrays and can generally pick up less abundant transcripts. And unlike microarrays, which report relative expression values based on fluorescence intensity, RNA-seq can report those abundances absolutely, because it counts the transcripts that it reads. Finally, RNA-seq can reveal transcript structure and splicing and can even identify novel isoforms, gene fusions, allele-specific variants and the like.
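To make the idea of counting concrete, here is a minimal sketch of how raw read counts are often turned into comparable abundance values. Because a longer transcript yields more reads at the same expression level, counts are commonly divided by transcript length before genes are compared, using measures such as RPKM or TPM. The function names `rpkm` and `tpm`, the gene names and all the numbers below are made up for illustration; they are not from Marioni's work or from any particular software package.

```python
"""Toy length normalization of RNA-seq read counts (illustrative only)."""


def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total_reads = sum(counts.values())
    if total_reads == 0:
        raise ValueError("no reads to normalize")
    return {
        gene: n * 1e9 / (total_reads * lengths_bp[gene])
        for gene, n in counts.items()
    }


def tpm(counts, lengths_bp):
    """Transcripts per million: divide by length first, then rescale to 1e6."""
    rate = {gene: n / lengths_bp[gene] for gene, n in counts.items()}
    scale = sum(rate.values())
    if scale == 0:
        raise ValueError("no reads to normalize")
    return {gene: r * 1e6 / scale for gene, r in rate.items()}


if __name__ == "__main__":
    # Hypothetical read counts and transcript lengths (in base pairs).
    counts = {"geneA": 100, "geneB": 100, "geneC": 300}
    lengths = {"geneA": 1000, "geneB": 2000, "geneC": 3000}

    rpkm_values = rpkm(counts, lengths)
    tpm_values = tpm(counts, lengths)
    for gene in counts:
        print(gene, counts[gene], round(rpkm_values[gene]), round(tpm_values[gene]))
```

In this toy data set geneA and geneB have the same raw count, but geneB is twice as long, so its normalized abundance comes out at half that of geneA. Real pipelines also have to deal with multi-mapped reads, effective transcript lengths and differences in library composition, none of which this sketch attempts.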

Naturally, given its growing popularity, tools for performing RNA-seq are widely available, and more are coming to market. Whether it’s sample preparation on the front end or bioinformatics analysis on the back end, you’re sure to find a tool to fit your needs.


Comments

One Response to “Transcriptome Analysis Using RNA-Seq – Biocompare”

  1. Steve on September 6th, 2012 2:30 pm

    I wouldn’t necessarily say RNA-Seq is unbiased… there’s definitely some bias due to transcript length and GC content.
