Differential analysis of gene and transcript expression using high-throughput RNA sequencing (RNA-seq) is complicated by several sources of measurement variability and poses numerous statistical challenges.

A team led by researchers at the University of California, Berkeley has released an update to Cuffdiff, their approach to differential expression analysis in Cufflinks. The update, Cuffdiff 2, is an algorithm that estimates expression at transcript-level resolution and controls for variability evident across replicate libraries. Cuffdiff 2 robustly identifies differentially expressed transcripts and genes and reveals differential splicing and promoter-preference changes.

In this article, they demonstrate the accuracy of their approach through differential analysis of lung fibroblasts in response to loss of the developmental transcription factor HOXA1, which they show is required for lung fibroblast and HeLa cell cycle progression. Loss of HOXA1 results in significant expression level changes in thousands of individual transcripts, along with isoform switching events in key regulators of the cell cycle. Cuffdiff 2 performs robust differential analysis in RNA-seq experiments at transcript resolution, revealing a layer of regulation not readily observable with other high-throughput technologies.

Availability – Cuffdiff 2 is available at: http://cufflinks.cbcb.umd.edu/

Trapnell C, Hendrickson DG, Sauvageau M, Goff L, Rinn JL, Pachter L. (2012) Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat Biotechnol [Epub ahead of print]. [abstract]
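
Why variability across replicate libraries matters can be illustrated with a small calculation. The sketch below is not Cuffdiff 2's statistical model; it only compares the mean and variance of one transcript's fragment counts across replicates. Under a pure Poisson counting model the variance would equal the mean, so a variance-to-mean ratio well above 1 hints at extra replicate-to-replicate variability. The function name and the counts are hypothetical.

```python
from statistics import mean, variance


def dispersion_summary(replicate_counts):
    """Return (mean, sample variance, variance/mean) for one transcript's
    fragment counts across replicate libraries."""
    if len(replicate_counts) < 2:
        raise ValueError("need at least two replicates to estimate variance")
    m = mean(replicate_counts)
    v = variance(replicate_counts)  # sample variance, n - 1 denominator
    ratio = v / m if m > 0 else float("nan")
    return m, v, ratio


# Hypothetical fragment counts for one transcript in three replicates.
counts = [100, 140, 90]
m, v, ratio = dispersion_summary(counts)
print(f"mean={m}, variance={v}, variance/mean={ratio:.2f}")
```

A test that assumes Poisson noise alone would treat such a transcript as far less variable than the replicates show, which is one reason replicate-aware methods are preferred.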


Bellerophontes is a new framework for detecting fusion transcripts from short paired-end reads. It integrates splicing-driven alignment and abundance estimation analysis, which yields a more accurate set of reads supporting junction discovery and also takes unannotated transcripts into account. Bellerophontes selects putative junctions on the basis of a match to an accurate gene fusion model. It runs on top of the TopHat and Cufflinks tools (developed by Trapnell et al.): the analysis is based on the results of TopHat alignment and Cufflinks transcript isoform detection.

Availability – The Bellerophontes Java/Perl/Bash software implementation is free and available at http://eda.polito.it/bellerophontes/

  • Abate F, Acquaviva A, Paciello G, Ficarra E, Ferrarini A, Delledonne M, Soverini S, Martinelli G, Macii E. (2012) Bellerophontes: an RNA-Seq data analysis framework for chimeric transcripts discovery based on accurate fusion model. Bioinformatics [Epub ahead of print]. [abstract]


Recent advances in high-throughput cDNA sequencing (RNA-seq) can reveal new genes and splice variants and quantify expression genome-wide in a single assay. The volume and complexity of data from RNA-seq experiments necessitate scalable, fast and mathematically principled analysis software. TopHat and Cufflinks are free, open-source software tools for gene discovery and comprehensive expression analysis of high-throughput mRNA sequencing (RNA-seq) data. Together, they allow biologists to identify new genes and new splice variants of known ones, as well as compare gene and transcript expression under two or more conditions.

This protocol describes in detail how to use TopHat and Cufflinks to perform such analyses. It also covers several accessory tools and utilities that aid in managing data, including CummeRbund, a tool for visualizing RNA-seq analysis results. Although the procedure assumes basic informatics skills, these tools assume little to no background with RNA-seq analysis and are meant for novices and experts alike. The protocol begins with raw sequencing reads and produces a transcriptome assembly, lists of differentially expressed and regulated genes and transcripts, and publication-quality visualizations of analysis results. The protocol’s execution time depends on the volume of transcriptome sequencing data and available computing resources but takes less than 1 d of computer time for typical experiments and ~1 h of hands-on time.

  • Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, Pimentel H, Salzberg SL, Rinn JL, Pachter L. (2012) Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc 7(3), 562-78. [article]
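
The overall flow of such an analysis (align with TopHat, assemble with Cufflinks, merge assemblies with Cuffmerge, test with Cuffdiff) can be sketched as a small script that only builds the command lines and does not run them. This is a rough outline under stated assumptions, not the protocol itself: every file name, sample name and directory below is hypothetical, the helper `build_commands` is made up for illustration, and the options shown should be checked against the manuals for the installed versions.

```python
def build_commands(samples, bowtie_index, annotation_gtf, genome_fa, threads=4):
    """Build TopHat/Cufflinks/Cuffmerge/Cuffdiff argument lists.

    samples maps a condition label to a list of
    (replicate_name, reads_1, reads_2) tuples. Nothing is executed here.
    """
    commands, assemblies, bams = [], [], {}
    for condition, replicates in samples.items():
        for name, reads_1, reads_2 in replicates:
            th_out, cl_out = f"{name}_thout", f"{name}_clout"
            # Spliced alignment of paired-end reads against the genome index.
            commands.append(["tophat", "-p", str(threads), "-G", annotation_gtf,
                             "-o", th_out, bowtie_index, reads_1, reads_2])
            bam = f"{th_out}/accepted_hits.bam"
            # Transcript assembly for this replicate.
            commands.append(["cufflinks", "-p", str(threads), "-o", cl_out, bam])
            assemblies.append(f"{cl_out}/transcripts.gtf")
            bams.setdefault(condition, []).append(bam)
    # assemblies.txt must list the paths in `assemblies`, one per line.
    commands.append(["cuffmerge", "-g", annotation_gtf, "-s", genome_fa,
                     "-p", str(threads), "assemblies.txt"])
    # One comma-separated group of BAM files per condition.
    commands.append(["cuffdiff", "-o", "diff_out", "-b", genome_fa,
                     "-p", str(threads), "-L", ",".join(bams), "-u",
                     "merged_asm/merged.gtf"]
                    + [",".join(group) for group in bams.values()])
    return commands, assemblies


# Hypothetical two-condition experiment with two replicates each.
samples = {
    "C1": [("C1_R1", "C1_R1_1.fq", "C1_R1_2.fq"), ("C1_R2", "C1_R2_1.fq", "C1_R2_2.fq")],
    "C2": [("C2_R1", "C2_R1_1.fq", "C2_R1_2.fq"), ("C2_R2", "C2_R2_1.fq", "C2_R2_2.fq")],
}
commands, assemblies = build_commands(samples, "genome", "genes.gtf", "genome.fa")
for cmd in commands:
    print(" ".join(cmd))
```

Printing the commands first makes it easy to review them before passing each list to something like `subprocess.run`. The Cuffdiff output directory can then be loaded into CummeRbund for visualization.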


LSCC: Monday, October 17, 2011 – 9:30am – 1:00pm

VLSCI Boardroom

Topics covered:

  • Mapping RNA-seq data using TopHat and Bowtie
  • Analyzing and comparing transcripts using Cufflinks
  • Visualising data using IGV

You will need to bring your own laptop (Macs and Linux machines work better than PCs, but any laptop is OK).

Registration essential: Places are limited. To register, email course coordinator Dr Nathan Hall (below) and please include one or two sentences about the type of NGS research you are doing.

RSVP Email Address for this Event: nhal@unimelb.edu.au

More info: http://www.vlsci.org.au/events/lscc-workshop-rna-seq-analysis


Cufflinks assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each one.
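
Cufflinks reports abundances in FPKM (fragments per kilobase of transcript per million mapped fragments). The snippet below is a simplified illustration of that unit only. It assumes every fragment is assigned unambiguously to one transcript, whereas Cufflinks itself uses a statistical model to handle fragments that are compatible with several isoforms. The function name and the numbers are hypothetical.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2,000 bp transcript
# in a library of 20 million mapped fragments.
value = fpkm(500, 2000, 20_000_000)
print(value)
```

Dividing by both transcript length and library size is what makes values comparable between long and short transcripts and between libraries sequenced to different depths.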


Technical Guides

Discussion Forums

  • The RNA-Seq Blog – A discussion forum for all things transcriptomic.
  • SEQanswers – The next-generation sequencing community – threads tagged with RNA-Seq.

Webinars

  • An Illumina-Demonstrated Method for Sequencing the Complete Transcriptome – This session will introduce an improved solution for the reduction of abundant transcripts in RNA-Seq experiments, based on an Illumina-optimized protocol utilizing duplex-specific nuclease (DSN) from Evrogen. Illumina scientists will provide a brief overview of DSN, describe the enhancements made to the DSN workflow to optimize its performance for Illumina RNA-Seq, and demonstrate its utility in a wide range of applications, including ncRNA discovery and FFPE transcriptome profiling.

RNA-Seq Data Analysis Tools

  • rQuant.web – is a web service that provides convenient access to tools for the quantitative analysis of RNA-Seq data. It determines the abundances of multiple transcripts per gene locus from RNA-Seq measurements. rQuant.web is available free of charge to all users as a tool in a Galaxy installation.
  • Scripture – is a method for transcriptome reconstruction that relies solely on RNA-Seq reads and an assembled genome to build a transcriptome ab initio.
  • Cufflinks – assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each one.
  • SpliceMap – SpliceMap is a de novo splice junction discovery tool. It offers high sensitivity and support for arbitrarily long RNA-seq read lengths.
  • TopHat – is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons.
  • PALMapper – a combination of the spliced alignment method QPALMA with the short read alignment tool GenomeMapper. The resulting method, called PALMapper, efficiently computes both spliced and unspliced alignments at high accuracy while taking advantage of base quality information and splice site predictions.
  • RNA-MATE – A recursive mapping strategy for high-throughput RNA-sequencing data.
  • ERANGE – Mapping and Quantifying Mammalian Transcriptomes by RNA-Seq
  • SeqMap – A Tool For Mapping Millions Of Short Sequences To The Genome.
  • Bioconductor – Bioconductor is an open source and open development software project for the analysis and comprehension of genomic data.
  • BWA – BWA is a fast light-weighted tool that aligns relatively short sequences (queries) to a sequence database (target), such as the human reference genome.
  • CisGenome – An integrated tool for tiling array, ChIP-seq, genome and cis-regulatory element analysis.
  • GenePattern – is a powerful genomic analysis platform that provides access to more than 100 tools for gene expression analysis, proteomics, SNP analysis and common data processing tasks. A web-based interface provides easy access to these tools and allows the creation of multi-step analysis pipelines that enable reproducible in silico research.
  • Galaxy – Mapping pipeline for Illumina, 454, and SOLiD sequencing data.
  • MAQ – stands for Mapping and Assembly with Quality. It builds assemblies by mapping short reads to reference sequences.
  • UCSC Genome Browser – This site contains the reference sequence and working draft assemblies for a large collection of genomes. It also provides portals to the ENCODE and Neandertal projects.


High-throughput mRNA sequencing (RNA-Seq) promises simultaneous transcript discovery and abundance estimation [1, 2, 3]. However, this would require algorithms that are not restricted by prior gene annotations and that account for alternative transcription and splicing. Here we introduce such algorithms in an open-source software program called Cufflinks. To test Cufflinks, we sequenced and analyzed >430 million paired 75-bp RNA-Seq reads from a mouse myoblast cell line over a differentiation time series. We detected 13,692 known transcripts and 3,724 previously unannotated ones, 62% of which are supported by independent expression data or by homologous genes in other species. Over the time series, 330 genes showed complete switches in the dominant transcription start site (TSS) or splice isoform, and we observed more subtle shifts in 1,304 other genes. These results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.

Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L. (2010) Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol 28(5), 511-15.
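
The idea of a switch in the dominant isoform can be made concrete with a toy comparison. The sketch below is not the analysis used in the paper: it simply picks the most abundant isoform of each gene in two conditions and reports genes where that isoform differs, with no statistical testing. The function names, gene names and abundance values are all hypothetical.

```python
def dominant_isoform(abundances):
    """Return the isoform ID with the highest abundance (first one on ties)."""
    return max(abundances, key=abundances.get)


def find_dominant_switches(condition_a, condition_b):
    """Map gene -> (dominant isoform in A, dominant isoform in B) for genes
    whose dominant isoform differs between the two conditions.

    Each condition maps gene -> {isoform_id: abundance, e.g. FPKM}.
    """
    switches = {}
    for gene in sorted(condition_a.keys() & condition_b.keys()):
        a, b = condition_a[gene], condition_b[gene]
        if not a or not b:
            continue  # no isoform estimates for this gene in one condition
        dom_a, dom_b = dominant_isoform(a), dominant_isoform(b)
        if dom_a != dom_b:
            switches[gene] = (dom_a, dom_b)
    return switches


# Hypothetical isoform abundances at two time points.
early = {"geneX": {"X.1": 40.0, "X.2": 5.0}, "geneY": {"Y.1": 12.0, "Y.2": 3.0}}
late = {"geneX": {"X.1": 4.0, "X.2": 35.0}, "geneY": {"Y.1": 20.0, "Y.2": 6.0}}
switches = find_dominant_switches(early, late)
print(switches)
```

In practice, abundance estimates carry uncertainty, so a real analysis would also ask whether the change in isoform proportions is statistically supported rather than relying on the raw ranking.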

