Mar 8
Protocol – Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks
Recent advances in high-throughput cDNA sequencing (RNA-seq) can reveal new genes and splice variants and quantify expression genome-wide in a single assay. The volume and complexity of data from RNA-seq experiments necessitate scalable, fast and mathematically principled analysis software. TopHat and Cufflinks are free, open-source software tools for gene discovery and comprehensive expression analysis of high-throughput mRNA sequencing (RNA-seq) data. Together, they allow biologists to identify new genes and new splice variants of known ones, as well as compare gene and transcript expression under two or more conditions.
This protocol describes in detail how to use TopHat and Cufflinks to perform such analyses. It also covers several accessory tools and utilities that aid in managing data, including CummeRbund, a tool for visualizing RNA-seq analysis results. Although the procedure assumes basic informatics skills, these tools assume little to no background with RNA-seq analysis and are meant for novices and experts alike. The protocol begins with raw sequencing reads and produces a transcriptome assembly, lists of differentially expressed and regulated genes and transcripts, and publication-quality visualizations of analysis results. The protocol’s execution time depends on the volume of transcriptome sequencing data and available computing resources but takes less than 1 d of computer time for typical experiments and ~1 h of hands-on time.
- Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, Pimentel H, Salzberg SL, Rinn JL, Pachter L. (2012) Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc 7(3), 562-78.
Comments
4 Responses to “Protocol – Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks”
A nice paper, but:
1: 6 fastq files are used as examples, but TopHat cannot process C1_R2_* and C2_R2_*;
2: there is a typo in the cuffdiff command;
3: I got an error message when running cummeRbund.
I emailed the author; no response yet.
> cuff_data <- readCufflinks('diff_out')
Creating database diff_out/cuffData.db
Reading diff_out/genes.fpkm_tracking
Checking samples table…
Populating samples table…
Writing genes table
Reshaping geneData table
Recasting
Writing geneData table
Error in sqliteExecStatement(con, statement, bind.data) :
RS-DBI driver: (unable to bind data for parameter ':status')
Well, I got a response from the author:
My first point is right: the latest version of TopHat cannot handle C1_R2_* and C2_R2_*.
My other two points are not correct.
There is no typo in the cuffdiff command in their paper. You have to be very careful with the commas between the file names.
I have the same error when using cummeRbund. Did you get an answer from the authors about that point, or did you manage to make it work?
Thanks
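With help from the author, I figured it out:
The cummeRbund error is caused by incorrect usage of cuffdiff.
Be careful with the cuffdiff command, especially the commas: replicate BAM files from the same condition are joined with commas (no spaces), and the conditions themselves are separated by spaces.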
Wrong one:
cuffdiff -o diff_out -b genome.fa -p 8 -L C1,C2 -u merged_asm/merged.gtf ./C1_R1_thout/accepted_hits.bam ./C1_R3_thout/accepted_hits.bam ./C2_R1_thout/accepted_hits.bam ./C2_R3_thout/accepted_hits.bam
Correct one:
cuffdiff -o diff_out -b genome.fa -p 8 -L C1,C2 -u merged_asm/merged.gtf ./C1_R1_thout/accepted_hits.bam,./C1_R3_thout/accepted_hits.bam ./C2_R1_thout/accepted_hits.bam,./C2_R3_thout/accepted_hits.bam
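One way to avoid the comma mistake is to build each condition's replicate list programmatically rather than typing it by hand. The following is only a minimal sketch under stated assumptions: the helper name join_by_comma is hypothetical (it is not part of Cufflinks), the file paths are the ones from the commands above, and the final command is printed rather than executed so it can be inspected first.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cufflinks): join all arguments with
# commas and no spaces, the form used for replicates of one condition.
join_by_comma() {
    old_ifs=$IFS
    IFS=,
    joined="$*"        # "$*" joins arguments with the first character of IFS
    IFS=$old_ifs
    printf '%s\n' "$joined"
}

# One comma-joined list per condition, in the same order as the -L labels.
c1=$(join_by_comma ./C1_R1_thout/accepted_hits.bam ./C1_R3_thout/accepted_hits.bam)
c2=$(join_by_comma ./C2_R1_thout/accepted_hits.bam ./C2_R3_thout/accepted_hits.bam)

# Print the command instead of running it; remove "echo" to execute.
echo cuffdiff -o diff_out -b genome.fa -p 8 -L C1,C2 -u merged_asm/merged.gtf "$c1" "$c2"
```

With two labels in -L C1,C2, the printed command ends with exactly two space-separated arguments, one per condition, which matches the correct form above.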