From initial seed germination through reproduction, plants continuously reprogram their transcriptional repertoire to facilitate growth and development. This dynamic is mediated by a diverse but inextricably-linked catalog of regulatory proteins called transcription factors (TFs). Statistically quantifying TF binding site (TFBS) abundance in promoters of differentially expressed genes can be used to identify binding site patterns in promoters that are closely related to stress-response. Output from today’s transcriptomic assays necessitates statistically-oriented software to handle large promoter-sequence sets in a computationally tractable fashion.


Researchers at NIH have developed Marina, an open-source tool for identifying over-represented TFBSs among large sets of promoter sequences, using an ensemble of 7 statistical metrics and binding-site profiles. In a software comparison, they show that Marina identifies considerably more over-represented plant TFBSs than a popular alternative.

Marina was used to identify over-represented TFBSs in a two time-point RNA-Seq study exploring the transcriptomic interplay between soybean (Glycine max) and soybean rust (Phakopsora pachyrhizi). Marina identified numerous abundant TFBSs recognized by transcription factors associated with defense response, such as WRKY, HY5 and MYB2. Comparing results from Marina with those of a popular alternative suggests that, regardless of the number of promoter sequences, Marina identifies significantly more over-represented TFBSs.
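To make the idea of an over-represented binding site concrete, here is a minimal Python sketch of one common over-representation statistic, a one-sided hypergeometric test. The post does not name Marina's seven metrics, so this should be read as an illustration of the general approach, not as Marina's implementation. The function names and the toy sequences are made up for the example; "TTGAC" is the W-box core recognized by WRKY factors. Only the forward strand is scanned, which is a simplification.

```python
from math import comb

def count_with_site(promoters, motif):
    """Number of promoter sequences that contain the motif at least once."""
    return sum(motif in seq.upper() for seq in promoters)

def enrichment_p_value(hits_fg, n_fg, hits_bg, n_bg):
    """One-sided hypergeometric p-value: the chance of seeing at least
    hits_fg motif-containing promoters in a foreground of size n_fg drawn
    from the combined foreground + background set."""
    total = n_fg + n_bg
    with_site = hits_fg + hits_bg
    upper = min(n_fg, with_site)
    favourable = sum(
        comb(with_site, k) * comb(total - with_site, n_fg - k)
        for k in range(hits_fg, upper + 1)
    )
    return favourable / comb(total, n_fg)

# Toy, made-up promoter fragments (hypothetical data for illustration only).
foreground = ["ACGTTGACCA", "TTGACTTTGA", "GGGTTGACGG", "ACACACACAC"]
background = ["ACGTACGTAC", "GGGCCCGGGC", "TTGACAAAAA",
              "CACACACACA", "GTGTGTGTGT", "AAAATTTTAA"]

k_fg = count_with_site(foreground, "TTGAC")
k_bg = count_with_site(background, "TTGAC")
p = enrichment_p_value(k_fg, len(foreground), k_bg, len(background))
print(k_fg, k_bg, round(p, 4))
```

With real data, the foreground would be the promoters of differentially expressed genes and the background a much larger set of control promoters; an ensemble approach would combine several such statistics instead of relying on one.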

Availability: http://mason.gmu.edu/~phossein/marina

  • Hosseini P, Ovcharenko I, Matthews BF. (2013) Using an ensemble of statistical metrics to quantify large sets of plant transcription factor binding sites. Plant Methods 9(1), 12. [Epub ahead of print]. [article]


The RegulatoryGenomics website posts and updates a comprehensive list of tools for RNA-Seq analysis.

This is their current version.

Spliced-mappers

| Method | Reference | Web-site |
|---|---|---|
| TopHat | (Trapnell et al. 2009) | http://tophat.cbcb.umd.edu/ |
| MapSplice | (Wang et al. 2010) | http://www.netlab.uky.edu/p/bioinfo/MapSplice |
| SpliceMap | (Au et al. 2010) | http://www.stanford.edu/group/wonglab/SpliceMap/ |
| HMMSplicer | (Dimon et al. 2010) | http://derisilab.ucsf.edu/index.php?software=105 |
| TrueSight | (Li et al. 2012b) | http://bioen-compbio.bioen.illinois.edu/TrueSight/ |
| SOAPsplice | (Huang et al. 2011) | http://soap.genomics.org.cn/soapsplice.html |
| PASSion | (Zhang et al. 2012) | https://trac.nbic.nl/passion |
| PALMapper | (Jean et al. 2010) | http://galaxy.raetschlab.org/ |
| SplitSeek | (Ameur et al. 2010) | http://solidsoftwaretools.com/gf/project/splitseek |
| Supersplat | (Bryant et al. 2010) | http://mocklerlab-tools.cgrb.oregonstate.edu/ |
| SeqSaw | (Wang et al. 2011) | http://bioinfo.au.tsinghua.edu.cn/software/seqsaw |
| MapNext | (Bao et al. 2009) | http://evolution.sysu.edu.cn/english/software/mapnext.htm |
| STAR | (Dobin et al. 2012) | http://gingeraslab.cshl.edu/STAR/ |
| GSNAP | (Wu et al. 2010) | http://research-pub.gene.com/gmap/ |
| QPALMA | (De Bona et al. 2008) | http://www.raetschlab.org/suppl/qpalma |
| OSA | (Hu et al. 2012) | http://omicsoft.com/osa/ |


Precise identification of RNA-coding regions and transcriptomes of eukaryotes is a significant problem in biology. Currently, eukaryote transcriptomes are analyzed using deep short-read sequencing experiments of complementary DNAs. The resulting short reads are then aligned against a genome and annotated junctions to infer biological meaning.

Here, scientists at Stanford University use long-read complementary DNA datasets for the analysis of a eukaryotic transcriptome and generate two large datasets in the human K562 and HeLa S3 cell lines. Both datasets comprised at least 4 million reads and had median read lengths greater than 500 bp. They show that annotation-independent alignments of these reads provide partial gene structures that are largely in line with annotated gene structures, 15% of which have not been obtained in a previous analysis of short reads. For long noncoding RNA (lncRNA) genes, however, they find an increased fraction of novel gene structures among their alignments. Other important aspects of transcriptome analysis, such as the description of cell type-specific splicing, can be performed in an annotation-free manner, which suits the analysis of transcriptomes of newly sequenced genomes. Furthermore, they demonstrate that long-read sequences can be assembled into full-length transcripts with considerable success. This method is applicable to all long-read sequencing technologies.

Tilgner H, Raha D, Habegger L, Mohiuddin M, Gerstein M, Snyder M. (2013) Accurate identification and analysis of human mRNA isoforms using deep long read sequencing. G3 (Bethesda) 3(3), 387-97. [abstract]


The whole-genome sequences of many non-model organisms have recently been determined. Using these genome sequences, next-generation sequencing based experiments such as RNA-Seq and ChIP-seq have been performed and comparisons of the experiments between related species have provided new knowledge about evolution and biological processes. Although these comparisons require transformation of the genome coordinates of the reads between the species, current software tools are not suitable to convert the massive numbers of reads to the corresponding coordinates of other species’ genomes.

Now, researchers at Ochanomizu University and the Tokyo Institute of Technology, Japan have developed a set of programs, called REad COordinate Transformer (RECOT), for comparing RNA-seq, ChIP-seq and CLIP-seq sequences between closely related species. RECOT transforms the coordinates of short reads obtained from the genome of the query species being studied to those of a comparison target species, after aligning the query and target gene/genome sequences. RECOT generates output in SAM format that can be viewed using recent genome browsers capable of displaying next-generation sequencing data.

They demonstrate the usefulness of RECOT in comparing ChIP-seq results between two closely related fruit flies. The results indicate position changes of a transcription factor binding site caused by sequence polymorphisms at the binding site.
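The coordinate transformation described above can be pictured with a small Python sketch. It assumes the query-to-target genome alignment has already been reduced to ungapped, same-strand blocks of the form (query_start, target_start, length); that block representation and the function names are hypothetical choices for this example and are not taken from RECOT itself.

```python
from bisect import bisect_right

def transform_position(blocks, qpos):
    """Map a 0-based query position to the target genome.

    blocks: list of (query_start, target_start, length) tuples,
    sorted by query_start and non-overlapping. Returns None when
    the position falls outside every aligned block."""
    starts = [q0 for q0, _, _ in blocks]
    i = bisect_right(starts, qpos) - 1
    if i < 0:
        return None
    q0, t0, length = blocks[i]
    if qpos >= q0 + length:
        return None
    return t0 + (qpos - q0)

def transform_read(blocks, start, length):
    """Map a read to a half-open target interval, or None if either
    end of the read lies in an unaligned region."""
    t_first = transform_position(blocks, start)
    t_last = transform_position(blocks, start + length - 1)
    if t_first is None or t_last is None:
        return None
    return (t_first, t_last + 1)

# Two hypothetical aligned blocks with an unaligned query gap at 50..59.
blocks = [(0, 100, 50), (60, 150, 40)]
print(transform_read(blocks, 10, 20))  # (110, 130)
print(transform_read(blocks, 52, 5))   # None
```

A real tool also has to handle reverse-strand alignments, indels inside reads, and writing the transformed records back out in SAM format, none of which this sketch attempts.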

Availability – RECOT is available at: http://sesejun.github.com/recot/

Izawa A, Sese J. (2013) A tool for the coordinate transformation of next-generation sequencing reads for comparative genomics and transcriptomics. Source Code Biol Med 8(1), 6. [Epub ahead of print]. [abstract]


Bioinformatics has published a Next-Gen Sequencing “Virtual Issue” covering all the sequencing tools that appeared in the journal.  We have listed those described as applicable to RNA-Seq.

Statistical Inferences for Isoform Expression in RNA-Seq.
Hui Jiang and Wing Wong
Bioinformatics (2009) 25: 1026–1032 Full Text

A toolkit for analysing large-scale plant small RNA datasets
Simon Moxon et al.
Bioinformatics (2008) 24: 2252-2253 Full Text

TopHat: discovering splice junctions with RNA-Seq
Cole Trapnell et al.
Bioinformatics (2009) 25: 1105–1111 Full Text

| Tool | Biological domain | Bioinformatics method | Input / output formats |
|---|---|---|---|
| CLCbio Genomics Workbench | Genomics, Whole Genome Resequencing, De-novo assembly, SNP discovery, InDel discovery, ChIP-Seq, RNA-Seq Alignment, MiRNA | Mapping, Assembly, Alignment, Colorspace | In: FASTA, FASTQ, GenBank, SAM, BAM, Illumina Bustard, ELAND, CSFASTA/CSQUAL (ABI SOLiD). Out: FASTA, FASTQ, GFF, GenBank, SAM, BAM, ACE, Nexus |
| CPTRA | RNA-Seq Alignment, RNA-Seq Quantitation | | |
| Cufflinks | RNA-Seq Alignment, RNA-Seq Quantitation, Differential Expression, Alternative Splicing, De novo transcriptome assembly, RNA-Seq | Assembly, Differentially expressed gene identification, statistical testing | In: SAM. Out: GTF |
| ERANGE | RNA-Seq Alignment, RNA-Seq Quantitation, ChIP-Seq, Allele-specific transcription | | |
| Est2assembly | RNA-Seq Alignment, Genomics | Assembly | |
| FreClu | RNA-Seq Alignment | Mapping | |
| G-Mo.R-Seq | RNA-Seq Alignment | | |
| GSNAP | RNA-Seq Alignment, DNA methylation | Mapping, Bisulfite mapping | |
| HMMSplicer | RNA-Seq Alignment | | |
| MapNext | SNP discovery, RNA-Seq Alignment | Alignment | FASTA, FASTQ |
| MapSplice | RNA-Seq Alignment | Mapping | FASTA, FASTQ, SAM, BED |
| MIRA | De-novo assembly, SNP discovery, RNA-Seq Alignment | Smith-Waterman, graph reduction, learning algorithm, Assembly, Mapping, k-mer analysis | FASTA, FASTQ, CAF, GenBank, GBFF, EXP, PHD, SCF, XML traceinfo, ACE, CAF, EXP, HTML, TXT, TCS |
| Myrna | RNA-Seq Quantitation, RNA-Seq Alignment | Hadoop, MapReduce | |
| Nesoni | RNA-Seq Alignment, SNP discovery, Phylogenetics | Alignment | SAM, FASTA |
| Novocraft | Genomics, Whole Genome Resequencing, RNA-Seq Alignment, ChIP-Seq, MiRNA | Mapping | FASTA, FASTQ, Fasta.gz, CSFASTA/CSQUAL (ABI SOLiD), SAM, Delimited Text, TXT |
| OLego | Genomics, RNA-Seq, RNA-Seq Alignment | Mapping, Alignment | FASTA, FASTQ, SAM, BED |
| PALMA | RNA-Seq Alignment | Alignment | |
| PERalign | RNA-Seq Alignment | | SAM |
| Qpalma | RNA-Seq Alignment | Alignment | PSL-like |
| RNA-MATE | RNA-Seq Alignment, RNA-Seq Quantitation | Colorspace | |
| RSEM | RNA-Seq Alignment, RNA-Seq Quantitation | | |
| Scripture | RNA-Seq Alignment | | |
| SeqMan NGen | Genomics, De-novo assembly, De novo transcriptome assembly, Whole Genome Resequencing, SNP discovery, InDel discovery, ChIP-Seq, RNA-Seq Alignment | Mapping, Assembly, Alignment, Paired End | FASTA, FASTQ, Scarf, SFF, SQD, ACE, PHD, ABI, AB1, GFF, CSFASTA/CSQUAL (ABI SOLiD), SCF, TXT, GenBank, SEQ, BAM, SAM, SQD, ACE, FASTA |
| Sim4cc | RNA-Seq Alignment, Comparative genomics | Mapping | |
| SOCS | RNA-Seq Alignment, DNA methylation, SNP discovery | Mapping, Bisulfite mapping | In: CSFASTA/CSQUAL (ABI SOLiD). Out: Delimited Text |
| SpliceMap | RNA-Seq Alignment | Mapping | SAM |
| SplitSeek | RNA-Seq Alignment | | |
| Supersplat | RNA-Seq Alignment | Assembly | |
| TopHat | RNA-Seq Alignment | | FASTA, FASTQ, SAM, BED, WIG |
| UnoSeq | RNA-Seq Alignment, De novo transcriptome assembly | | |
| USeq | ChIP-Seq, RNA-Seq Alignment | | |


Accurately mapping RNA-Seq reads to the reference genome is a critical step for downstream analyses such as transcript assembly, isoform detection and quantification. Many tools have been developed; however, given the huge size of next-generation sequencing (NGS) datasets and the complexity of the transcriptome, RNA-Seq read mapping remains a challenge as the amount of data keeps increasing.

OSA (Omicsoft Sequence Aligner) is a fast and accurate alignment tool for RNA-Seq data. Benchmarked against existing methods, OSA improves mapping speed 4-10 fold with better sensitivity and fewer false positives.

Availability: OSA can be downloaded from http://omicsoft.com/osa. It is free to academic users. OSA has been tested extensively on Linux, Mac OS X and Windows platforms.

  • Hu J, Ge H, Newman M, Liu K. (2012) OSA: A fast and accurate alignment tool for RNA-Seq. Bioinformatics [Epub ahead of print]. [abstract]


Bulked segregant analysis (BSA) is an efficient method to map genes responsible for mutant phenotypes. BSR-Seq makes use of RNA-Seq reads to efficiently map genes even in populations for which no polymorphic markers have been previously identified.

BSR-Seq provides not only the map position of a gene responsible for a mutant phenotype but also the effects of such a mutant on global patterns of gene expression. The expression patterns of genes within the mapping interval can be used to prioritize candidate genes based on the fact that the causal gene will often be down-regulated in the mutant pool as compared to the non-mutant pool. In addition, this strategy yields a collection of polymorphic SNPs that are tightly linked to the mutant. These SNPs could be used to fine map the mutant or clone the affected gene via chromosome walking. Hence, BSR-Seq is not only an efficient strategy for mapping genes, but also yields other data that facilitate gene cloning.
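As a rough illustration of how pooled reads can point to SNPs linked to a mutant, the Python sketch below compares allele frequencies between the mutant and non-mutant pools and ranks SNPs by the difference. This is a deliberately simplified stand-in and not the statistical procedure used by BSR-Seq, which the post does not describe; the SNP names, read counts, function names and the `min_depth` cutoff are all hypothetical.

```python
def allele_freq(alt_reads, ref_reads):
    """Fraction of reads carrying the alternative allele (None if no reads)."""
    depth = alt_reads + ref_reads
    return alt_reads / depth if depth else None

def rank_snps(snps, min_depth=10):
    """Rank SNPs by the allele-frequency difference between pools.

    snps: iterable of (name, mut_alt, mut_ref, wt_alt, wt_ref) read counts.
    SNPs with fewer than min_depth reads in either pool are skipped."""
    scored = []
    for name, mut_alt, mut_ref, wt_alt, wt_ref in snps:
        if mut_alt + mut_ref < min_depth or wt_alt + wt_ref < min_depth:
            continue
        delta = allele_freq(mut_alt, mut_ref) - allele_freq(wt_alt, wt_ref)
        scored.append((name, round(delta, 3)))
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)

# Made-up read counts: (name, mutant alt, mutant ref, non-mutant alt, non-mutant ref)
toy_snps = [
    ("snp1", 48, 2, 15, 35),   # strongly skewed toward the mutant pool
    ("snp2", 26, 24, 24, 26),  # no meaningful difference
    ("snp3", 3, 2, 1, 4),      # too few reads, skipped
]
ranked = rank_snps(toy_snps)
print(ranked)
```

Because RNA-Seq coverage depends on expression level, read depth varies widely between genes, which is one reason a real analysis needs a proper statistical model and not just a raw frequency difference.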


Novoalign is a short-read mapper designed to be fast and sensitive on small to large genome databases. Its primary design is based on the use of read quality information and the need to assemble genomes from resequencing experiments. Novoalign supports fragment, paired-end and mate-pair reads from major sequencing platforms such as Illumina, SOLiD and Roche 454. NovoalignCS is the version of Novoalign developed for SOLiD colourspace reads.
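For readers unfamiliar with what "read quality information" looks like, the short Python sketch below shows the standard Phred conversion from a FASTQ quality string to per-base error probabilities. It assumes Phred+33 encoding and says nothing about how Novoalign uses these values internally, which the post does not describe; the function names are illustrative.

```python
def decode_phred33(quality_string):
    """Convert a Phred+33 FASTQ quality string to integer quality scores."""
    return [ord(ch) - 33 for ch in quality_string]

def error_probability(q):
    """Phred definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)

def expected_errors(quality_string):
    """Expected number of miscalled bases in a read."""
    return sum(error_probability(q) for q in decode_phred33(quality_string))

quals = "IIII5"  # four bases at Q40 followed by one at Q20
print(decode_phred33(quals))             # [40, 40, 40, 40, 20]
print(round(expected_errors(quals), 4))  # 0.0104
```

A quality-aware aligner can use such probabilities to penalize a mismatch at a high-confidence base more heavily than one at a low-confidence base.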

The full version of Novoalign is a commercial product but a version of Novoalign with some limits to functionality is available for free at: http://www.novocraft.com/main/downloadpage.php


The sequencing of personal genomes enabled analysis of variation in transcription factor (TF) binding, chromatin structure, and gene expression and indicated how they contribute to phenotypic variation. It is hypothesized that using the reference genome for mapping ChIP-seq or RNA-seq reads may introduce errors, especially at polymorphic genomic regions.

Researchers at the University of Illinois at Urbana-Champaign have developed a Personal Genome Editor (perEditor) that changes the reference human genome (NCBI36/hg18) into an individual genome, taking into account single nucleotide polymorphisms (SNPs), insertions and deletions, copy number variation, and chromosomal rearrangements. perEditor outputs the two alleles (maternal, paternal) of the individual genome, ready for mapping ChIP-seq and RNA-seq reads, which enables analyses of allele-specific binding, chromatin structure, and gene expression.
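The core idea of editing a reference into two parental alleles can be sketched in a few lines of Python. The example below handles SNPs only and uses a made-up input layout of (position, reference base, maternal base, paternal base); it is an illustration under those assumptions and does not reflect perEditor's file formats or its handling of indels, copy number variation or rearrangements.

```python
def personalize(reference, snps):
    """Return (maternal, paternal) sequences with phased SNPs applied.

    snps: iterable of (pos, ref_base, maternal_base, paternal_base),
    with 0-based positions. Raises ValueError if the stated reference
    base does not match the reference sequence."""
    maternal = list(reference)
    paternal = list(reference)
    for pos, ref_base, mat_base, pat_base in snps:
        if reference[pos] != ref_base:
            raise ValueError(f"reference mismatch at position {pos}")
        maternal[pos] = mat_base
        paternal[pos] = pat_base
    return "".join(maternal), "".join(paternal)

# Hypothetical 8-bp reference with two phased heterozygous SNPs.
reference = "ACGTACGT"
snps = [(1, "C", "T", "C"), (6, "G", "G", "A")]
maternal, paternal = personalize(reference, snps)
print(maternal)  # ATGTACGT
print(paternal)  # ACGTACAT
```

SNP substitution keeps coordinates unchanged, which is why it is the easy case; insertions and deletions shift every downstream position, so a full editor also has to track a coordinate map between the reference and each allele.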

AVAILABILITY: perEditor is available at http://biocomp.bioen.uiuc.edu/perEditor.

CONTACT: szhong@illinois.edu.

  • Rivas-Astroza M, Xie D, Cao X, Zhong S. (2011) Mapping personal functional data to personal genomes. Bioinformatics [Epub ahead of print]. [abstract]


SeqGene is open-source software for mining next-gen sequencing datasets, focusing on post-alignment quality control, SNP and indel identification and annotation, RNA expression quantification, allele-specific expression, and expression-genotyping association analysis. SeqGene is especially suited for RNA-seq and exonome-seq applications, with a focus on protein-coding and regulatory regions of a genome. For RNA-seq applications, SeqGene implements a novel topology-based pathway analysis method to identify SNP-Expression co-enrichment and SNP-Expression paths.


SAMMate, a Graphical User Interface (GUI) RNA-seq analysis pipeline, allows biomedical researchers to quickly process SAM/BAM files and is compatible with both single-end and paired-end sequencing technologies. SAMMate automates some of the more standard procedures in RNA-seq analysis.


SeqMap is a tool for identifying viral integration sites from LAM-PCR and LM-PCR analysis. The tool will extract vector sequence data then search existing genome databases for matches to the unique sequences generated by the LAM or LM-PCR reaction. SeqMap displays the vector insertion site graphically, showing the chromosome location and distance to surrounding genes. The tool also allows you to organize your data and make notations.
