Researchers at the University of Washington and FHCRC have developed a method that uses a simple graphical model to measure and correct protocol-specific sequence bias when quantifying sequence abundance in RNA-Seq experiments. The model does not rely on existing gene annotations, and model selection is performed automatically, so it can be applied with few assumptions. They evaluated the method on several data sets and by multiple criteria, and demonstrate that it effectively decreases bias and increases uniformity.
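To make the idea of sequence bias correction concrete, here is a minimal Python sketch. It is not the seqbias method: instead of the paper's graphical model it assumes every position around a read start is independent, and the function names, toy sequences and pseudocount are all made up for illustration. It compares nucleotide frequencies at observed read starts with frequencies at background positions, and down-weights reads whose start sequence is over-represented.

```python
from collections import Counter

BASES = "ACGT"  # assumption: windows contain only these four letters


def position_frequencies(seqs, pseudocount=1.0):
    """Per-position nucleotide frequencies for equal-length sequences."""
    width = len(seqs[0])
    total = len(seqs) + len(BASES) * pseudocount
    freqs = []
    for i in range(width):
        counts = Counter(s[i] for s in seqs)
        freqs.append({b: (counts.get(b, 0) + pseudocount) / total for b in BASES})
    return freqs


def bias_ratio(window, fg, bg):
    """P(window | read start) / P(window | background), positions independent."""
    ratio = 1.0
    for i, base in enumerate(window):
        ratio *= fg[i][base] / bg[i][base]
    return ratio


def corrected_weight(window, fg, bg):
    """Weight for one read: reads at favoured start sequences count for less."""
    return 1.0 / bias_ratio(window, fg, bg)


# Toy data (hypothetical): sequence windows at read starts vs. random positions.
foreground = ["ACG", "ACT", "ACG", "GCG"]
background = ["ACG", "TGA", "GTC", "CAT"]
fg = position_frequencies(foreground)
bg = position_frequencies(background)

over = bias_ratio("ACG", fg, bg)   # start sequence favoured by the protocol
under = bias_ratio("TGA", fg, bg)  # start sequence disfavoured by the protocol
```

Summing `corrected_weight` over the reads in a region, rather than counting each read as 1, gives a bias-adjusted abundance under these simplifying assumptions.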

Availability: The method is freely available under the LGPL license at: http://bioconductor.org/packages/release/bioc/html/seqbias.html

Contact: dcjones@cs.washington.edu

  • Jones DC, Ruzzo WL, Peng X, Katze MG. (2012) A new approach to bias correction in RNA-Seq. Bioinformatics [Epub ahead of print]. [abstract]


Time – Tuesday, January 31, 2012
04:30 PM

Place – Department of Statistics, Purdue University – in PHYS 223

Speaker – Yu (Michael) Zhu

RNA-Seq has emerged as a powerful technique for transcriptome study. Along with improved sensitivity and coverage, RNA-Seq also brings challenges for data analysis. The massive amount of sequence read data, excessive variability, uncertainty, and bias and noise stemming from multiple sources make the analysis of RNA-Seq data difficult. Despite much progress, RNA-Seq data analysis still has room for improvement, especially in the quantification of transcript/gene expression levels…(read more)

Reference – Ming Hu, Yu Zhu, Jeremy M.G. Taylor, Jun S. Liu, Zhaohui S. Qin. (2011) Using Poisson mixed-effects model to quantify transcript-level gene expression in RNA-Seq. Bioinformatics 28, 1, 63-68. [abstract]



On Tuesday, Roche announced a $5.7 billion hostile takeover bid for Illumina. Roche said it is making its proposal after “multiple efforts to engage with Illumina in order to reach a negotiated transaction” were rebuffed. Under the terms of its proposed all-cash deal, Roche would acquire all outstanding shares of the San Diego-based next-generation sequencing and microarray firm at $44.50 per share. The bid price is a 61 percent premium over the one-month historical average of Illumina’s share price and a 43 percent premium over the firm’s three-month historical average, both as of Dec. 21. (See the Roche Press Release…)

Today, in response to Roche’s bid for the company, Illumina has created a rights agreement to “deter coercive and otherwise unfair takeover tactics”. The agreement states that if any person or group, such as Roche, becomes the holder of 15 percent or more of Illumina’s stock, then all other shareholders would have the right to pay the then-current exercise price for shares with a market value of twice that exercise price. In other words, those shareholders would be able to buy shares of Illumina at half the market price. (See the Illumina Press Release… )

If completed, the acquisition would combine two of the leading names in next-gen sequencing.


Cow milk is a complex bioactive fluid consumed by humans beyond infancy. Even though the chemical and physical properties of cow milk are well characterized, very limited research has been done on characterizing the milk transcriptome. This study performs a comprehensive expression profiling of genes expressed in milk somatic cells of transition (day 15), peak (day 90) and late (day 250) lactation Holstein cows by RNA sequencing. Milk samples were collected from Holstein cows at 15, 90 and 250 days of lactation, and RNA was extracted from the pelleted milk cells. Gene expression analysis was conducted by Illumina RNA sequencing. Sequence reads were assembled and analyzed in CLC Genomics Workbench. Gene Ontology (GO) and pathway analysis were performed using the Blast2GO program and the GeneGo application of the MetaCore program.

The results revealed that 69% of NCBI Btau 4.0 annotated genes are expressed in bovine milk somatic cells. Most of the genes were ubiquitously expressed in all three stages of lactation. However, a fraction of the milk transcriptome has genes devoted to specific functions unique to the lactation stage. This indicates the ability of milk somatic cells to adapt to different molecular functions according to the biological need of the animal.

  • Wickramasinghe S, Rincon G, Islas-Trejo A, Medrano JF. (2012) Transcriptional profiling of bovine milk using RNA sequencing. BMC Genomics [Epub ahead of print]. [article]


Here, researchers from Iowa State University compare four recently proposed statistical methods, edgeR, DESeq, baySeq, and a method with a two-stage Poisson model (TSPM), through a variety of simulations based on different distribution models or real data. They compare the ability of these methods to detect differentially expressed (DE) genes in terms of the significance ranking of genes and false discovery rate control. All methods compared are implemented in freely available software.
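For readers unfamiliar with false discovery rate control, here is a minimal Python sketch of the Benjamini-Hochberg adjustment, a standard way of turning per-gene p-values into FDR-adjusted values. It is offered only as background: the post does not say which FDR procedure each of the four packages uses, and the p-values below are invented.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so that adjusted values
    # never increase as the raw p-value gets smaller.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for five genes.
pvals = [0.001, 0.008, 0.039, 0.041, 0.20]
qvals = benjamini_hochberg(pvals)
# Genes called differentially expressed at a 5 percent FDR threshold.
significant = [q <= 0.05 for q in qvals]
```

Ranking genes by `qvals` and counting how many of the calls are true in a simulation is one way to check both criteria mentioned above: significance ranking and FDR control.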

  • Kvam VM, Liu P, Si Y. (2012) A comparison of statistical methods for detecting differentially expressed genes from RNA-seq data. Am J Bot [Epub ahead of print]. [article]


MATS

MATS (multivariate analysis of transcript splicing) is a Bayesian statistical framework for flexible hypothesis testing of differential alternative splicing patterns on RNA-Seq data. MATS uses a multivariate uniform prior to model the between-sample correlation in exon splicing patterns, and a Markov chain Monte Carlo (MCMC) method coupled with a simulation-based adaptive sampling procedure to calculate the P-value and false discovery rate (FDR) of differential alternative splicing. Importantly, the MATS approach is applicable to almost any type of null hypothesis of interest, providing the flexibility to identify differential alternative splicing events that match a given user-defined pattern.
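The general flavour of a Bayesian test for differential splicing can be sketched in a few lines of Python. This is a heavily simplified illustration and not the MATS algorithm: it assumes each sample is summarised by counts of reads supporting exon inclusion and exon skipping, puts an independent uniform prior on each sample's inclusion level (so it ignores the between-sample correlation that MATS models), samples directly from the resulting Beta posteriors instead of running MCMC, and returns a posterior probability rather than a P-value. The function name, counts and cutoff are hypothetical.

```python
import random


def prob_splicing_difference(inc1, skip1, inc2, skip2,
                             cutoff=0.1, draws=20000, seed=0):
    """Posterior probability that two inclusion levels differ by more than cutoff.

    With a uniform prior and binomial read counts, the posterior of each
    inclusion level is Beta(inclusion + 1, skipping + 1).
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = 0
    for _ in range(draws):
        psi1 = rng.betavariate(inc1 + 1, skip1 + 1)
        psi2 = rng.betavariate(inc2 + 1, skip2 + 1)
        if abs(psi1 - psi2) > cutoff:
            hits += 1
    return hits / draws


# Hypothetical read counts: (inclusion, skipping) for sample 1 and sample 2.
clear_switch = prob_splicing_difference(90, 10, 20, 80)
no_change = prob_splicing_difference(50, 50, 50, 50)
```

Changing the condition inside the loop is how a different user-defined pattern would be expressed in this toy version, for example requiring `psi1 - psi2 > cutoff` to look only for exons with higher inclusion in sample 1.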

Availability: http://intron.healthcare.uiowa.edu/MATS/

  • Shen S, Won Park J, Huang J, Dittmar KA, Lu ZX, Zhou Q, Carstens RP, Xing Y. (2012) MATS: a Bayesian framework for flexible detection of differential alternative splicing from RNA-Seq data. Nucleic Acids Res [Epub ahead of print]. [article]


This is the High-Throughput Sequencing portal of the EPFL Bioinformatics and Biostatistics core facility. We provide below links to a set of simple pipelines that will read raw sequencing data and provide a set of first-pass analyses using standard techniques that we have designed and tested. The raw files can either be uploaded from a third-party experiment or provided as references to the local Genomics Core Facilities databases. Outputs are linked to GDV for browsing and finer analysis.

RNA-Seq Analysis – http://htsstation.vital-it.ch/rnaseq/

RNA-Seq Tutorial – http://bbcf.epfl.ch/bbcflib/tutorial_rnaseq.html


FX is an RNA-Seq analysis tool, which runs in parallel on cloud computing infrastructure, for the estimation of gene expression levels and genomic variant calling. To map short RNA-Seq reads, FX primarily uses a transcriptome-based reference generated from ∼160,000 mRNA sequences from the RefSeq, UCSC and Ensembl databases. This approach reduces the misalignment of reads originating from splicing junctions. Reads that do not align to known transcripts are then mapped to the human genome reference. FX allows analysis of RNA-Seq data on cloud computing infrastructures, supporting access through a user-friendly web interface.
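The transcriptome-first, genome-second mapping order described above can be illustrated with a toy Python sketch. It is not FX's implementation: it assumes exact substring matching instead of a real aligner, runs on one machine rather than Hadoop, and all sequences and names are invented. What it shows is why a read spanning a splice junction maps cleanly to a transcript but not to the genome.

```python
def map_reads(reads, transcripts, genome):
    """Map each read to a transcript first; fall back to the genome if that fails.

    Returns {read: (reference_type, reference_name, position)}, with None
    as the value for reads that match neither reference.
    """
    mapped = {}
    for read in reads:
        hit = None
        # Stage 1: try the known transcripts.
        for name, seq in transcripts.items():
            pos = seq.find(read)
            if pos != -1:
                hit = ("transcriptome", name, pos)
                break
        # Stage 2: only reads left unmapped go to the genome reference.
        if hit is None:
            pos = genome.find(read)
            if pos != -1:
                hit = ("genome", "genome", pos)
        mapped[read] = hit
    return mapped


# Toy gene (hypothetical): two exons separated by an intron.
exon1, intron, exon2 = "ATGGCGTACG", "GTAAGTTTTCAG", "CTTGGACCTA"
genome = exon1 + intron + exon2
transcripts = {"tx1": exon1 + exon2}  # spliced mRNA: the intron is removed

junction_read = "GTACGCTTGG"  # end of exon1 joined to start of exon2
intron_read = "AAGTTTTC"      # lies inside the intron, absent from the mRNA
result = map_reads([junction_read, intron_read, "CCCCCCCC"], transcripts, genome)
```

In this toy case the junction read is found only in `tx1`, because the intron interrupts it in the genome, while the intronic read is found only in the genome.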

Availability: FX is freely available on the web at (http://fx.gmi.ac.kr), and can be installed on local Hadoop clusters.

  • Hong D, Rhie A, Park SS, Lee J, Ju YS, Kim S, Yu SB, Bleazard T, Park HS, Rhee H, Chong H, Yang KS, Lee YS, Kim IH, Lee JS, Kim JI, Seo JS. (2012) FX: an RNA-Seq analysis tool on the cloud. Bioinformatics [Epub ahead of print]. [abstract]


Nucleic Acids Research has published its 19th annual Database Issue, which features descriptions of 92 new online databases covering various areas of molecular biology and 100 papers describing recent updates to databases previously described in NAR and other journals.

The NAR online RNA Database Collection is available at: http://www.oxfordjournals.org/nar/database/cat/2

The full content of the Database Issue is freely available online on the Nucleic Acids Research web site (http://nar.oxfordjournals.org/content/40/D1.toc)

  • Galperin MY, Fernández-Suárez XM. (2012) The 2012 Nucleic Acids Research Database Issue and the online Molecular Biology Database Collection. Nucleic Acids Res 40(Database issue):D1-8. [article]


Sticking with the theme of plant RNA analysis (can’t help it, we’re attending the Plant and Animal Genomes Conference this week and have plants on the brain), here is a great set of tools developed by the Zhao Bioinformatics Laboratory at the Noble Foundation. Some of the goals of this project are:

  • to develop graphical models to model and simulate plant transcriptional regulatory networks;
  • to develop novel computational algorithms to infer large-scale gene regulatory networks (GRNs) from high throughput functional genomics data in model plants;



Chinese bayberry (Myrica rubra Sieb. and Zucc.) is an important subtropical fruit crop and an ideal species for fruit quality research due to the rapid and substantial changes that occur during development and ripening, including changes in fruit color and taste.

RNA-Seq generated 1.92 G raw data, which was then de novo assembled into 41,239 UniGenes with a mean length of 531 bp. Approximately 80% of the UniGenes (32,805) were annotated against public protein databases, and coding sequences (CDS) of 31,665 UniGenes were determined. Over 3,600 UniGenes were differentially expressed during fruit ripening, with 826 up-regulated and 1,407 down-regulated. GO comparisons between the UniGenes of these two types and interactive pathways (Ipath) analysis found that energy-related metabolism was enhanced, and catalytic activity was increased. All genes involved in anthocyanin biosynthesis were up-regulated during the fruit ripening processes, concurrent with color change. Important changes in carbohydrate and acid metabolism in the ripening fruit are likely associated with expression of sucrose phosphate synthase (SPS) and glutamate decarboxylase (GAD).

This provides a reference for the study of complicated metabolism in non-model perennial species.

  • Feng C, Chen M, Xu C, Bai L, Yin X, Li X, Allan AC, Ferguson IB, Chen K. (2012) Transcriptomic analysis of Chinese bayberry (Myrica rubra) fruit development and ripening using RNA-Seq. BMC Genomics [Epub ahead of print]. [article]


This review summarizes a number of frequently used applications of transcriptome sequencing and their related analysis strategies, including short read mapping, exon-exon splice junction detection, gene or isoform expression quantification, differential expression analysis and transcriptome reconstruction.

Table 1 Tools for short read mapping
Table 2 A list of software for splice junction detection
Table 3 Software for gene or isoform expression quantification
Table 4 Available tools for differential expression analysis
Table 5 Transcriptome reconstruction tools

  • Chen G, Wang C, Shi T. (2011) Overview of available methods for diverse RNA-Seq data analyses. Sci China Life Sci 54(12), 1121-28. [article]

Previous reviews covering RNA-Seq data analysis strategies and tools:

June – Nature Methods
Sept – Nature Reviews Genetics
