Researchers at the University of Washington and the Fred Hutchinson Cancer Research Center (FHCRC) have developed a new method, based on a simple graphical model, to measure and correct for protocol-specific sequence bias when quantifying sequence abundance in RNA-Seq experiments. Their model does not rely on existing gene annotations, and model selection is performed automatically, making it applicable with few assumptions. They evaluated the method on several data sets and by multiple criteria, and demonstrate that it effectively decreases bias and increases uniformity.
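To make the idea of sequence bias concrete, the sketch below estimates a bias factor for the nucleotide context around a read start. It compares per-position base frequencies at observed read starts with those at background positions, and the inverse of that factor can then serve as a read weight. This is a deliberately simplified illustration that treats positions as independent. It is not the graphical model described above and not the seqbias API; the function names, the pseudocount and the toy sequences are all assumptions.

```python
from collections import Counter

def position_freqs(seqs, pseudocount=1.0):
    """Per-position nucleotide frequencies for equal-length sequences (hypothetical helper)."""
    length = len(seqs[0])
    total = len(seqs) + 4 * pseudocount
    freqs = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        freqs.append({b: (counts[b] + pseudocount) / total for b in "ACGT"})
    return freqs

def bias(context, fg, bg):
    """Ratio of foreground to background probability of a context,
    assuming positions are independent (a simplification)."""
    ratio = 1.0
    for i, base in enumerate(context):
        ratio *= fg[i][base] / bg[i][base]
    return ratio

# Toy data: 2-nt contexts at read starts (foreground) and at random positions (background).
foreground = ["AC", "AC", "AG", "AT"]
background = ["AC", "CA", "GT", "TG"]
fg = position_freqs(foreground)
bg = position_freqs(background)

# A read starting in an over-represented context is down-weighted by 1 / bias.
weight_ac = 1.0 / bias("AC", fg, bg)
weight_ca = 1.0 / bias("CA", fg, bg)
```

With these toy inputs, reads starting in the "AC" context receive a weight below 1 and reads starting in "CA" a weight above 1, which is the sense in which such a correction pushes coverage toward uniformity.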

Availability: The method is freely available under the LGPL license at: http://bioconductor.org/packages/release/bioc/html/seqbias.html

Contact: dcjones@cs.washington.edu

  • Jones DC, Ruzzo WL, Peng X, Katze MG. (2012) A new approach to bias correction in RNA-Seq. Bioinformatics [Epub ahead of print]. [abstract]


Time – Tuesday, January 31, 2012
04:30 PM

Place – Department of Statistics, Purdue University – in PHYS 223

Speaker – Yu (Michael) Zhu

RNA-Seq has emerged as a powerful technique for transcriptome study. Along with improved sensitivity and coverage, RNA-Seq also brings challenges for data analysis. The massive amount of sequence read data, excessive variability, uncertainty, and bias and noise stemming from multiple sources make the analysis of RNA-Seq data difficult. Despite much progress, RNA-Seq data analysis still has room for improvement, especially in the quantification of transcript/gene expression levels…

Reference – Hu M, Zhu Y, Taylor JMG, Liu JS, Qin ZS. (2011) Using Poisson mixed-effects model to quantify transcript-level gene expression in RNA-Seq. Bioinformatics 28(1), 63-68. [abstract]



On Tuesday, Roche announced a $5.7 billion hostile takeover bid for Illumina. Roche said it is making its proposal after “multiple efforts to engage with Illumina in order to reach a negotiated transaction” were rebuffed. Under the terms of its proposed all-cash deal, Roche would acquire all outstanding shares of the San Diego-based next-generation sequencing and microarray firm at $44.50 per share. The bid price is a 61 percent premium over the one-month historical average of Illumina’s share price and a 43 percent premium over the firm’s three-month historical average, both as of Dec. 21. (See the Roche Press Release…)

Today, in response to Roche’s bid for the company, Illumina created a rights agreement to “deter coercive and otherwise unfair takeover tactics”. Under the agreement, if any person or group, such as Roche, becomes the holder of 15 percent or more of Illumina’s stock, all other shareholders would have the right to purchase, at the then-current exercise price, additional shares with a market value of twice that exercise price. In other words, those shareholders would be able to buy Illumina shares at half the market price. (See the Illumina Press Release…)

If completed, the acquisition would combine two of the leading names in next-gen sequencing.

Cow milk is a complex bioactive fluid consumed by humans beyond infancy. Even though the chemical and physical properties of cow milk are well characterized, very limited research has been done on characterizing the milk transcriptome. This study performs a comprehensive expression profiling of genes expressed in milk somatic cells of transition (day 15), peak (day 90) and late (day 250) lactation Holstein cows by RNA sequencing. Milk samples were collected from Holstein cows at 15, 90 and 250 days of lactation, and RNA was extracted from the pelleted milk cells. Gene expression analysis was conducted by Illumina RNA sequencing. Sequence reads were assembled and analyzed in CLC Genomics Workbench. Gene Ontology (GO) and pathway analyses were performed using the Blast2GO program and the GeneGo application of the MetaCore program.

The results revealed that 69% of NCBI Btau 4.0 annotated genes are expressed in bovine milk somatic cells. Most of the genes were ubiquitously expressed in all three stages of lactation. However, a fraction of the milk transcriptome has genes devoted to specific functions unique to the lactation stage. This indicates the ability of milk somatic cells to adapt to different molecular functions according to the biological need of the animal.

  • Wickramasinghe S, Rincon G, Islas-Trejo A, Medrano JF. (2012) Transcriptional profiling of bovine milk using RNA sequencing. BMC Genomics [Epub ahead of print]. [article]


Here, researchers from Iowa State University compare four recently proposed statistical methods, edgeR, DESeq, baySeq, and a method with a two-stage Poisson model (TSPM), through a variety of simulations that were based on different distribution models or real data. They compared the ability of these methods to detect DE genes in terms of the significance ranking of genes and false discovery rate control. All methods compared are implemented in freely available software.
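In a simulation the truly DE genes are known, so both criteria mentioned above can be computed directly. The sketch below is one hedged way to do that; it is not code from the paper or from any of the four packages. The helper names are hypothetical, and the Benjamini-Hochberg step-up rule is used only as a stand-in for whatever multiple-testing adjustment each method applies.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Indices of hypotheses rejected by the BH step-up procedure (assumed adjustment)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            k = rank  # largest rank whose p-value passes its threshold
    return order[:k]

def empirical_fdr(called, truly_de):
    """Fraction of called genes that are not truly DE (0.0 if nothing is called)."""
    called = set(called)
    if not called:
        return 0.0
    return len(called - set(truly_de)) / len(called)

def true_positives_in_top_k(pvalues, truly_de, k):
    """Ranking quality: how many of the k most significant genes are truly DE."""
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    return sum(1 for i in order[:k] if i in truly_de)

# Toy simulation output: p-values for 10 genes, of which genes 0, 1 and 3 are truly DE.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
truth = {0, 1, 3}

called = benjamini_hochberg(pvals, alpha=0.05)
fdr = empirical_fdr(called, truth)
top3_hits = true_positives_in_top_k(pvals, truth, k=3)
```

Averaging the empirical FDR over many simulated data sets and comparing it with the nominal level shows whether a method controls FDR, while the top-k count compares how well the methods rank genes.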

  • Kvam VM, Liu P, Si Y. (2012) A comparison of statistical methods for detecting differentially expressed genes from RNA-seq data. Am J Bot [Epub ahead of print]. [article]



MATS (multivariate analysis of transcript splicing) is a Bayesian statistical framework for flexible hypothesis testing of differential alternative splicing patterns on RNA-Seq data. MATS uses a multivariate uniform prior to model the between-sample correlation in exon splicing patterns, and a Markov chain Monte Carlo (MCMC) method coupled with a simulation-based adaptive sampling procedure to calculate the P-value and false discovery rate (FDR) of differential alternative splicing. Importantly, the MATS approach is applicable to almost any type of null hypothesis of interest, providing the flexibility to identify differential alternative splicing events that match a given user-defined pattern.

Availability: http://intron.healthcare.uiowa.edu/MATS/

  • Shen S, Won Park J, Huang J, Dittmar KA, Lu ZX, Zhou Q, Carstens RP, Xing Y. (2012) MATS: a Bayesian framework for flexible detection of differential alternative splicing from RNA-Seq data. Nucleic Acids Res [Epub ahead of print]. [article]


This is the High-Throughput Sequencing portal of the EPFL Bioinformatics and Biostatistics core facility. We provide below links to a set of simple pipelines that will read raw sequencing data and provide a set of first-pass analyses using standard techniques that we have designed and tested. The raw files can either be uploaded from a third-party experiment or provided as references to the local Genomics Core Facilities databases. Outputs are linked to GDV for browsing and finer analysis.

RNA-Seq Analysis – http://htsstation.vital-it.ch/rnaseq/

RNA-Seq Tutorial – http://bbcf.epfl.ch/bbcflib/tutorial_rnaseq.html


FX is an RNA-Seq analysis tool, which runs in parallel on cloud computing infrastructure, for the estimation of gene expression levels and genomic variant calling. In the mapping of short RNA-Seq reads, FX uses a transcriptome-based reference primarily, generated from ∼160,000 mRNA sequences from RefSeq, UCSC and Ensembl databases. This approach reduces the misalignment of reads originating from splicing junctions. Unmapped reads not aligned on known transcripts are then mapped on the human genome reference. FX allows analysis of RNA-Seq data on cloud computing infrastructures, supporting access through a user-friendly web interface.
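The transcriptome-first strategy described above can be illustrated with a toy example. The sketch below is not FX code: it replaces a real aligner with exact substring matching, runs on a single machine rather than Hadoop, and uses made-up names and sequences. Its only purpose is to show why a read spanning a splice junction maps to a transcript but not to the genome, and how leftover reads fall through to the genome stage.

```python
def map_reads(reads, references):
    """Toy exact-match mapper (stand-in for a real aligner).
    Returns ({read: (reference_name, offset)}, [unmapped reads])."""
    hits, unmapped = {}, []
    for read in reads:
        for name, seq in references.items():
            pos = seq.find(read)
            if pos != -1:
                hits[read] = (name, pos)
                break
        else:
            unmapped.append(read)
    return hits, unmapped

def two_stage_map(reads, transcriptome, genome):
    """Map to known transcripts first, then map the leftovers to the genome."""
    tx_hits, leftover = map_reads(reads, transcriptome)
    genome_hits, unmapped = map_reads(leftover, genome)
    return tx_hits, genome_hits, unmapped

# Hypothetical gene: exon1 + intron + exon2 on the genome, exon1 + exon2 in the transcript.
exon1, intron, exon2 = "AAACCCGGG", "TTTTTTTTT", "ACGTACGT"
genome = {"chr1": exon1 + intron + exon2}
transcriptome = {"tx1": exon1 + exon2}

reads = [
    "GGGACG",  # spans the exon1/exon2 junction: present only in the transcript
    "GGTTTT",  # runs into the intron: present only in the genome
    "CCCCCC",  # matches neither reference
]
tx_hits, genome_hits, unmapped = two_stage_map(reads, transcriptome, genome)
```

In this toy run the junction read is placed on tx1 in the first stage, the intronic read is recovered in the genome stage, and the third read stays unmapped.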

Availability: FX is freely available on the web at http://fx.gmi.ac.kr and can be installed on local Hadoop clusters.

  • Hong D, Rhie A, Park SS, Lee J, Ju YS, Kim S, Yu SB, Bleazard T, Park HS, Rhee H, Chong H, Yang KS, Lee YS, Kim IH, Lee JS, Kim JI, Seo JS. (2012) FX: an RNA-Seq analysis tool on the cloud. Bioinformatics [Epub ahead of print]. [abstract]


Nucleic Acids Research has published its 19th annual Database Issue, which features descriptions of 92 new online databases covering various areas of molecular biology and 100 papers describing recent updates to databases previously described in NAR and other journals.

The NAR online RNA Database Collection is available at: http://www.oxfordjournals.org/nar/database/cat/2

The full content of the Database Issue is freely available online on the Nucleic Acids Research web site (http://nar.oxfordjournals.org/content/40/D1.toc)

  • Galperin MY, Fernández-Suárez XM. (2012) The 2012 Nucleic Acids Research Database Issue and the online Molecular Biology Database Collection. Nucleic Acids Res 40(Database issue):D1-8. [article]


Sticking with the theme of plant RNA analysis (can’t help it, we’re attending the Plant and Animal Genomes Conference this week and have plants on the brain), here is a great set of tools developed by the Zhao Bioinformatics Laboratory at the Noble Foundation. Some of the goals of this project are:

  • to develop graphical models to model and simulate plant transcriptional regulatory networks;
  • to develop novel computational algorithms to infer large-scale gene regulatory networks (GRNs) from high throughput functional genomics data in model plants;



Chinese bayberry (Myrica rubra Sieb. and Zucc.) is an important subtropical fruit crop and an ideal species for fruit quality research due to the rapid and substantial changes that occur during development and ripening, including changes in fruit color and taste.

RNA-Seq generated 1.92 G raw data, which was then de novo assembled into 41,239 UniGenes with a mean length of 531 bp. Approximately 80% of the UniGenes (32,805) were annotated against public protein databases, and coding sequences (CDS) of 31,665 UniGenes were determined. Over 3,600 UniGenes were differentially expressed during fruit ripening, with 826 up-regulated and 1,407 down-regulated. GO comparisons between the UniGenes of these two types and interactive pathways (Ipath) analysis found that energy-related metabolism was enhanced, and catalytic activity was increased. All genes involved in anthocyanin biosynthesis were up-regulated during the fruit ripening processes, concurrent with color change. Important changes in carbohydrate and acid metabolism in the ripening fruit are likely associated with expression of sucrose phosphate synthase (SPS) and glutamate decarboxylase (GAD).

This provides a reference for the study of complicated metabolism in non-model perennial species.

  • Feng C, Chen M, Xu C, Bai L, Yin X, Li X, Allan AC, Ferguson IB, Chen K. (2012) Transcriptomic analysis of Chinese bayberry (Myrica rubra) fruit development and ripening using RNA-Seq. BMC Genomics [Epub ahead of print]. [article]


This review summarizes a number of frequently used applications of transcriptome sequencing and their related analysis strategies, including short read mapping, exon-exon splice junction detection, gene or isoform expression quantification, differential expression analysis and transcriptome reconstruction.

Table 1 Tools for short read mapping
Table 2 A list of software for splice junction detection
Table 3 Software for gene or isoform expression quantification
Table 4 Available tools for differential expression analysis
Table 5 Transcriptome reconstruction tools

  • Chen G, Wang C, Shi T. (2011) Overview of available methods for diverse RNA-Seq data analyses. Sci China Life Sci 54(12), 1121-28. [article]

Previous reviews covering RNA-Seq data analysis strategies and tools:

June – Nature Methods
Sept – Nature Reviews Genetics

