Jun
30
SeqGene – general purpose software for mining post-alignment RNA-Seq datasets
Filed Under Expression and Quantification, Splicing and Junction Mapping, Unspliced Mapping Tools | Leave a Comment
SeqGene is open-source software for mining next-gen sequencing datasets, focusing on post-alignment quality control, SNP and indel identification and annotation, RNA expression quantification, allele-specific expression, and expression-genotyping association analysis. SeqGene is especially suited for RNA-seq and exonome-seq applications, with a focus on protein-coding and regulatory regions of a genome. For RNA-seq applications, SeqGene implements a novel topology-based pathway analysis method to identify SNP-Expression co-enrichment and SNP-Expression paths. Read more
Jun
30
RNA-Seq for Gene Finding
Filed Under Data Analysis, Presentations | Leave a Comment
Jun
29
RNA-Seq News – from GenomeWeb
Filed Under News | Leave a Comment
Variability Across RNA-seq Experiments Suggests Need for Careful Study Design
Researchers found that even at high coverage, the estimate of the relative abundance of a particular transcript can “substantially disagree” between sequencing experiments using the same platforms and protocols.
Study Finds Array and Sequencing Combo Yields Novel Info on Gene Expression
A recent study suggests that using arrays and sequencing together could help generate more reliable data sets than when either method is used alone, and should help improve the confidence of functional analysis.
Jun
27
Transcriptome Sequencing and De Novo Analysis for Yesso Scallop (Patinopecten yessoensis)
Filed Under Transcriptome Sequenced | Leave a Comment
Bivalves comprise 30,000 extant species, constituting the second largest group of mollusks. However, limited genetic research has focused on this group of animals so far, which is, in part, due to the lack of genomic resources. The advent of high-throughput sequencing technologies enables generation of genomic resources in a short time and at a minimal cost, and therefore provides a turning point for bivalve research. In the present study, we performed de novo transcriptome sequencing to first produce a comprehensive expressed sequence tag (EST) dataset for the Yesso scallop (Patinopecten yessoensis). Read more
Jun
27
FlyBase adds RNA-Seq Data Sets
Filed Under Databases, Web Tools | Leave a Comment
FlyBase has just incorporated several new RNA-Seq data sets from the modENCODE project. These data sets differ from our current RNA-Seq data in that the expression is displayed by strand. One of these data sets includes temporal expression data from the embryonic stages. The other data sets include expression data from a selection of tissues and timepoints, and under a variety of treatments. RNA-Seq expression data, by strand, from cell lines (e.g. Kc, S2) is also now available.
The Treatment Data represents the transcriptome of 4-day old mated adult flies and/or feeding third instar larvae that were fed or exposed to various toxins or environmental stress factors encountered in nature. The concentrations and exposure times used in this study were taken from previously published experiments or were based on experimentally determined LD50 results when there were no preexisting data available. These data can be viewed on GBrowse by selecting the Data Source menu option “D. melanogaster RNA-Seq Data” and selecting the appropriate tracks.
(read more… )
Jun
24
miRDeep – Discovering known and novel miRNAs from deep sequencing data
Filed Under Other Tools | 1 Comment
The capacity of highly parallel sequencing technologies to detect small RNAs at unprecedented depth suggests their value in systematically identifying microRNAs (miRNAs). However, the identification of miRNAs from the large pool of sequenced transcripts from a single deep sequencing run remains a major challenge.
Here, the authors present an algorithm, miRDeep, which uses a probabilistic model of miRNA biogenesis to score compatibility of the position and frequency of sequenced RNA with the secondary structure of the miRNA precursor.
The miRDeep package was developed to discover active known or novel miRNAs from deep sequencing data (Solexa/Illumina, 454, …). The package consists of everything you need to analyze your own deep sequencing data after removal of ligation adapters: a number of scripts to preprocess the mapped data, and the core miRDeep algorithm that will analyze and score these data.
They demonstrate its accuracy and robustness using published Caenorhabditis elegans data and data they generated by deep sequencing human and dog RNAs. miRDeep reports altogether approximately 230 previously unannotated miRNAs, of which four novel C. elegans miRNAs are validated by northern blot analysis.
miRDeep is freely available at: http://www.mdc-berlin.de/en/research/research_teams/systems_biology_of_gene_regulatory_elements/projects/miRDeep/index.html
Friedländer MR, Chen W, Adamidi C, Maaskola J, Einspanier R, Knespel S, Rajewsky N. (2008) Discovering microRNAs from deep sequencing data using miRDeep. Nat Biotechnol 26(4), 407-15. [abstract]
Jun
24
Genome Annotation with RNA-Seq
Filed Under Publications, Transcriptome Assembly Tools | Leave a Comment
While RNA-Seq’s high resolution and accuracy in transcript abundance estimation have been thoroughly demonstrated (so much so that it is being heralded as a possible replacement for microarray-based gene expression technology), there is another important application for RNA-Seq: the improvement of existing genome annotations and even the possibility of complete de novo genome annotation.
Improving current genome annotation is a topic that has been discussed before on the RNA-Seq Blog. See these posts from earlier this year:
Jan 13 – RNA-Seq Datasets Improving Genome Annotation in Plants, Animals, Bacteria
Jan 7 – Improvements to Ensembl include a de novo RNA-seq gene annotation pipeline
Now, researchers at UC Berkeley and the Broad Institute have developed a novel approach termed “reference annotation based transcript (RABT) assembly”. They claim that it is a “pure” assembler and that it does not utilize information about the structure and content of coding genes, or other external input (e.g. ESTs) during the assembly.
However, a problem exists with using RNA-Seq for annotation. Genes that are expressed at a low level will be represented by few reads and may be only partially covered. This means that naive assembly methods will fail to reconstruct the majority of full-length transcripts.
(Read how their method overcomes this problem… )
Availability: The methods described in this paper are implemented in the Cufflinks suite of software for RNA-Seq, freely available from http://bio.math.berkeley.edu/cufflinks.
- Roberts A, Pimentel H, Trapnell C, Pachter L. (2011) Identification of novel transcripts in annotated genomes using RNA-Seq. Bioinformatics [Epub ahead of print]. [abstract]
Jun
23
The FDA has begun to develop their program to evaluate sequencing based diagnostics. At a recent meeting, the Association for Molecular Pathology (AMP) advised FDA officials on many important considerations for evaluating the analytical validity of next-generation sequencing:
The analytical validation requirements for NGS will vary based on the clinical application at issue, such as a mutation panel for a Mendelian disease versus transcriptome analysis.
Performance of, and coverage needs for, a given platform are likely to differ depending on:
- the nucleic acid analyzed
- the characteristics of the DNA regions and the type of variations interrogated
- the relative allele proportions of particular variants
- whether quantitative or qualitative results are desired
Flexibility and individualization are necessary in the development of validation protocols, guidelines, and controls on an application-by-application basis.
The test system, the analytical validity of the instrument and the performance of the bioinformatics software should be evaluated both independently and as a complete system.
Jun
23
Advanced RNA-Seq Course
Filed Under Events | Leave a Comment
Date: Aug 25-26, 2011
Location: Amsterdam Medical Centre, Amsterdam
Organizer: NBIC & LUMC
Contact(s): Dr. Celia van Gelder
Level: PhD
NBIC and LUMC will organize a 2-day course on RNA-seq data analysis on August 25 and 26, 2011. The course will be hosted by Antoine van Kampen at the AMC, Amsterdam. The course will consist of seminars and hands-on R practicals and will focus on data preprocessing, quality control, and statistical methods for detection of differential gene expression. It will be an expert course and a follow-up to the general NBIC NGS data analysis course (which will be given from 5-7 September 2011 in Leiden). Participants in the RNA-seq course should preferably have participated in the general NGS course or otherwise have ample experience with NGS technology. The course is aimed at PhD students and postdocs, but scientific programmers with some background in biology and bioinformatics may also attend.
Course topics:
- RNA-seq experimental approaches
- Alignment
- Statistics for differential gene expression
- eQTL analysis
- R packages for RNA-seq data analysis
Confirmed speakers:
Rutger Brouwer, Lude Franke, Jelle Goeman, Philip de Groot, Peter-Bram ‘t Hoen (course coordinator), Antoine van Kampen, Nagesha Rao, Marieke Simonis, Marcel Willemsen, Kai Ye, Erik van Zwet
(more info… )
Jun
21
If you’re new to RNA-Seq or computational biology in general, here is a short overview presentation.
from Wei Sun – Assistant professor, University of North Carolina-Chapel Hill Department of Biostatistics – Bios 784: Introduction to Computational Biology – class notes
http://www.bios.unc.edu/~wsun/teach/RNA-seq_pipeline.pdf
Jun
21
Quality of RNA-Seq Data
Filed Under Data Analysis, Presentations | Leave a Comment
Is RNA-seq data really “digital”? Is it more sensitive or reliable than microarrays?
Before you go, assess technological bias, limitations and cost-performance with publicly available data, not with the vendor’s champion data.
Data Source: GSE29155
RNA-Seq analysis of prostate cancer cell lines using Next Generation Sequencing
http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE29155
Jun
20
Several viruses are known to cause cancer, such as human herpes virus 8 in Kaposi sarcoma and human papilloma viruses in cervical cancer. Recently, Merkel cell polyoma virus (MCPyV) has been described in 80% of Merkel cell carcinomas (MCC). Similarly to MCC and Kaposi sarcoma, melanoma incidence is increased in immunosuppressed patients.
Melanoma is an aggressive type of cancer; known risk factors for developing melanoma are UV exposure, age and skin type. Intriguingly, melanoma also occurs more frequently in immunosuppressed patients. Although electron microscopy has revealed virus-like particles in melanoma, no virus causing melanoma has been identified so far. Researchers at the University of Tübingen, Germany set out to determine whether infection by known or as yet unknown viruses may play a role in melanoma development as well.
To detect viral sequences expressed in melanoma cells, they analysed three melanoma metastases by whole-transcriptome sequencing and digital transcriptome subtraction. None of the samples investigated harboured viral sequences. In contrast, artificial viral sequences and MCPyV transcripts used as a positive control for the bioinformatics analysis were detected. This renders it less likely that viruses are frequently involved in melanoma induction. Sequencing a larger number of melanoma transcriptomes is required to rule out viruses as a relevant pathogen.
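The idea behind digital transcriptome subtraction can be illustrated with a toy sketch: discard reads that look like host sequence, then compare what remains against viral references. The code below is not the authors' pipeline. It swaps real read alignment for exact k-mer matching, and the function names, the k-mer length and the 0.5 thresholds are all assumptions chosen for illustration.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def subtract_host(reads, host_sequences, k=20, max_host_fraction=0.5):
    """Keep reads whose share of host-matching k-mers is below the threshold.

    Exact k-mer lookup is a stand-in for aligning reads to the host
    transcriptome and genome (an assumption of this sketch).
    """
    host_index = set()
    for seq in host_sequences:
        host_index |= kmers(seq, k)
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:  # read shorter than k, cannot be judged
            continue
        host_fraction = len(read_kmers & host_index) / len(read_kmers)
        if host_fraction < max_host_fraction:
            kept.append(read)
    return kept


def viral_hits(reads, viral_refs, k=20, min_viral_fraction=0.5):
    """Map each viral reference name to the non-host reads that match it."""
    hits = {name: [] for name in viral_refs}
    for name, ref in viral_refs.items():
        ref_index = kmers(ref, k)
        for read in reads:
            read_kmers = kmers(read, k)
            if read_kmers and len(read_kmers & ref_index) / len(read_kmers) >= min_viral_fraction:
                hits[name].append(read)
    return hits
```

A spiked-in artificial viral sequence, as used for the positive control in the study, should survive `subtract_host` and show up in `viral_hits`; if it does not, the thresholds are too strict to support a negative result.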
Feldhahn M, Menzel M, Weide B, Bauer P, Meckbach D, Garbe C, Kohlbacher O, Bauer J. (2011) No evidence of viral genomes in whole-transcriptome sequencing of three melanoma metastases. Exp Dermatol [Epub ahead of print]. [abstract]
Jun
17
Designing RNA-Seq Experiments
Filed Under Publications | Leave a Comment
For RNA-seq experiments, besides the randomization in preparing the research subjects, there are many other steps to consider for randomization due to the complexity of the technologies. For example, we can randomize the sample order for various steps in the library construction and the order/location of the samples in the sequencer.
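As a minimal sketch of the randomization described above, the snippet below independently shuffles the library-preparation order and the lane placement of a set of samples. The function name, the eight lanes per flow cell and the fixed seed are assumptions made for illustration, not part of the cited paper.

```python
import random


def randomize_run_layout(samples, lanes_per_flowcell=8, seed=2011):
    """Assign each sample a random library-prep position and a random lane.

    Sample names are assumed to be unique. The seed is fixed only so the
    layout can be reproduced and recorded alongside the experiment.
    """
    rng = random.Random(seed)

    # Random order in which libraries are prepared.
    prep_order = list(samples)
    rng.shuffle(prep_order)
    prep_position = {sample: i + 1 for i, sample in enumerate(prep_order)}

    # Independent random order for placement on the sequencer.
    lane_order = list(samples)
    rng.shuffle(lane_order)

    layout = []
    for i, sample in enumerate(lane_order):
        flowcell, lane = divmod(i, lanes_per_flowcell)
        layout.append({
            "sample": sample,
            "prep_position": prep_position[sample],
            "flowcell": flowcell + 1,
            "lane": lane + 1,
        })
    return layout


if __name__ == "__main__":
    names = [f"treated_{i}" for i in range(1, 7)] + [f"control_{i}" for i in range(1, 7)]
    for row in randomize_run_layout(names):
        print(row)
```

Simple shuffling like this does not guarantee that treatments are balanced across flow cells; a blocked design that splits each treatment evenly over flow cells before shuffling within blocks may be preferable when batch effects are a concern.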
Replication
The most desirable replicates are biological replicates, which are true replicates and capture the variation among biological samples. Some studies include biological replicates, while many others only have technical replicates, that is, repeated measurements of the same biological sample. If the goal is to evaluate the technology, technical replicates alone are sufficient.
RNA-Seq Specific Effects
RNA-seq experiments can be affected by common variability coming from various technical effects like processing date, technician and reagent batch. However, there are some recognized technical effects specific to the RNA-seq procedures. Among these sources of variation, the library preparation effect is the largest. The flow cell and lane effects are relatively small.
Sequencing Depth
Due to the random sampling nature of RNA-seq, it takes a large number of sequences to measure transcripts that are expressed at a low level. For a given budget, it is critical to decide whether to increase the sequencing depth to obtain more accurate measurements of genes expressed at a low level, or to increase the sample size with limited sequencing depth for each sample. It would take extremely deep coverage to detect allelic differential expression for genes expressed at a fairly low level.
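To make the depth trade-off concrete, here is a rough sketch that treats the number of reads from one transcript as Poisson distributed, with mean equal to the total read count times the transcript's share of the library. That model, the function names and the default thresholds are assumptions for illustration; real data are usually noisier than Poisson, so the numbers should be read as optimistic lower bounds.

```python
import math


def prob_at_least(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed from the lower tail."""
    term = math.exp(-lam)  # P(X = 0)
    lower_tail = 0.0
    for i in range(k):
        lower_tail += term
        term *= lam / (i + 1)  # P(X = i + 1) from P(X = i)
    return 1.0 - lower_tail


def reads_needed(fraction, min_reads=10, target=0.95):
    """Smallest total read count giving >= min_reads reads with probability >= target.

    fraction is the share of all reads expected to come from the transcript.
    """
    lo, hi = 0, 1
    # Grow the upper bound until the target probability is reached.
    while prob_at_least(min_reads, hi * fraction) < target:
        hi *= 2
    # Binary search between a failing (lo) and a passing (hi) depth.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if prob_at_least(min_reads, mid * fraction) >= target:
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    for share in (1e-4, 1e-5, 1e-6):
        print(share, reads_needed(share))
```

Because the required depth scales inversely with the transcript's share of the library, each tenfold drop in expression costs roughly ten times as many reads, which is why extra replicates often compete with extra depth for the same budget.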
Paired-end Sequencing
At the same sequencing depth, paired-end sequencing increases the sensitivity and specificity of detecting alternative splicing and chimeras compared with single-end sequencing.
Biases of Next-Generation Sequencing
In reality, sequence reads are not obtained exactly at random from transcripts. Biases have been found to be related to the GC content of the sequence, the use of random hexamer primers, 3′ and 5′ depletion or bias towards the 3′-end, and bias toward specific RNA species. Most of these biases are related to library preparation methods. From the experimental design point of view, these biases increase the required sample size and sequencing depth, which emphasizes the importance of choosing better protocols and selecting the right analysis methods.
Sample Size Calculation for RNA-Seq
The sample size may be determined at two levels: the number of lanes for technical replicates in one treatment, or the number of biological replicates for each treatment. When there are only technical replicates, and the library preparation effects and lane effects are negligible or mitigated by proper designs, sample sizes can be calculated gene-by-gene based on Poisson models. When there are biological replicates and over-dispersion exists, negative binomial (NB) distributions are more appropriate than Poisson distributions to model the RNA-seq data. First obtain the sample size for each gene, and then determine the overall sample size based on the overall average power.
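For a single gene, the difference between the Poisson and NB cases can be sketched with a widely used normal approximation on the log scale. This is not the calculation from the cited paper; it assumes a two-group comparison, a Wald-type test on the log fold change, and an NB variance of mean + dispersion × mean², with dispersion 0 recovering the Poisson case. The function and parameter names are illustrative.

```python
import math
from statistics import NormalDist


def replicates_per_group(mean_count, fold_change, dispersion=0.0,
                         alpha=0.05, power=0.8):
    """Approximate replicates per group needed to detect a fold change in one gene.

    Uses n = 2 * (z_{1-alpha/2} + z_{power})^2 * (1/mean + dispersion) / ln(fold)^2,
    a normal approximation (an assumption of this sketch, not an exact test).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = z.inv_cdf(power)           # quantile for the desired power
    # Approximate variance of the log mean count contributed by one replicate.
    var_log = 1.0 / mean_count + dispersion
    n = 2 * (z_alpha + z_beta) ** 2 * var_log / math.log(fold_change) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Same gene and effect size, without and with biological over-dispersion.
    print("Poisson:", replicates_per_group(100, 2.0))
    print("NB     :", replicates_per_group(100, 2.0, dispersion=0.1))
```

The dispersion term does not shrink as the mean count grows, so once a gene is reasonably well covered, additional depth buys little and additional biological replicates buy more. Extending this to an overall sample size would mean repeating the calculation across genes and choosing the n that reaches the desired average power.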
Validation
It is worth pointing out that validation using qRT-PCR on the same RNA samples assayed in the RNA-seq analysis only validates the technology. It does not validate the conclusion about the treatments/conditions. It is the validation using different biological replicates from the same populations that can further validate the biological conclusions from RNA-seq experiments.
Fang Z, Cui X. (2011) Design and validation issues in RNA-seq experiments. Brief Bioinform. 12(3), 280-87. [abstract]















