The winner of the Big Science Challenge, a contest convened last year by Cycle Computing to provide $10,000 in cloud computing resources for groundbreaking biomedical research, has successfully completed the first phase of its project while logging more than 115 compute years on the Amazon Cloud.

Victor Ruotti and colleagues from the Morgridge Institute for Research at the University of Wisconsin claimed top prize in the challenge. The intense computing for Ruotti’s experiment, a pairwise comparison of RNA-Seq signatures for 124 stem cell lines, was performed over a week using very high memory instances with 8 gigabytes (GB) of memory per core. About 1.6 million jobs were scheduled using Condor, although Stowe says other schedulers such as GridEngine could also be used. Spot availability varied over time, peaking at 8,000 concurrent cores, with an average of 5,000 cores running.

The result was 7-8 terabytes (TB) of BAM files.
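The figures quoted above can be sanity-checked with a little arithmetic. The sketch below assumes the comparison covers every unordered pair of cell lines and that a "compute year" means one core running for one year; neither assumption is stated in the article, and the function names are illustrative only.

```python
from math import comb

# Figures taken from the article.
CELL_LINES = 124        # stem cell lines compared pairwise
COMPUTE_YEARS = 115     # "more than 115 compute years"
AVG_CORES = 5_000       # average number of cores running

# Assumption: a compute year is one core busy for one Julian year.
DAYS_PER_YEAR = 365.25


def unordered_pairs(n: int) -> int:
    """Number of distinct pairs among n items (n choose 2)."""
    return comb(n, 2)


def implied_wall_clock_days(compute_years: float, avg_cores: int) -> float:
    """Wall-clock days needed to burn compute_years on avg_cores cores."""
    return compute_years * DAYS_PER_YEAR / avg_cores


pairs = unordered_pairs(CELL_LINES)
days = implied_wall_clock_days(COMPUTE_YEARS, AVG_CORES)

print(f"pairwise comparisons: {pairs}")
print(f"implied wall-clock time: {days:.1f} days")
```

Under these assumptions the implied run time comes out at a little over eight days, which is broadly in line with the article's "over a week".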

“The goal of the Big Science Challenge was to help people think bigger than they normally would, to do things that would be impossible on a local cluster,” said Cycle Computing CEO Jason Stowe.

(read more at Bio-IT World…)

FX is an RNA-Seq analysis tool that runs in parallel on cloud computing infrastructure to estimate gene expression levels and call genomic variants. When mapping short RNA-Seq reads, FX primarily uses a transcriptome-based reference generated from ∼160,000 mRNA sequences from the RefSeq, UCSC and Ensembl databases. This approach reduces the misalignment of reads originating from splicing junctions. Reads that do not align to known transcripts are then mapped to the human genome reference. FX is accessed through a user-friendly web interface.
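The transcriptome-first strategy can be illustrated with a toy sketch. This is not FX's implementation: it stands in exact substring search for a real short-read aligner, and the sequences, reference names (`tx1`, `chr_toy`) and function names are all made up for illustration. It only shows why a read spanning a splice junction is found in the spliced transcript but not in the unspliced genome.

```python
from typing import Dict, List, Optional, Tuple

Hit = Tuple[str, Optional[str], Optional[int]]  # (stage, reference name, offset)


def find_exact(read: str, references: Dict[str, str]) -> Optional[Tuple[str, int]]:
    """Return (reference name, 0-based offset) of the first exact match, else None."""
    for name, seq in references.items():
        pos = seq.find(read)
        if pos != -1:
            return name, pos
    return None


def two_stage_map(reads: List[str],
                  transcriptome: Dict[str, str],
                  genome: Dict[str, str]) -> Dict[str, Hit]:
    """Try the transcriptome first; fall back to the genome for unmapped reads."""
    results: Dict[str, Hit] = {}
    for read in reads:
        hit = find_exact(read, transcriptome)
        if hit is not None:
            results[read] = ("transcriptome", hit[0], hit[1])
            continue
        hit = find_exact(read, genome)
        if hit is not None:
            results[read] = ("genome", hit[0], hit[1])
        else:
            results[read] = ("unmapped", None, None)
    return results


# Toy data (hypothetical): two exons separated by an intron.
exon1, intron, exon2 = "ACGTACGTAC", "GTTTTTTTAG", "CCGGAATTCC"
genome = {"chr_toy": exon1 + intron + exon2}
transcriptome = {"tx1": exon1 + exon2}  # spliced mRNA, intron removed

junction_read = "GTACCCGG"   # spans the exon1/exon2 junction
intronic_read = "TTTTTTAG"   # lies inside the intron
foreign_read = "GGGGGGGG"    # matches nothing

mapped = two_stage_map([junction_read, intronic_read, foreign_read],
                       transcriptome, genome)
for read, hit in mapped.items():
    print(read, hit)
```

The junction-spanning read only matches the spliced transcript, while the intronic read is rescued by the genome pass, mirroring the two stages described above.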

Availability: FX is freely available on the web at (http://fx.gmi.ac.kr), and can be installed on local Hadoop clusters.

  • Hong D, Rhie A, Park SS, Lee J, Ju YS, Kim S, Yu SB, Bleazard T, Park HS, Rhee H, Chong H, Yang KS, Lee YS, Kim IH, Lee JS, Kim JI, Seo JS. (2012) FX: an RNA-Seq analysis tool on the cloud. Bioinformatics [Epub ahead of print]. [abstract]

RNA-Seq is becoming the tool of choice for gene expression studies, as it can facilitate the investigation of phenomena beyond the reach of traditional microarrays, such as novel transcripts and isoforms, alternative splice sites, and allele-specific expression. However, this increased power comes with orders of magnitude higher complexity in terms of bioinformatics, data storage, and processing.

Prognosys Biosciences announced Voila!™, a new cloud-based data analysis service for next-generation sequencing data. Voila! will be available initially for RNA sequencing projects that utilize data from Illumina HiSeq and GAIIx next-generation sequencing instruments.

(Read the press release… )

Golden Helix and Expression Analysis announced they will be developing a cloud-based analytic solution to increase adoption of RNA sequencing. Bioinformatic processes will be performed in a service-based cloud compute environment. This offering will address the obstacles of sequence data by providing cloud-based and integrated desktop analysis tools that are scalable, affordable, and simplified.

(Read the press release… )

Appistry, Inc. announced the release of a series of advanced RNA-Seq solutions for the rapid analysis of sequencing data generated by this emerging technology. The TopHat, TopHat-Fusion and MapSplice-based solutions leverage the Ayrris/BIO(TM) high-performance computing platform to foster Personalized Medicine approaches by enabling researchers to process and analyze large volumes of data in a fraction of the time currently required by conventional gene expression profiling technologies. The RNA-Seq solutions were developed by the Appistry Life Sciences Group, which was recently established to conceptualize and deliver technologies for Next Generation Sequencing.

(Read the press release… )

GenePattern is a powerful genomic analysis platform that provides access to more than 100 tools for gene expression analysis, proteomics, SNP analysis and common data processing tasks.

GenePattern offers a suite of tools to support a wide variety of RNA-seq analyses, including short-read mapping, identification of splice junctions, transcript and isoform detection, quantitation, and differential expression. The modules have been adapted from widely used tools. GenePattern also provides pipelines that allow you to perform a number of multi-step RNA-seq analyses automatically.

The University of California, Santa Cruz (UCSC) Genome Browser is an up-to-date source for genome sequence data from a variety of vertebrate and invertebrate species and major model organisms, integrated with a large collection of aligned annotations. The Browser is a graphical viewer optimized to support fast interactive performance and is an open-source, web-based tool suite built on top of a MySQL database for rapid visualization, examination, and querying of the data at many levels.

The Genome Browser Database, browsing tools, downloadable data files, and documentation can all be found on the UCSC Genome Bioinformatics website.

  • Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D. (2002) The human genome browser at UCSC. Genome Res 12(6), 996-1006. [abstract]

From GenomeWeb – By Matthew Dublin

Using a grant from Amazon Web Services and the National Institutes of Health, researchers at the Johns Hopkins Bloomberg School of Public Health have developed an RNA sequencing data analysis program for the cloud called Myrna. The new software calculates differential gene expression in large RNA-seq datasets by using Bowtie, an ultrafast, memory-efficient short read aligner, and R/Bioconductor for statistical calculations. These tools are combined in an automatic, parallel pipeline that runs in the cloud using Elastic MapReduce, or on a local Hadoop cluster.
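To give a feel for how read counting fits the MapReduce model, here is a minimal single-process sketch. It is not Myrna's code: the real pipeline aligns with Bowtie and does its statistics in R/Bioconductor, whereas this example starts from made-up alignment positions on a single toy chromosome and only performs the per-gene counting step. All names and data are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Tuple

Gene = Tuple[str, int, int]        # (gene name, start, end), half-open interval
Alignment = Tuple[str, int]        # (sample name, aligned position)
Key = Tuple[str, str]              # (sample name, gene name)


def map_alignment(aln: Alignment, genes: List[Gene]) -> Iterator[Tuple[Key, int]]:
    """Map step: emit ((sample, gene), 1) for each gene the read falls in."""
    sample, pos = aln
    for gene, start, end in genes:
        if start <= pos < end:
            yield (sample, gene), 1


def reduce_counts(pairs: Iterable[Tuple[Key, int]]) -> Dict[Key, int]:
    """Reduce step: sum the emitted values for each (sample, gene) key."""
    totals: Counter = Counter()
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def count_reads(partitions: List[List[Alignment]],
                genes: List[Gene]) -> Dict[Key, int]:
    """Run the map step over every partition, then reduce all emitted pairs."""
    emitted = (kv
               for part in partitions
               for aln in part
               for kv in map_alignment(aln, genes))
    return reduce_counts(emitted)


# Toy input (hypothetical): two genes, alignments split across two partitions
# the way a cluster would split them across workers.
genes = [("geneA", 0, 100), ("geneB", 200, 300)]
partitions = [
    [("s1", 10), ("s1", 250)],
    [("s2", 50), ("s1", 20), ("s2", 150)],   # position 150 hits no gene
]

counts = count_reads(partitions, genes)
for key in sorted(counts):
    print(key, counts[key])
```

Because each alignment is mapped independently and the reduce step is a simple sum, the partitions could be processed on separate machines, which is the property that makes this kind of counting a natural fit for Hadoop-style clusters.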
