Exosomes, endosome-derived membrane microvesicles, contain specific RNA transcripts that are thought to be involved in cell-cell communication. These RNA transcripts have great potential as disease biomarkers. To characterize exosomal RNA profiles systematically, a team led by researchers at the Medical College of Wisconsin performed RNA sequencing analysis using three human plasma samples and evaluated the efficacies of small RNA library preparation protocols from three manufacturers. In all, they evaluated 14 libraries (7 replicates).

From the 14 size-selected sequencing libraries, the researchers obtained a total of 101.8 million raw single-end reads, an average of about 7.27 million reads per library. Sequence analysis showed a diverse collection of exosomal RNA species, among which microRNAs (miRNAs) were the most abundant, making up 42.32% of all raw reads and 76.20% of all mappable reads. At the current read depth, 593 miRNAs were detectable. The five most common miRNAs (miR-99a-5p, miR-128, miR-124-3p, miR-22-3p, and miR-99b-5p) collectively accounted for 48.99% of all mappable miRNA sequences. MiRNA target gene enrichment analysis suggested that the highly abundant miRNAs may play an important role in biological functions such as protein phosphorylation, RNA splicing, chromosomal abnormality, and angiogenesis. From the unknown RNA sequences, they predicted 185 potential miRNA candidates. Furthermore, they detected significant fractions of other RNA species including ribosomal RNA (9.16% of all mappable counts), long non-coding RNA (3.36%), piwi-interacting RNA (1.31%), transfer RNA (1.24%), small nuclear RNA (0.18%), and small nucleolar RNA (0.01%); fragments of coding sequence (1.36%), 5’ untranslated region (0.21%), and 3’ untranslated region (0.54%) were also present. In addition to the RNA composition of the libraries, they found that the three tested commercial kits each generated a sufficient number of DNA fragments for sequencing, but each showed significant bias toward capturing specific RNAs.
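The read statistics quoted above can be cross-checked with a few lines of arithmetic. The sketch below recomputes the average reads per library and, as a rough inference only, the share of raw reads that was mappable. That second figure is not stated in the summary; it assumes the same miRNA reads underlie both quoted percentages.

```python
# Figures quoted in the summary above.
total_raw_reads = 101.8e6         # raw single-end reads across all libraries
n_libraries = 14
mirna_share_of_raw = 0.4232       # miRNA reads as a fraction of raw reads
mirna_share_of_mappable = 0.7620  # miRNA reads as a fraction of mappable reads

mean_reads_per_library = total_raw_reads / n_libraries

# Assumption: the same miRNA reads are counted in both percentages,
# so their ratio approximates the mappable fraction of raw reads.
implied_mappable_fraction = mirna_share_of_raw / mirna_share_of_mappable

print(f"{mean_reads_per_library / 1e6:.2f} million reads per library")
print(f"{implied_mappable_fraction:.1%} of raw reads implied mappable")
```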

While RNA sequencing is a well-established technology, the sequencing depth required to detect all expressed genes is not known. If we leave the entire biological overhead and meta-information behind, we are dealing with a classical sampling process. Such sampling processes are well known from population genetics and have been thoroughly investigated.

In this study, researchers from the University of Vienna and the Medical University of Vienna, Austria, have used the Pitman Sampling Formula to model the sampling process of RNA sequencing. By doing so they characterize the sampling by means of two parameters that capture the combined effect of different sequencing technologies, protocols and their associated biases. They distinguish between two levels of sampling: the number of reads per gene and the number of reads starting at each position of a specific gene. The latter approach allows one to evaluate the theoretical expectation of uniform coverage and the performance of sequencing protocols in that respect. Most importantly, given a pilot sequencing experiment, they provide an estimate for the size of the underlying sampling universe and, based on these findings, evaluate an estimator for the number of newly detected genes when sequencing an additional sample of arbitrary size.
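To make the sampling view concrete, here is a minimal simulation sketch of the sequential urn scheme whose partitions follow the Pitman Sampling Formula. It is not the authors' estimator. The parameter names `theta` and `sigma`, the function name, and the example values are assumptions chosen for illustration, and the sketch requires `theta > 0` and `0 <= sigma < 1`.

```python
import random

def simulate_pitman_sampling(n_reads, theta, sigma, seed=0):
    """Draw reads one at a time under a two-parameter (Pitman-Yor) urn scheme.

    Returns a list of read counts, one entry per distinct gene observed.
    Assumes theta > 0 and 0 <= sigma < 1.
    """
    rng = random.Random(seed)
    counts = []
    for n in range(n_reads):
        k = len(counts)
        # Probability that the next read hits a gene not seen so far.
        p_new = (theta + k * sigma) / (theta + n)
        if rng.random() < p_new:
            counts.append(1)
        else:
            # Otherwise pick an already-seen gene j with weight (n_j - sigma).
            weights = [c - sigma for c in counts]
            j = rng.choices(range(k), weights=weights)[0]
            counts[j] += 1
    return counts

# Illustrative (hypothetical) parameter values: deeper sampling keeps
# revealing new genes, but at a decreasing rate.
for depth in (100, 1000, 5000):
    counts = simulate_pitman_sampling(depth, theta=10.0, sigma=0.5)
    print(depth, "reads ->", len(counts), "distinct genes")
```

Under this scheme the number of distinct genes keeps growing with depth, which is why a pilot experiment alone cannot simply be extrapolated linearly.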

  • Tauber S, von Haeseler A. (2013) Exploring the sampling universe of RNA-seq. Stat Appl Genet Mol Biol [Epub ahead of print]. [abstract]

The methanogenic archaeon Methanopyrus kandleri grows near the upper temperature limit for life. Genome analyses revealed strategies to adapt to these harsh conditions and elucidated a unique transfer RNA (tRNA) C-to-U editing mechanism at base 8 for 30 different tRNA species.

Here, RNA-Seq deep sequencing methodology was combined with computational analyses to characterize the small RNome of this hyperthermophilic organism and to obtain insights into RNA metabolism at extreme temperatures. A total of 132 small RNAs were identified that guide RNA modifications, which are expected to stabilize structured RNA molecules. The C/D box guide RNAs were shown to exist as circular RNA molecules. In addition, clustered regularly interspaced short palindromic repeats (CRISPR) RNA processing and potential regulatory RNAs were identified. Finally, the identification of tRNA precursors before and after the unique C8-to-U8 editing activity enabled the determination of the order of tRNA processing events, with termini truncation preceding intron removal. This order of tRNA maturation follows the compartmentalized tRNA processing order found in eukaryotes and suggests its conservation during evolution.

  • Su AAH, Tripp V, Randau L. (2013) RNA-Seq analyses reveal the order of tRNA processing events and the maturation of C/D box and CRISPR RNAs in the hyperthermophile Methanopyrus kandleri. NAR [Epub ahead of print]. [article]

The power of deep sequencing technology to reliably detect single RNA reads leads to a paradoxical problem of high sensitivity. In hybridization- or PCR-based methods for RNA quantification, the concern is low sensitivity, i.e., the problem that the signal from truly expressed genes might not be distinguishable from noise. In contrast, the problem with RNA-seq is that it is not clear whether very low read counts come from genes expressed at low levels or merely from transcriptional noise. The frequency distribution of read counts does not show a clear separation into two classes of genes, which makes the decision whether a gene is to be considered expressed or not seemingly arbitrary.

Here, researchers from Yale University address this problem by suggesting a statistical model that considers the number of transcripts detected in an RNA-Seq study as a mixture of two distributions: an exponential distribution for transcripts from inactive genes and a negative binomial distribution for actively transcribed genes. They apply this model to a number of RNA-Seq data sets and find that the model fits the data very well. The calculated criterion for distinguishing between expressed and non-expressed genes is remarkably consistent among data sets, suggesting that genes with more than two transcripts per million transcripts (TPM) are highly likely to be actively transcribed. The regression model correctly identifies the class of genes that are not actively expressed and thus provides an operational criterion for classifying genes into expressed and non-expressed sets, facilitating the interpretation of RNA-Seq data.
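As an illustration of how such a cutoff might be applied in practice, the sketch below converts read counts to TPM and flags genes above 2 TPM. It applies only the reported threshold and does not fit the exponential/negative binomial mixture. The function names, gene names, counts and lengths are hypothetical toy values.

```python
def transcripts_per_million(read_counts, lengths_bp):
    """Convert per-gene read counts to TPM using gene lengths in base pairs."""
    # Length-normalized rate per gene, then rescale so all rates sum to 1e6.
    rates = {g: read_counts[g] / lengths_bp[g] for g in read_counts}
    total = sum(rates.values())
    return {g: 1e6 * r / total for g, r in rates.items()}

def call_expressed(tpm, threshold=2.0):
    """Flag genes whose TPM exceeds the threshold (2 TPM per the summary above)."""
    return {g: value > threshold for g, value in tpm.items()}

# Hypothetical toy data, not taken from the study.
read_counts = {"geneA": 600000, "geneB": 399999, "geneC": 1}
lengths_bp = {"geneA": 1000, "geneB": 1000, "geneC": 1000}

tpm = transcripts_per_million(read_counts, lengths_bp)
calls = call_expressed(tpm)
for gene in read_counts:
    print(gene, round(tpm[gene], 2), "expressed" if calls[gene] else "not expressed")
```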

  • Wagner GP, Kin K, Lynch VJ. (2013) A model based criterion for gene expression calls using RNA-seq data. Theory Biosci [Epub ahead of print]. [abstract]

The noninfectious HIV-1 transgenic (HIV-1Tg) rat was developed as a model of AIDS-related pathology and immune dysfunction by introducing into F344 rats a noninfectious HIV-1 gag-pol-deleted virus, which lacks a 3-kb SphI-MscI fragment containing the 3′ region of gag and the 5′ region of pol.

The primary goal of this study was to identify differentially expressed genes and enriched pathways affected by the gag-pol-deleted HIV-1 genome. Using RNA deep sequencing, a team led by researchers at the University of Virginia sequenced RNA transcripts in the prefrontal cortex, hippocampus, and striatum of HIV-1Tg and F344 rats. A total of 72 RNA samples were analyzed (i.e., 12 animals per group × 2 strains × 3 brain regions). Following deep sequencing of 50-bp paired-end RNA-Seq reads, they used the Bowtie/Tophat/Cufflinks suites to align these reads into transcripts based on the Rn4 rat reference genome and to measure the relative abundance of each transcript. Statistical analyses of each brain region in the two strains revealed that immune response- and neurotransmission-related pathways were altered in the HIV-1Tg rats, with differences between brain regions. Other neuronal survival-related pathways, including those encoding myelin proteins, growth factors, and translation regulators, were altered in the HIV-1Tg rats in a brain region-dependent manner. This study is the first deep-sequencing analysis of RNA transcripts associated with the HIV-1Tg rat. Considering the functions of the pathways and brain regions examined in this study, the findings of abnormal gene expression patterns in HIV-1Tg rats suggest mechanisms underlying the deficits in learning and memory and the vulnerability to drug addiction and other psychiatric disorders observed in HIV-positive patients.

  • Li MD, Cao J, Wang S, Wang J, Sarkar S, et al. (2013) Transcriptome Sequencing of Gene Expression in the Brain of the HIV-1 Transgenic Rat. PLoS ONE 8(3), e59582. [article]

MicroRNAs (miRNAs) are a class of non-coding RNAs of ∼22 nucleotides in length that constitute a novel class of gene regulators acting through imperfect base-pairing with the 3′UTR of protein-encoding messenger RNAs. Growing evidence indicates that miRNAs are implicated in several pathological processes in myocardial disease. In past years, several profiling attempts have used high-density oligonucleotide array-based approaches to identify the complete miRNA content (miRNOME) of the healthy and diseased mammalian heart. These efforts have demonstrated that the failing heart displays differential expression of several dozen miRNAs. While the total number of experimentally validated human miRNAs is roughly two thousand, the number of miRNAs expressed in the human myocardium remains elusive.

With the objective of performing an unbiased assay to identify the miRNOME of the human heart under both physiological and pathophysiological conditions, a team led by researchers at Maastricht University, The Netherlands, used deep sequencing and bioinformatics to annotate and quantify microRNA expression in the healthy and diseased human heart (heart failure secondary to hypertrophic or dilated cardiomyopathy). Their results indicate that the human heart expresses >800 miRNAs, the majority of which have not been annotated or described so far and some of which are unique to primate species. Furthermore, >250 miRNAs show differential and etiology-dependent expression in human dilated cardiomyopathy (DCM) or hypertrophic cardiomyopathy (HCM). The human cardiac miRNOME still contains a large number of miRNAs that remain virtually unexplored. The current study provides a starting point for a more comprehensive understanding of the role of miRNAs in regulating human heart disease.

  • Leptidis S, El Azzouzi H, Lok SI, de Weger R, Olieslagers S, Kisters N, Silva GJ, Heymans S, Cuppen E, Berezikov E, De Windt LJ, da Costa Martins P. (2013) A Deep Sequencing Approach to Uncover the miRNOME in the Human Heart. PLoS One 8(2), e57800. [article]

MicroRNAs (miRNAs) can group together along the human genome to form stable secondary structures made of several hairpins hosting miRNAs in their stems. The few known examples of such structures are all involved in cancer development. A large-scale computational analysis of human chromosomes, combining sequence analysis and deep sequencing data, revealed the presence of >400 structural clusters of miRNAs in the human genome. An a posteriori analysis validates the predictions as bona fide miRNAs. A functional analysis of the positions of structural clusters along the chromosomes co-localizes them with genes involved in several key cellular processes such as immune systems, sensory systems, signal transduction and development. Immune system diseases, infectious diseases and neurodegenerative diseases are characterized by genes that are especially well organized around structural clusters of miRNAs. Functional analysis of target genes strongly supports a regulatory role for most predicted miRNAs and, notably, a strong involvement of predicted miRNAs in the regulation of cancer pathways. This analysis provides new fundamental insights into the genomic organization of miRNAs in human chromosomes.

Availability: The program, called MIReStruC (standing for ‘miRNA Structural Cluster’), has been implemented in bash, C, Awk and Python. It is available at http://www.ihes.fr/~carbone/data9/.

  • Mathelier A, Carbone A. (2013) Large scale chromosomal mapping of human microRNA structural clusters. Nucleic Acids Res [Epub ahead of print]. [article]

NorahDesk reconstructs full-length putative ncRNA transcripts from short sequence reads by hybridizing contigs. It not only analyzes the distinct read distribution of true ncRNA classes in an unbiased way but also uses secondary structures as an independent source of confirmation to reliably predict ncRNA from deep sequencing data.

Using publicly available mouse sequence data from brain, skeletal muscle, testis and ovary, NorahDesk was evaluated with an emphasis on the performance for microRNAs (miRNAs) and piwi-interacting small RNA (piRNA). This method was also compared with Dario and mirDeep2 and found to produce longer transcripts with higher read coverage. This feature makes it the first method particularly suitable for the prediction of both known and novel piRNAs.

NorahDesk and the mouse small ncRNA annotation file in BED format used in this study are available at http://www.bioinformatics.org.au/NorahDesk.

  • Ragan C, Mowry BJ, Bauer DC. (2012) Hybridization-based reconstruction of small non-coding RNA transcripts from deep sequencing data. Nucleic Acids Res [Epub ahead of print]. [article]

miRDeep

The capacity of highly parallel sequencing technologies to detect small RNAs at unprecedented depth suggests their value in systematically identifying microRNAs (miRNAs). However, the identification of miRNAs from the large pool of sequenced transcripts from a single deep sequencing run remains a major challenge.

Here, the authors present an algorithm, miRDeep, which uses a probabilistic model of miRNA biogenesis to score compatibility of the position and frequency of sequenced RNA with the secondary structure of the miRNA precursor.

The miRDeep package was developed to discover active known or novel miRNAs from deep sequencing data (Solexa/Illumina, 454, …). The package consists of everything you need to analyze your own deep sequencing data after removal of ligation adapters: a number of scripts to preprocess the mapped data, and the core miRDeep algorithm that will analyze and score these data.

They demonstrate its accuracy and robustness using published Caenorhabditis elegans data and data they generated by deep sequencing human and dog RNAs. miRDeep reports altogether approximately 230 previously unannotated miRNAs, of which four novel C. elegans miRNAs are validated by northern blot analysis.

miRDeep is freely available at: http://www.mdc-berlin.de/en/research/research_teams/systems_biology_of_gene_regulatory_elements/projects/miRDeep/index.html

Friedländer MR, Chen W, Adamidi C, Maaskola J, Einspanier R, Knespel S, Rajewsky N. (2008) Discovering microRNAs from deep sequencing data using miRDeep. Nat Biotechnol 26(4), 407-15. [abstract]

Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro and to transform, and it has a large genome, so little genomic information is available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generating large expression datasets for functional genomic analysis, which is especially suitable for non-model species with unsequenced genomes. Using high-throughput Illumina RNA-seq, the transcriptome of C. sinensis was analyzed at an unprecedented depth.

Shi C et al. (2011) Deep sequencing of the Camellia sinensis transcriptome revealed candidate genes for major metabolic pathways of tea-specific compounds. BMC Genomics [Epub ahead of print]. [article]

Sea urchins in the genus Strongylocentrotus are important research models in many areas of bioscience including developmental and cell biology, reproductive biology and evolutionary biology. The purple urchin, specifically S. nudus, is one of the most economically important marine animals, as the gonads of S. nudus are widely eaten as a delicacy in China, Korea and Japan. In China, however, natural stocks of S. nudus have declined dramatically due to overfishing and damage to natural habitat. The economic and biomedical importance of the urchin has led to significant efforts to decode the urchin genome and its genetics. However, only 45 microRNAs are currently listed in miRBase version 16.0, lagging behind other deuterostome species.

Recently, a collaboration of researchers from a group of life science universities in China set out to understand the miRNA-based regulatory system of this urchin species as well as of basal deuterostome lineages. First, high-throughput sequencing analysis of miRNAs was performed on a small RNA library isolated from five tissues of S. nudus with the Illumina sequencing platform. Quality reads were filtered for miRNA prediction with the ACGT101-miR-v3.5 package. Reads that matched rRNA, tRNA, snRNA, snoRNA, repeat sequences, and other ncRNAs deposited in Rfam 8.0, as well as sequences containing polyA tails, were discarded. The retained 18-26 nt reads were mapped onto the Strongylocentrotus purpuratus genome and to all known deuterostome mature miRNA sequences in miRBase version 16.0. The bioinformatics analysis yielded 415 unique microRNAs, including 345 conserved among deuterostomes and 70 urchin-specific microRNAs, as well as 5 microRNA* sequences.
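The length and polyA filtering steps described above can be sketched as follows. This is not the ACGT101-miR package or its interface; the function name, the `polya_run` threshold of 8 and the example reads are assumptions for illustration, and the Rfam-based removal of rRNA, tRNA, snRNA, snoRNA and repeats is not implemented here.

```python
def keep_read(seq, min_len=18, max_len=26, polya_run=8):
    """Return True if a read passes the length window and has no polyA tail.

    polya_run is a hypothetical cutoff: the source does not say how long a
    run of A's must be to count as a polyA tail.
    """
    seq = seq.upper()
    if not (min_len <= len(seq) <= max_len):
        return False
    if seq.endswith("A" * polya_run):
        return False
    return True

# Hypothetical example reads.
reads = [
    "TGAGGTAGTAGGTTGTATAGTT",   # 22 nt, kept
    "ACGTACGT",                 # too short
    "ACGTACGTACGTAAAAAAAAAA",   # 22 nt but ends in a polyA run
    "ACGT" * 10,                # too long
]
retained = [r for r in reads if keep_read(r)]
print(len(retained), "of", len(reads), "reads retained")
```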

Next, a miRNA microarray assay was used to confirm the expression of miRNAs in female urchin gonad. A custom microarray consisting of 460 probes, corresponding to the 415 miRNAs identified in this research and 45 known urchin miRNAs, was designed and synthesized in situ using the µParaflo platform. One hundred miRNAs were confirmed to be expressed at different signal values, 68 of which were identified for the first time in this research; the others were known S. purpuratus miRNAs.

Wei Z, Liu X, Feng T, Chang Y. (2011) Novel and Conserved Micrornas in Dalian Purple Urchin (Strongylocentrotus Nudus) Identified by Next Generation Sequencing. Int J Biol Sci 7, 180-192. [article]

MicroRNAs (miRNAs) are key regulators of gene expression and contribute to a variety of biological processes including cell growth, differentiation, and development. Abnormal microRNA expression has been reported in various diseases including various cancers, cardiovascular disease, and neurological disorders. Therefore microRNAs are considered to be promising diagnostic and therapeutic candidates for the treatment of human disease.

The miRBase sequence database is the public repository for all known microRNAs. Newly discovered microRNAs are routinely added, and it has grown rapidly, with more than 10,000 entries to date. Despite this rapid growth, many miRNAs have not yet been validated, and many believe there are numerous microRNAs yet to be identified. The lack of a full complement of miRNAs has limited recognition of their important roles in development and disease.

Now researchers are using the latest in deep sequencing technology along with advanced bioinformatics packages to identify novel microRNAs in various tissue types and species.

  • Ryu S, Joshi N, McDonnell K, Woo J, Choi H, et al. (2011) Discovery of Novel Human Breast Cancer MicroRNAs from Deep Sequencing Data by Analysis of Pri-MicroRNA Secondary Structures. PLoS ONE 6(2), e16403. [article]
  • Xie SS, Li XY, Liu T, Cao JH, Zhong Q, Zhao SH. (2011) Discovery of Porcine microRNAs in Multiple Tissues by a Solexa Deep Sequencing Approach. PLoS One 6(1), e16235. [article]
  • Creighton CJ, Benham AL, Zhu H, Khan MF, Reid JG, Nagaraja AK, Fountain MD, Dziadek O, Han D, Ma L, Kim J, Hawkins SM, Anderson ML, Matzuk MM, Gunaratne PH. (2010) Discovery of novel microRNAs in female reproductive tract using next generation sequencing. PLoS One 5(3), e9637. [article]
  • Huang QX, Cheng XY, Mao ZC, Wang YS, Zhao LL, Yan X, Ferris VR, Xu RM, Xie BY. (2010) MicroRNA discovery and analysis of pinewood nematode (Bursaphelenchus xylophilus) by deep sequencing. PLoS One 5(10), e13271. [article]
  • Song C, Wang C, Zhang C, Korir NK, Yu H, Ma Z, Fang J. (2010) Deep sequencing discovery of novel and conserved microRNAs in trifoliate orange (Citrus trifoliata). BMC Genomics 11, 431. [article]
  • Zhao CZ, Xia H, Frazier TP, Yao YY, Bi YP, Li AQ, Li MJ, Li CS, Zhang BH, Wang XJ. (2010) Deep sequencing identifies novel and conserved microRNAs in peanuts (Arachis hypogaea L.). BMC Plant Biol 10, 3. [article]

miRBase is the primary online repository for all microRNA sequences and annotation. The current release (miRBase 16) contains over 15,000 microRNA gene loci in over 140 species, and over 17,000 distinct mature microRNA sequences.
