- Oct. 31 – Nov. 2, 2018 and Nov. 14–16, 2018
- Please note: This is one iteration spanning six days over two weeks.
High-throughput sequencing techniques can rapidly provide a global picture of the processes within cells, allowing for accelerated discoveries in biology. As sequencing technologies improve and the costs decline, learning how to analyze and handle these large datasets has become imperative for researchers in biology and medicine.
In the Introduction to RNA-seq Analysis Using High-Performance Computing Workshop, participants will learn the basics of Unix/Linux and gain experience using the HMS compute cluster (O2). Participants will also learn the workflow (tools and parameters) used to generate gene counts (expression data) from RNA sequencing data, as well as considerations for designing a robust RNA sequencing experiment. This workshop will cover how to efficiently manage and analyze data using the Unix/Linux command line interface and high-performance computing (HPC). Together, these methods form the foundation of high-throughput sequencing data analysis and are critical for researchers who want to perform computational tasks and work with high-throughput data efficiently. Ideal participants are researchers who want to build a foundation for analyzing sequencing data.
The Introduction to R: Basics, Plots, and RNA-seq Differential Expression Analysis Workshop will show participants how to use gene count data generated in the previous workshop to generate lists of differentially expressed genes and perform functional analysis on them to gain better biological insight. This workshop will introduce participants to the basics of R and RStudio and their application to differential gene expression analysis, as well as downstream functional analysis of RNA-seq count data. Together, R and RStudio allow participants to wrangle data, generate publication-quality plots, and use various packages to extend the functionality of R. This workshop is intended to provide both basic R programming knowledge and information on its application. Participants should be interested in using R to analyze data more efficiently, to visualize data (ggplot2), and to perform statistical analysis on RNA-seq count data to obtain differentially expressed gene lists.
- Learning how to efficiently manage and analyze sequencing data using the Unix command line interface in a high-performance computing (HPC) environment
- Understanding the basics of R (with RStudio) and using it to perform differential gene expression analysis on RNA-seq count data and visualize the results
- MD, PhD, or equivalent
- Preference will be given to applicants with:
- Interest in RNA sequencing analysis for bulk tissue samples
- Little or no experience with data analysis using the Unix command line interface and the R programming environment
- No prior programming experience or command line training is required.
- Six days