Dissecting the regulatory relationships between genes is a critical step towards building accurate predictive models of biological systems. One powerful approach is to systematically compare the correlations between gene pairs across two or more distinct conditions.
In this study, researchers from the Icahn School of Medicine at Mount Sinai developed an R package, DGCA (Differential Gene Correlation Analysis), which offers a suite of tools for computing and analyzing differential correlations between gene pairs across multiple conditions. To minimize parametric assumptions, DGCA computes empirical p-values via permutation testing. To characterize differential correlations at a systems level, DGCA performs higher-order analyses, such as measuring the average difference in correlation and multiscale clustering analysis of differential correlation networks. Through a simulation study, the researchers show that the straightforward z-score based method that DGCA employs significantly outperforms existing alternative methods for calculating differential correlation. Applying DGCA to the TCGA RNA-seq data in breast cancer not only identifies key changes in the regulatory relationships between TP53 and PTEN and their target genes in the presence of inactivating mutations, but also reveals an immune-related differential correlation module that is specific to triple negative breast cancer (TNBC).
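To make the idea of a z-score based differential correlation concrete, here is a minimal Python sketch for a single gene pair. It assumes the standard Fisher z-transformation for comparing two Pearson correlations; the summary above does not spell out the formula DGCA uses, so treat this as an illustration of the general technique rather than the package's implementation. The function names `diff_corr_z` and `two_sided_p` and the simulated data are hypothetical.

```python
import math
import numpy as np

def diff_corr_z(x_a, y_a, x_b, y_b):
    """z-score for the difference in Pearson correlation of one gene pair
    between condition A and condition B (Fisher z-transformation)."""
    r_a = np.corrcoef(x_a, y_a)[0, 1]
    r_b = np.corrcoef(x_b, y_b)[0, 1]
    # Clip so that arctanh stays finite if a correlation is exactly +/-1.
    z_a = np.arctanh(np.clip(r_a, -0.999999, 0.999999))
    z_b = np.arctanh(np.clip(r_b, -0.999999, 0.999999))
    # Standard error of the difference between two Fisher-transformed correlations.
    se = math.sqrt(1.0 / (len(x_a) - 3) + 1.0 / (len(x_b) - 3))
    return float((z_a - z_b) / se)

def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Simulated (not real) expression values: the two genes are positively
# correlated in condition A and negatively correlated in condition B.
rng = np.random.default_rng(0)
n = 50
x_a = rng.normal(size=n)
y_a = x_a + 0.5 * rng.normal(size=n)
x_b = rng.normal(size=n)
y_b = -x_b + 0.5 * rng.normal(size=n)

z = diff_corr_z(x_a, y_a, x_b, y_b)
p = two_sided_p(z)
print(f"z = {z:.2f}, parametric p = {p:.3g}")
```

The parametric p-value here relies on a normal approximation; the permutation approach described above avoids that assumption by building the reference distribution from the data.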
Workflow for the Differential Gene Correlation Analysis (DGCA) R package
Users input a gene expression matrix, a design matrix to specify the conditions, and a comparison vector to specify which conditions will be compared. DGCA then calculates the gene pair correlations within each condition, processes these correlation values, and compares them to build a difference in correlation matrix. If permutation testing is selected, DGCA repeats the same procedure on permuted gene expression matrices, and these permutation samples are used to estimate an empirical false discovery rate. After choosing a significance threshold (if any) for differential correlation between conditions to select gene pairs for downstream analysis, investigators can use DGCA’s capabilities for visualization, gene ontology (GO) enrichment, and/or network construction.
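The workflow above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not DGCA's code: a plain list of per-sample condition labels stands in for the design matrix, permutation is done by shuffling those labels, Fisher z-scores are used for the difference in correlation, and per-pair empirical p-values are computed instead of the empirical false discovery rate. The names `diff_corr_matrix` and `permutation_pvalues` are hypothetical, and the data are simulated.

```python
import numpy as np

def diff_corr_matrix(expr, labels, cond_a, cond_b):
    """Gene-by-gene matrix of z-scores for the difference in correlation.

    expr: genes x samples array; labels: one condition name per sample.
    """
    labels = np.asarray(labels)
    a = expr[:, labels == cond_a]
    b = expr[:, labels == cond_b]
    # Within-condition Pearson correlations, clipped so arctanh stays finite.
    z_a = np.arctanh(np.clip(np.corrcoef(a), -0.999999, 0.999999))
    z_b = np.arctanh(np.clip(np.corrcoef(b), -0.999999, 0.999999))
    se = np.sqrt(1.0 / (a.shape[1] - 3) + 1.0 / (b.shape[1] - 3))
    return (z_a - z_b) / se

def permutation_pvalues(expr, labels, cond_a, cond_b, n_perm=200, seed=0):
    """Empirical p-value per gene pair, from shuffling the condition labels."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    observed = np.abs(diff_corr_matrix(expr, labels, cond_a, cond_b))
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        shuffled = rng.permutation(labels)
        null = np.abs(diff_corr_matrix(expr, shuffled, cond_a, cond_b))
        exceed += null >= observed
    # Add-one correction keeps the estimated p-values strictly above zero.
    return (exceed + 1.0) / (n_perm + 1.0)

# Simulated data: genes 0 and 1 flip from positive correlation in condition A
# to negative correlation in condition B; gene 2 is independent noise.
rng = np.random.default_rng(1)
n = 40
g0 = rng.normal(size=2 * n)
sign = np.r_[np.ones(n), -np.ones(n)]
g1 = sign * g0 + 0.3 * rng.normal(size=2 * n)
g2 = rng.normal(size=2 * n)
expr = np.vstack([g0, g1, g2])
labels = ["A"] * n + ["B"] * n

pvals = permutation_pvalues(expr, labels, "A", "B", n_perm=200)
print(np.round(pvals, 3))
```

Gene pairs whose empirical p-value falls below a chosen threshold would then be passed on to downstream steps such as visualization, enrichment analysis, or network construction.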
Availability – The DGCA R package will be available for download from CRAN (the Comprehensive R Archive Network, https://cran.r-project.org/), a repository of open-source software. Source code and other files are available at https://github.com/andymckenzie/DGCA.