Canonical correlation analysis (CCA) for RNA-Seq co-expression networks

Digital transcriptome analysis by next-generation sequencing uncovers a substantial number of mRNA variants. Variation in gene expression underlies many biological processes and holds a key to unravelling the mechanisms of common diseases. However, current methods for constructing co-expression networks from overall gene expression were originally designed for microarray expression data, and they overlook a large amount of variation in gene expression.

To use information on exon-level, genomic position-level and allele-specific expression, researchers at Fudan University, China have developed novel component-based methods, single and bivariate canonical correlation analysis, for constructing co-expression networks from RNA-Seq data. To evaluate the performance of these methods for co-expression network inference, the researchers applied them to lung squamous cell cancer expression data from the TCGA database and to their own bipolar disorder and schizophrenia RNA-Seq study. The preliminary results demonstrate that co-expression networks constructed by canonical correlation analysis of RNA-Seq data provide rich genetic and molecular information that offers insight into biological processes and disease mechanisms. These new methods substantially outperform current statistical methods for co-expression network construction that use microarray expression data or RNA-Seq data based on overall gene expression levels.
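As a rough illustration of the general idea, and not the authors' program, the sketch below scores a pair of genes by the first canonical correlation between their exon-level expression matrices (samples in rows, exons in columns). The function names `canonical_correlations` and `coexpression_edges`, the threshold of 0.8, and the simulated data are all hypothetical and chosen only for this example. The use of the raw first canonical correlation as an edge weight is likewise an assumption.

```python
from itertools import combinations

import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between X (n x p) and Y (n x q), largest first.

    Assumes more samples than columns and full column rank in X and Y.
    """
    # Center each exon (column) across samples.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered matrices.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)


def coexpression_edges(genes, threshold):
    """Return (gene_a, gene_b, weight) for pairs whose first canonical
    correlation reaches the (hypothetical) threshold."""
    edges = []
    for a, b in combinations(sorted(genes), 2):
        r = canonical_correlations(genes[a], genes[b])[0]
        if r >= threshold:
            edges.append((a, b, float(r)))
    return edges


# Simulated exon-level expression: geneA and geneB share a latent factor,
# geneC is independent noise. These are illustrative values, not real data.
rng = np.random.default_rng(0)
n_samples = 60
latent = rng.normal(size=(n_samples, 1))
genes = {
    "geneA": latent @ rng.normal(size=(1, 4)) + 0.3 * rng.normal(size=(n_samples, 4)),
    "geneB": latent @ rng.normal(size=(1, 3)) + 0.3 * rng.normal(size=(n_samples, 3)),
    "geneC": rng.normal(size=(n_samples, 5)),
}

for a, b, r in coexpression_edges(genes, threshold=0.8):
    print(f"{a} -- {b}: first canonical correlation = {r:.3f}")
```

When each gene has a single expression value per sample, the first canonical correlation reduces to the absolute Pearson correlation, which is the overall-expression case the exon-level approach generalizes. The raw canonical correlation grows with the number of exons even for unrelated genes, so a fixed threshold like the one above is only a placeholder for a proper significance test.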

Availability: A program implementing the developed CCA for co-expression network construction can be downloaded from Bioconductor.

  • Hong S, Chen X, Jin L, Xiong M. (2013) Canonical correlation analysis for RNA-seq co-expression networks. Nucleic Acids Res [Epub ahead of print].