Construction and Optimization of Large Gene Co-expression Network Using RNA-Seq Data

With the emergence of massively parallel sequencing, genome-wide expression data production has reached an unprecedented level. This abundance of data has greatly facilitated maize research, but may not be amenable to traditional analysis techniques that were optimized for other data types. Using publicly available data, a Gene Co-expression Network (GCN) can be constructed and used for gene function prediction, candidate gene selection and improved understanding of regulatory pathways. Several GCN studies have been done in maize, mostly using microarray datasets.

To build an optimal GCN from plant RNA-Seq data, researchers at Florida State University evaluated how expression data normalization, network inference method and a ranked aggregation strategy affect network performance, using libraries from 1266 maize samples. Three normalization methods (VST, CPM, RPKM) and ten inference methods, including six correlation and four mutual information (MI) methods, were tested. The three normalization methods performed very similarly. For network inference, correlation methods performed better than MI methods at some genes. Increasing sample size also had a positive effect on GCN performance. Aggregating single networks improved performance compared with single networks.
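The workflow described above (normalize, infer a network per method, then aggregate by rank) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy count matrix, the log2 transform, the function names, and in particular the mean-rank rule used for aggregation, which is only one simple form of rank aggregation and is not claimed to be the procedure used in the study. PCC and SCC stand for Pearson and Spearman correlation coefficients.

```python
import numpy as np

def cpm(counts):
    """Counts per million: scale each sample (column) by its library size."""
    lib_sizes = counts.sum(axis=0, keepdims=True)
    return counts / lib_sizes * 1e6

def average_ranks(x):
    """Ranks starting at 1, with tied values sharing their average rank."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="stable")
    ranks = np.empty(x.size, dtype=float)
    ranks[order] = np.arange(1, x.size + 1)
    _, inverse = np.unique(x, return_inverse=True)
    inverse = inverse.ravel()
    return (np.bincount(inverse, weights=ranks) / np.bincount(inverse))[inverse]

def spearman_matrix(expr):
    """Gene-by-gene SCC: Pearson correlation of per-gene sample ranks."""
    ranked = np.apply_along_axis(average_ranks, 1, expr)
    return np.corrcoef(ranked)

def aggregate_edge_ranks(score_matrices):
    """Rank every gene pair within each network (rank 1 = strongest
    absolute score), then average the ranks across networks."""
    n = score_matrices[0].shape[0]
    rows, cols = np.triu_indices(n, k=1)
    per_network = [average_ranks(-np.abs(s[rows, cols])) for s in score_matrices]
    return rows, cols, np.mean(per_network, axis=0)

# Hypothetical toy data: 4 genes (rows) x 6 samples (columns).
counts = np.array([
    [10, 20, 30, 40, 50, 60],
    [20, 40, 60, 80, 100, 120],   # always twice gene 0
    [60, 10, 45, 20, 35, 5],
    [5, 50, 15, 45, 10, 55],
], dtype=float)

norm = cpm(counts)
expr = np.log2(norm + 1.0)        # log transform is an assumption of this sketch

pcc = np.corrcoef(expr)           # Pearson network
scc = spearman_matrix(expr)       # Spearman network

rows, cols, agg = aggregate_edge_ranks([pcc, scc])
best = int(np.argmin(agg))
top_edge = (int(rows[best]), int(cols[best]))
print("strongest aggregated edge:", top_edge)
```

In practice the aggregated ranks would be thresholded to keep only the top-ranked gene pairs as network edges; the cutoff is a tuning choice that this sketch leaves open.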

Cell wall pathway subnetworks

A, Intersections of PCC aggregation (PA), SCC aggregation (SA) and MRNET-single (MS) networks, queried by 16 cell wall pathway genes (red nodes). B, Network retrieved from the CORNET database, queried by the same 16 cell wall pathway genes (red nodes). C, Network retrieved from the STRING database, queried by the same 16 cell wall pathway genes (red nodes). In all panels, cyan nodes are genes with reported function in cell wall related pathways in plants, dark grey nodes are genes without prior knowledge of involvement in cell wall related pathways, and grey lines indicate network-predicted interactions.

Huang J, Vendramin Alegre S, Shi L, McGinnis K. (2017) Construction and Optimization of Large Gene Co-expression Network in Maize Using RNA-Seq Data. Plant Physiol [Epub ahead of print].
