Inferring gene networks from high-throughput data constitutes an important step in the discovery of relevant regulatory relationships in organism cells. Despite the large number of available Gene Regulatory Network inference methods, the problem remains challenging: the underdetermination in the space of possible solutions requires additional constraints that incorporate a priori information on gene interactions.
Weighting all possible pairwise gene relationships by a probability of edge presence, researchers from IFP Energies Nouvelles, France formulate the regulatory network inference as a discrete variational problem on graphs. They enforce biologically plausible coupling between groups and types of genes by minimizing an edge labeling functional coding for a priori structures. The optimization is carried out with Graph cuts, an approach popular in image processing and computer vision. The researchers compare the inferred regulatory networks to results achieved by the mutual-information-based Context Likelihood of Relatedness (CLR) method and by the state-of-the-art GENIE3, winner of the DREAM4 multifactorial challenge.
Schematic view of the proposed BRANE Cut method. The initial graph (a) is transformed into an intermediate graph (b) in which a max-flow computation is performed to return an optimal edge labeling x ∗ leading to the inferred graph (c). We choose to present the method in its full generality with unscaled weights (i.e. w i,j∈ [ 0,+∞[, and λ parameters also belong to [ 0,+∞[. Nodes v 2 and v 3 are TFs, λTF¯=1 and λ TF=3. Taking γ=4 implies that v 1, v 2, and v 3 satisfy the regulator coupling property. Vertices v 1 and v 4 are thus affected, leading to the presence of additional edges weighted by ρ 1,2,3=0 and ρ 4,2,3=3, when μ is set to 3. Computing a max-flow in the graph (b) leads to some edge saturation, represented in dashed lines. The values from the source (value 1) and the sink (value 0) are propagated through non saturated paths, thus leading to x 2,4=x 3,4=0
TheBRANE Cut approach infers more accurately the five DREAM4 in silico networks (with improvements from 6 % to 11 %). On a real Escherichia coli compendium, an improvement of 11.8 % compared to CLR and 3 % compared to GENIE3 is obtained in terms of Area Under Precision-Recall curve. Up to 48 additional verified interactions are obtained over GENIE3 for a given precision. On this dataset involving 4345 genes, our method achieves a performance similar to that of GENIE3, while being more than seven times faster.
BRANE Cut is a weighted graph thresholding method. Using biologically sound penalties and data-driven parameters, it improves three state-of-the art GRN inference methods. It is applicable as a generic network inference post-processing, due to its computational efficiency.
Availability – The BRANE Cut code is available at: http://www-syscom.univ-mlv.fr/~pirayre/Codes-GRN-BRANE-cut.html