**Graph-based clustering using a decoupling strategy for hard-clustering**

*The principle is similar for soft clustering. Gray nodes represent TF nodes. The T-label problem is decomposed into T binary sub-problems by setting component t of the marker labels $s^{(t)}$, $t \in T$, to one and the others to zero. Each sub-problem t yields a probability for each node. The final clustering assigns each node the label whose probability across the T sub-problems is maximal.*
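The decoupling idea in the caption can be illustrated with a minimal sketch. It is not the BRANE Clust implementation: the function name `decoupled_clustering`, the dictionary-based graph format, and the neighbor-averaging solver (a simple relaxation that keeps seed nodes clamped) are all assumptions made for illustration. The sketch also assumes positive edge weights and that every unseeded node is connected to at least one seed.

```python
def decoupled_clustering(weights, seeds, n_labels, n_iter=500):
    """One-vs-rest clustering on a weighted undirected graph.

    weights: dict {(i, j): w} with positive weights (hypothetical format).
    seeds:   dict {node: label} for the marker nodes.
    Returns (probs, hard): probs[t][v] is the score of node v in binary
    sub-problem t, and hard[v] is the label with maximal score.
    """
    # Build a symmetric adjacency structure.
    nbrs = {}
    for (i, j), w in weights.items():
        nbrs.setdefault(i, {})[j] = w
        nbrs.setdefault(j, {})[i] = w
    nodes = sorted(nbrs)

    probs = []
    for t in range(n_labels):
        # Marker vector for sub-problem t: 1 on seeds of label t, 0 elsewhere.
        p = {v: (1.0 if seeds.get(v) == t else 0.0) for v in nodes}
        for _ in range(n_iter):
            for v in nodes:
                if v in seeds:
                    continue  # seed values stay clamped
                total = sum(nbrs[v].values())
                # Assumed solver: weighted average of the neighbors' scores.
                p[v] = sum(w * p[u] for u, w in nbrs[v].items()) / total
        probs.append(p)

    # Hard clustering: keep the label whose sub-problem score is maximal.
    hard = {v: max(range(n_labels), key=lambda t: probs[t][v]) for v in nodes}
    return probs, hard


# Toy path graph 0 - 1 - 2 - 3 with one seed per label at the two ends.
example_weights = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0}
example_seeds = {0: 0, 3: 1}
probs, hard = decoupled_clustering(example_weights, example_seeds, n_labels=2)
```

Keeping the per-label scores in `probs` rather than only the arg-max is what would allow a soft-clustering variant to reuse the same sub-problems.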

**Availability** – BRANE Clust software is available at: http://www-syscom.univ-mlv.fr/~pirayre/Codes-GRN-BRANE-clust.html

Pirayre A, Couprie C, Duval L, Pesquet JC. (2017) **BRANE Clust: Cluster-Assisted Gene Regulatory Network Inference Refinement.** *IEEE/ACM Trans Comput Biol Bioinform* [Epub ahead of print]. [abstract]

As the underlying structure of many networks is not (completely) known, one focus of systems biology is uncovering the complex and dynamic interactions between genes. The research area called ‘network inference (NI)’ aims to deduce network structures from high-throughput data with the help of reverse-engineering techniques. In most cases, transcriptome data are used. NI consists of three parts:

- the identification of potential regulators,
- the prediction of target genes and
- the inference of the mode of interaction (e.g. activation or repression).

The advance of next-generation sequencing of cDNAs derived from RNA samples (RNA-Seq) makes it possible to study transcriptomes with previously unattainable depth and quality. On the other hand, data pre-processing poses new challenges. Here, the authors describe a workflow combining RNA-Seq data analysis with NI. In particular, RNA-Seq allows researchers to perform transcriptome studies of interacting (micro-)organisms using the same technology without having to separate RNA samples (‘dual RNA-Seq’). This makes it possible to predict GRNs of organisms that interact with each other.

**Workflow of GRN inference**

*Systems biology cycle of wet lab (experiment) and dry lab work: experiments lead to RNA-Seq data, which need to be preprocessed and from which features have to be selected (more detailed steps are shown in grey boxes). A GRN is inferred for the selected features. Predicted interactions are validated, leading to more knowledge and new hypotheses. Both the analysis of experimental data (data preprocessing and feature selection) and modeling (network inference) are supported by prior knowledge.*

Linde J, Schulze S, Henkel SG, Guthke R. (2016) **Data- and knowledge-based modeling of gene regulatory networks: an update**. *EXCLI J* 14:346-78. [article]

Weighting all possible pairwise gene relationships by a probability of edge presence, researchers from IFP Energies Nouvelles, France, formulate regulatory network inference as a discrete variational problem on graphs. They enforce biologically plausible coupling between groups and types of genes by minimizing an edge labeling functional that encodes a priori structures. The optimization is carried out with graph cuts, an approach popular in image processing and computer vision. The researchers compare the inferred regulatory networks to results achieved by the mutual-information-based Context Likelihood of Relatedness (CLR) method and by the state-of-the-art GENIE3, winner of the DREAM4 multifactorial challenge.
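For readers unfamiliar with CLR, the following sketch shows how its pairwise scores are commonly computed from a precomputed mutual-information matrix: each MI value is converted to a z-score against the background of each of its two genes, negative z-scores are clipped to zero, and the two are combined. The function name `clr_scores`, the exclusion of the diagonal from the background, and the use of the population standard deviation are assumptions; published implementations differ in these details.

```python
import math
from statistics import mean, pstdev


def clr_scores(mi):
    """CLR-style scores from a symmetric mutual-information matrix.

    mi: list of lists, mi[i][j] = mutual information between genes i and j.
    Assumption: the background of gene i is its MI with all other genes.
    """
    n = len(mi)
    mu, sigma = [], []
    for i in range(n):
        background = [mi[i][j] for j in range(n) if j != i]
        mu.append(mean(background))
        sigma.append(pstdev(background))

    def z(i, j):
        # Clipped z-score of MI(i, j) against the background of gene i.
        if sigma[i] == 0:
            return 0.0
        return max(0.0, (mi[i][j] - mu[i]) / sigma[i])

    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Combine both genes' z-scores: sqrt(z_i^2 + z_j^2).
            scores[i][j] = scores[j][i] = math.hypot(z(i, j), z(j, i))
    return scores


# Toy 3-gene example with made-up MI values.
example_mi = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.2],
    [0.1, 0.2, 0.0],
]
example_scores = clr_scores(example_mi)
```

A score matrix of this kind is the sort of weighted input that an edge-selection step such as BRANE Cut would then post-process.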

*Schematic view of the proposed BRANE Cut method. The initial graph (a) is transformed into an intermediate graph (b) in which a max-flow computation is performed to return an optimal edge labeling $x^{\ast}$ leading to the inferred graph (c). We choose to present the method in its full generality with unscaled weights (i.e. $w_{i,j} \in [0, +\infty[$, and the $\lambda$ parameters also belong to $[0, +\infty[$). Nodes $v_2$ and $v_3$ are TFs, $\lambda_{\overline{TF}} = 1$ and $\lambda_{TF} = 3$. Taking $\gamma = 4$ implies that $v_1$, $v_2$, and $v_3$ satisfy the regulator coupling property. Vertices $v_1$ and $v_4$ are thus affected, leading to the presence of additional edges weighted by $\rho_{1,2,3} = 0$ and $\rho_{4,2,3} = 3$, when $\mu$ is set to 3. Computing a max-flow in graph (b) leads to some edge saturation, represented in dashed lines. The values from the source (value 1) and the sink (value 0) are propagated through non-saturated paths, thus leading to $x_{2,4} = x_{3,4} = 0$.*

The BRANE Cut approach infers the five DREAM4 in silico networks more accurately (with improvements from 6% to 11%). On a real *Escherichia coli* compendium, it improves the Area Under the Precision-Recall curve by 11.8% compared to CLR and by 3% compared to GENIE3. Up to 48 additional verified interactions are obtained over GENIE3 for a given precision. On this dataset involving 4345 genes, the method achieves a performance similar to that of GENIE3, while being more than seven times faster.
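As a reminder of what the Area Under the Precision-Recall curve measures, here is a minimal sketch that ranks predicted edges by score and accumulates precision at each recall step. It computes the step-wise (average-precision) approximation of the area; the papers cited here may use a different interpolation, and the function name `aupr` and the input format are assumptions.

```python
def aupr(scored_edges, gold):
    """Step-wise area under the precision-recall curve.

    scored_edges: list of (edge, score) pairs, higher score = more confident.
    gold:         set of edges considered true in the reference network.
    """
    ranked = sorted(scored_edges, key=lambda pair: pair[1], reverse=True)
    true_positives = 0
    area = 0.0
    previous_recall = 0.0
    for rank, (edge, _score) in enumerate(ranked, start=1):
        if edge in gold:
            true_positives += 1
            recall = true_positives / len(gold)
            precision = true_positives / rank
            # Add the rectangle gained by this increase in recall.
            area += (recall - previous_recall) * precision
            previous_recall = recall
    return area


# Toy example: two of the four predicted edges are in the gold standard.
example_predictions = [("a", 0.9), ("b", 0.8), ("c", 0.7), ("d", 0.1)]
example_gold = {"a", "c"}
example_aupr = aupr(example_predictions, example_gold)
```

In the toy example the true edges are ranked first and third, giving precisions of 1 and 2/3 at recalls of 0.5 and 1.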

BRANE Cut is a weighted graph thresholding method. Using biologically sound penalties and data-driven parameters, it improves three state-of-the-art GRN inference methods. Owing to its computational efficiency, it is applicable as a generic post-processing step for network inference.
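For contrast, plain weighted-graph thresholding, the baseline that BRANE Cut refines, can be sketched in a few lines. This is not the BRANE Cut functional: the rule below (keep an edge only if it involves at least one TF and its weight exceeds a single threshold) is an illustrative assumption, as are the names `threshold_edges` and `lam`. BRANE Cut additionally applies coupling penalties and solves the resulting labeling problem with a max-flow computation.

```python
def threshold_edges(weights, tfs, lam):
    """Naive edge selection by thresholding (illustrative baseline only).

    weights: dict {(i, j): w} of candidate edge weights.
    tfs:     set of nodes known to be transcription factors.
    lam:     hypothetical global threshold on edge weights.
    """
    return {
        (i, j): w
        for (i, j), w in weights.items()
        # Assumed rule: a regulatory edge needs at least one TF endpoint.
        if (i in tfs or j in tfs) and w > lam
    }


# Toy example with made-up weights.
example_edges = {("tf1", "g1"): 0.9, ("tf1", "g2"): 0.2, ("g1", "g2"): 0.95}
kept = threshold_edges(example_edges, tfs={"tf1"}, lam=0.5)
```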

**Availability** – The BRANE Cut code is available at: http://www-syscom.univ-mlv.fr/~pirayre/Codes-GRN-BRANE-cut.html

Pirayre A, Couprie C, Bidard F, Duval L, Pesquet JC. (2015) **BRANE Cut: biologically-related a priori network enhancement with graph cuts for gene regulatory network inference**. *BMC Bioinformatics* 16:369. [article]