Prediction of single-cell gene expression for transcription factor analysis

Single-cell RNA sequencing is a powerful technology for discovering new cell types and studying biological processes in complex samples. A current challenge is to predict transcription factor (TF) regulation from single-cell RNA data.

Goethe University researchers propose a novel approach for predicting gene expression at the single-cell level using cis-regulatory motifs and epigenetic features. The researchers designed a tree-guided multi-task learning framework that treats each cell as a task. With this framework they were able to explain single-cell gene expression values using either TF binding affinities or TF ChIP-seq data measured at specific genomic regions. The TFs identified by these models could be validated against the literature.
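To make the idea of a tree-guided multi-task model more concrete, one common formulation from the machine learning literature is the tree-guided group lasso, sketched below. Whether the authors use exactly this objective is an assumption, and the symbols other than X, Y, and B are introduced here for illustration only. X is the gene-by-feature matrix, Y the gene-by-cell expression matrix, and B the feature-by-cell coefficient matrix. V is the set of nodes in a tree built over the cells, G_v is the set of cells under node v, w_v is a node weight, p is the number of features, and lambda controls the strength of the penalty.

```latex
\min_{B}\; \tfrac{1}{2}\,\lVert Y - XB \rVert_F^2
\;+\; \lambda \sum_{j=1}^{p} \sum_{v \in V} w_v \,\lVert B_{j,\,G_v} \rVert_2
```

Under this kind of penalty, cells that sit close together in the tree are encouraged to select the same features, while a feature can still receive different coefficients in different cells.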

Schematic illustration of the learning set-ups, single- (a) and multi- (b) task learning

Common input files consisting of TF data (static, dynamic, or ChIP-seq) and single-cell gene expression are provided for both learning schemes. The rows of the feature matrix, X, are the genes, each described by one of these feature set-ups. The response matrix, Y, consists of the gene expression values measured in single cells. Finally, the coefficient matrix, B, establishes a linear association between X and Y; its rows correspond to the features and its columns to the cells.
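The shapes of X, Y, and B can be illustrated with a minimal sketch of the single-task set-up, in which each cell gets its own independent linear model. This is not the Triangulate implementation: it uses plain ridge regression on simulated data, and the function name, the matrix sizes, and the `alpha` parameter are hypothetical choices made for illustration. The tree-guided multi-task variant would additionally couple the columns of B and is not shown.

```python
import numpy as np


def fit_single_task_ridge(X, Y, alpha=1.0):
    """Fit one ridge regression per cell (hypothetical helper, not from Triangulate).

    X: (n_genes, n_features) feature matrix, e.g. TF affinities per gene.
    Y: (n_genes, n_cells) expression matrix, one column per cell.
    Returns B: (n_features, n_cells), one coefficient column per cell.

    The ridge objective separates over the columns of Y, so solving for
    all columns at once gives the same result as fitting each cell alone.
    """
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


# Simulated data (assumption: sizes and noise level are arbitrary).
rng = np.random.default_rng(0)
n_genes, n_features, n_cells = 200, 5, 3
X = rng.normal(size=(n_genes, n_features))
B_true = rng.normal(size=(n_features, n_cells))
Y = X @ B_true + 0.01 * rng.normal(size=(n_genes, n_cells))

B_hat = fit_single_task_ridge(X, Y, alpha=1e-3)
print(B_hat.shape)  # rows are features, columns are cells
```

Reading down a column of `B_hat` gives the feature weights for one cell, and reading across a row shows how the weight of one feature varies between cells.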

The proposed method makes it possible to identify distinct TFs that show cell type-specific regulation. The approach is not limited to TFs: it can use any type of data that might help explain gene expression at the single-cell level, for example to study factors that drive differentiation or show abnormal regulation in disease.

Availability: https://github.com/SchulzLab/Triangulate

Behjati Ardakani F, Kattler K, Heinen T, Schmidt F, Feuerborn D, Gasparoni G, Lepikhov K, Nell P, Hengstler J, Walter J, Schulz MH. (2020) Prediction of single-cell gene expression for transcription factor analysis. GigaScience 9(11):giaa113.
