A non-negative matrix factorization approach for the functional classification of the epigenome

In the last decade, advanced functional genomics approaches and deep sequencing have allowed large-scale mapping of histone modifications and other epigenetic marks, highlighting functional relationships between chromatin organization and genome function. Here, researchers from the Sapienza University of Rome propose a novel approach to explore functional interactions between different epigenetic modifications and extract combinatorial profiles that can be used to annotate chromatin into a finite number of functional classes. Their method is based on non-negative matrix factorization (NMF), an unsupervised learning technique originally employed to decompose high-dimensional data into a reduced number of meaningful patterns. They applied the NMF algorithm to a set of different epigenetic marks, consisting of ChIP-seq assays for multiple histone modifications, Pol II binding and chromatin accessibility assays from human H1 cells.

The researchers identified a number of chromatin profiles that contain functional information and are biologically interpretable. They also observed that epigenetic profiles are characterized by specific genomic contexts and show significant association with distinct genomic features. Moreover, analysis of RNA-seq data reveals that distinct chromatin signatures correlate with the level of gene expression.

Non-negative matrix factorization of epigenetic data

The scheme gives an intuitive representation of how NMF can be used to approximate a multivariate epigenetic signal with a pre-defined number of signal patterns. The algorithm takes as input a data matrix (V) with rows corresponding to a series of genomic intervals (or loci) and columns corresponding to the epigenetic tracks for the different marks. Each cell in the matrix holds the normalized/background-corrected signal of a given epigenetic mark (y) at a given locus (x) (a). As a result, a standard NMF procedure yields two sparse matrices, W (the weight matrix) and H (the coefficient matrix), describing the contribution of each code/profile to single loci and single marks, respectively (b).
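The factorization of V into W and H can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the classic multiplicative-update rule for the Frobenius-norm objective, uses randomly generated toy data in place of real ChIP-seq signal, and the function name `nmf`, the choice of 3 profiles, and the argmax labelling step are all hypothetical choices made for illustration.

```python
import numpy as np

def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (loci x marks) as W @ H.

    W (loci x k) holds the weight of each profile at each locus;
    H (k x marks) holds the contribution of each mark to each profile.
    Uses multiplicative updates, which keep W and H non-negative.
    """
    rng = np.random.default_rng(seed)
    n_loci, n_marks = V.shape
    W = rng.random((n_loci, k)) + eps
    H = rng.random((k, n_marks)) + eps
    for _ in range(n_iter):
        # Update H with W fixed, then W with H fixed.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy stand-in for the data matrix: 200 loci x 6 epigenetic marks,
# built from 3 hidden non-negative profiles (purely synthetic).
toy_rng = np.random.default_rng(1)
V = toy_rng.random((200, 3)) @ toy_rng.random((3, 6))

W, H = nmf(V, k=3)

# Relative reconstruction error of the rank-3 approximation.
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)

# One simple (assumed) way to annotate loci: pick the dominant profile.
labels = W.argmax(axis=1)
```

In this sketch each row of `H` can be read as a combinatorial profile over the marks, and each row of `W` says how strongly a locus uses each profile. Assigning every locus to its highest-weight profile is only one possible way to turn the factorization into discrete chromatin classes.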

Overall, this study highlights the utility of NMF in studying functional relationships between different epigenetic modifications and may provide new biological insights for the interpretation of chromatin dynamics.

Gandolfi F, Tramontano A. (2017) A computational approach for the functional classification of the epigenome. Epigenetics Chromatin 10:26. [article]
