Our bodies consist of numerous cells that function continuously to keep us alive. Even though the cells in different organs contain nearly identical DNA, they perform different functions. Despite these differences, every cell’s journey starts from a single cell. Over time, cells divide, and ancestor cells give rise to descendant cells, forming a lineage tree that captures the genealogical relationships between cells. During this process, descendant cells can also acquire new specialized functions that differ from those of their ancestors. A group of cells that perform similar functions is said to compose a cell type. The process by which a cell transforms from one cell type to another is known as cellular differentiation. Understanding cellular differentiation and the cell lineage trees that delineate the ancestral relationships between cells is crucial for understanding normal development in an organism as well as what goes wrong in pathologies such as cancer.
To address this important research problem in developmental biology, Dr. Hamim Zafar from IIT Kanpur, in collaboration with researchers from Carnegie Mellon University, has developed a statistical learning method called ‘LinTIMaT’ that can reconstruct cell lineages for an individual organism or at the species level. The method and its application to whole-organism lineage reconstruction have been reported in an article published in the journal Nature Communications. The research team consists of Dr. Hamim Zafar, who is a joint faculty member in the CSE and BSBE departments at IIT Kanpur, Chieh Lin (co-first author of the study), and Prof. Ziv Bar-Joseph from Carnegie Mellon University.
Inferring cell lineages in a multicellular organism is a challenging task. The best way to describe how a mother cell divides into daughter cells is through live tracking, which is not possible for most organisms. To circumvent this problem, scientists use imaging-based or genetic markers to tag the cells. The challenge lies in scaling the method to a whole organism, as thousands of cells need to be tagged to recover even a part of the cell lineage. Even more challenging is simultaneously profiling both lineage and function information from the same cells, which is needed to understand the connection between cell lineage and differentiation.
Thanks to the development of high-throughput sequencing technologies, experimental biologists can now combine a genetic engineering tool named CRISPR-Cas9 with another technology named single-cell RNA sequencing to generate high-throughput data suitable for reconstructing cell lineages. For the same cell, the CRISPR-Cas9 system yields a set of mutations that encodes the lineage information, whereas single-cell RNA sequencing provides gene expression values that indicate cellular functions. While this combination has provided valuable insights into development, “the studies also have several limitations,” mentioned Zafar. “For reconstructing the lineage, these studies resorted to using a classical, off-the-shelf, method for phylogenetic tree building that uses only mutation information, and fails to recover branches and cannot resolve branchings in later stages of development.” Also, because the mutations are random, mutation data alone cannot be used to reconstruct a species-level cell lineage.
Overview of LinTIMaT
(a) LinTIMaT reconstructs a cell lineage tree by integrating CRISPR-Cas9 mutations and transcriptomic data. In Step 1, LinTIMaT infers the top-scoring lineage trees built on barcodes using only the mutation likelihood. In Step 2, for all cells carrying the same barcode, LinTIMaT reconstructs a cellular subtree based on the expression likelihood. In Step 3, cellular subtrees are attached to barcode lineages to obtain cell lineage trees, and the tree with the best combined likelihood is selected. Finally, in Step 4, LinTIMaT uses a hill-climbing search to refine the cell lineage tree by optimizing the combined likelihood. (b) To reconstruct a species-invariant lineage, LinTIMaT first identifies cell clusters that are preserved in all individual lineages and then performs an iterative search that attempts to minimize the distance between the individual lineage trees and the invariant tree topology. As part of the iterative process, LinTIMaT matches preserved clusters in one individual tree to preserved clusters in the other individual tree(s) such that leaves in the resulting invariant tree contain cells from all individual studies. See the Methods section of the article for complete details.
LinTIMaT circumvents these drawbacks by using gene expression data to augment lineage reconstruction from mutation data. With the help of single-cell gene expression data, LinTIMaT refines the cell lineage branches that cannot be learned from the mutation data alone. In addition, LinTIMaT’s probabilistic approach helps to account for uncertainties in the mutation data. The use of gene expression data also enables LinTIMaT to combine multiple individual lineages to reconstruct a species-invariant lineage tree.
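To make the idea of a hill-climbing search over a combined score more concrete, here is a minimal toy sketch in Python. It is not LinTIMaT's implementation: the cell data, the function names, the `expression_weight` parameter, and the distance-based score (used here as a simple stand-in for the mutation and expression likelihoods) are all assumptions made for illustration. Three hypothetical cells share one barcode, so mutations alone cannot order them, and the expression term breaks the tie.

```python
from itertools import combinations

# Hypothetical toy data: c1, c2, c3 share a barcode; c4 carries a different one.
CELLS = {
    "c1": {"barcode": (1, 0), "expr": (0.0,)},
    "c2": {"barcode": (1, 0), "expr": (0.2,)},
    "c3": {"barcode": (1, 0), "expr": (1.0,)},
    "c4": {"barcode": (0, 1), "expr": (2.0,)},
}


def leaves(tree):
    """List the leaf names of a nested-tuple binary tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def swap_leaves(tree, a, b):
    """Return a copy of the tree with leaf labels a and b exchanged."""
    if isinstance(tree, str):
        if tree == a:
            return b
        if tree == b:
            return a
        return tree
    return (swap_leaves(tree[0], a, b), swap_leaves(tree[1], a, b))


def clade_mean(names, key):
    """Average the chosen feature vector over the cells in a clade."""
    vectors = [CELLS[name][key] for name in names]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def split_cost(tree, key):
    """Sum, over internal nodes, of the squared distance between child clade means."""
    if isinstance(tree, str):
        return 0.0
    left, right = tree
    left_mean = clade_mean(leaves(left), key)
    right_mean = clade_mean(leaves(right), key)
    here = sum((x - y) ** 2 for x, y in zip(left_mean, right_mean))
    return here + split_cost(left, key) + split_cost(right, key)


def combined_score(tree, expression_weight=1.0):
    """Assumed stand-in for a combined likelihood: higher is better."""
    mutation_cost = split_cost(tree, "barcode")
    expression_cost = split_cost(tree, "expr")
    return -(mutation_cost + expression_weight * expression_cost)


def hill_climb(tree):
    """Steepest-ascent search: apply the best leaf swap until none improves the score."""
    best_tree, best_score = tree, combined_score(tree)
    while True:
        candidates = [
            swap_leaves(best_tree, a, b)
            for a, b in combinations(leaves(best_tree), 2)
        ]
        top = max(candidates, key=combined_score)
        if combined_score(top) <= best_score:
            return best_tree, best_score
        best_tree, best_score = top, combined_score(top)


start = ((("c1", "c4"), "c3"), "c2")
final_tree, final_score = hill_climb(start)
print(final_tree, round(final_score, 2))
```

On this toy input, the search moves c4 to the outermost branch (driven by the barcode term) and pairs c1 with c2 (driven by the expression term). A real method would search over tree shapes as well as leaf labels and would use proper likelihoods rather than squared distances.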
“Our method will have a lot of applications in understanding normal as well as pathological development in different diseases. LinTIMaT will be a powerful method for the biologists who are studying development in model organisms or cancer tissues,” Dr. Zafar remarked. Ongoing evolution and differentiation are common in cancer, and they complicate therapy. After profiling cells from cancer tissues, clinicians could use LinTIMaT to resolve the lineage of the malignant cell types, which may help in designing personalized treatment strategies that target a specific set of cancer cell types.