Deconvolution of autoencoders to learn biological regulatory modules from single cell mRNA sequencing data

Cited by: 18
Authors
Kinalis, Savvas [1]
Nielsen, Finn Cilius [1]
Winther, Ole [1, 2, 3]
Bagger, Frederik Otzen [1, 4, 5, 6]
Affiliations
[1] Univ Copenhagen, Rigshosp, Ctr Genom Med, Copenhagen, Denmark
[2] Tech Univ Denmark, Dept Appl Math & Comp Sci, Sect Cognit Syst, Lyngby, Denmark
[3] Univ Copenhagen, Dept Biol, Bioinformat Ctr, Copenhagen, Denmark
[4] Univ Basel, Univ Childrens Hosp Basel, Basel, Switzerland
[5] Univ Basel, Dept Biomed, Basel, Switzerland
[6] Swiss Inst Bioinformat, Basel, Switzerland
Keywords
Interpretable machine learning; Deep learning; Neural networks; Manifold learning; Expression profiles; Single-cell RNA-sequencing; Gene set enrichment analysis; Functional analysis; Biological pathway analysis; LINEAGE COMMITMENT; HEMATOPOIETIC STEM; CANCER; MOUSE
DOI
10.1186/s12859-019-2952-9
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Unsupervised machine learning methods (deep learning) have shown their usefulness with noisy single cell mRNA-sequencing data (scRNA-seq), where the models generalize well despite the zero-inflation of the data. A class of neural networks, namely autoencoders, has been useful for denoising of single cell data, imputation of missing values and dimensionality reduction.
Results: Here, we present a striking feature with the potential to greatly increase the usability of autoencoders: with specialized training, the autoencoder is not only able to generalize over the data, but also to tease apart biologically meaningful modules, which we found encoded in the representation layer of the network. Our model can, from scRNA-seq data, delineate the biologically meaningful modules that govern a dataset, as well as give information as to which modules are active in each single cell. Importantly, most of these modules can be explained by known biological functions, as provided by the Hallmark gene sets.
Conclusions: We discover that tailored training of an autoencoder makes it possible to deconvolute biological modules inherent in the data, without any assumptions. By comparison with gene signatures of canonical pathways, we see that the modules are directly interpretable. The scope of this discovery has important implications, as it makes it possible to outline the drivers behind a given effect of a cell. In comparison with other dimensionality reduction methods, or supervised models for classification, our approach has the benefit of both handling the zero-inflated nature of scRNA-seq well and validating that the model captures relevant information, by establishing a link between input and decoded data. In perspective, our model in combination with clustering methods is able to provide information about which subtype a given single cell belongs to, as well as which biological functions determine that membership.
Pages: 9