Unsupervised generative and graph representation learning for modelling cell differentiation

Cited by: 9
Authors
Bica, Ioana [1, 2, 4]
Andres-Terre, Helena [2]
Cvejic, Ana [3, 5, 6]
Lio, Pietro [2]
Affiliations
[1] Univ Oxford, Dept Engn Sci, Oxford OX1 3PJ, England
[2] Univ Cambridge, Dept Comp Sci & Technol, Cambridge CB3 0FD, England
[3] Wellcome Trust Sanger Inst, Wellcome Trust Genome Campus, Cambridge CB10 1SA, England
[4] Alan Turing Inst, London NW1 2DB, England
[5] Univ Cambridge, Dept Haematol, Cambridge CB2 0XY, England
[6] Wellcome Trust Med Res Council, Stem Cell Inst, Cambridge CB2 0AW, England
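Funding
European Research Council; UK Medical Research Council; UK Engineering and Physical Sciences Research Council
Keywords
ATLAS
DOI
10.1038/s41598-020-66166-8
Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification codes
07; 0710; 09
Abstract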
Using machine learning techniques to build representations from biomedical data can help us understand latent biological mechanisms of action and lead to important discoveries. Recent developments in single-cell RNA-sequencing protocols have made it possible to measure gene expression for individual cells in a population, opening up the possibility of answering biomedical questions about cell differentiation. In this paper, we explore unsupervised generative neural methods, based on the variational autoencoder, that can model cell differentiation by building meaningful representations from high-dimensional and complex gene expression data. We use disentanglement methods based on information theory to improve the data representation and achieve better separation of the biological factors of variation in the gene expression data. In addition, we use a graph autoencoder consisting of graph convolutional layers to predict relationships between single cells. Based on these models, we develop a computational framework consisting of methods for identifying the cell types in a dataset, finding driver genes for the differentiation process, and obtaining a better understanding of relationships between cells. We illustrate our methods on datasets from multiple species and from different sequencing technologies.
Pages: 13