scVAE: variational auto-encoders for single-cell gene expression data

Cited by: 138
Authors
Gronbech, Christopher Heje [1, 2, 3]
Vording, Maximillian Fornitz [3]
Timshel, Pascal N. [4]
Sonderby, Casper Kaae [1]
Pers, Tune H. [4]
Winther, Ole [1, 2, 3]
Affiliations
[1] Univ Copenhagen, Bioinformat Ctr, Dept Biol, DK-2100 Copenhagen, Denmark
[2] Copenhagen Univ Hosp, Rigshosp, Ctr Genom Med, DK-2100 Copenhagen, Denmark
[3] Tech Univ Denmark, Dept Appl Math & Comp Sci, Sect Cognit Syst, DK-2800 Lyngby, Denmark
[4] Univ Copenhagen, Novo Nordisk Fdn Ctr Basic Metab Res, Fac Hlth & Med Sci, DK-2200 Copenhagen N, Denmark
Keywords
RNA-SEQUENCING DATA; MODEL
DOI
10.1093/bioinformatics/btaa293
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Models for analysing and making relevant biological inferences from massive amounts of complex single-cell transcriptomic data typically require several individual data-processing steps, each with its own set of hyperparameter choices. With deep generative models, one can work directly with count data, perform likelihood-based model comparison, learn a latent representation of the cells and capture more of the variability in different cell populations. Results: We propose a novel method based on variational auto-encoders (VAEs) for the analysis of single-cell RNA sequencing (scRNA-seq) data. It avoids data preprocessing by using raw count data as input and can robustly estimate the expected gene expression levels and a latent representation for each cell. We tested several count likelihood functions and a variant of the VAE that has a priori clustering in the latent space. We show for several scRNA-seq datasets that our method outperforms recently proposed scRNA-seq methods in clustering cells and that the resulting clusters reflect cell types.
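To make "working directly with count data" and "likelihood-based model comparison" concrete, the following is a minimal sketch of the quantity a VAE optimises for a single cell: a count log-likelihood minus a KL regularisation term. It is not the authors' implementation. The Poisson likelihood is only one illustrative choice of count likelihood, the standard-normal prior is an assumption, and all function and variable names are hypothetical.

```python
import math


def poisson_log_likelihood(counts, rates):
    """Sum over genes of log Poisson(count | rate) for one cell.

    `counts` are raw integer counts; `rates` are the (positive) expected
    expression levels that a decoder network would output.
    """
    return sum(
        k * math.log(r) - r - math.lgamma(k + 1)
        for k, r in zip(counts, rates)
    )


def gaussian_kl_to_standard_normal(mean, log_var):
    """KL( N(mean, exp(log_var)) || N(0, 1) ), summed over latent dimensions."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv
        for m, lv in zip(mean, log_var)
    )


def elbo(counts, rates, mean, log_var):
    """Single-sample evidence lower bound for one cell (higher is better)."""
    reconstruction = poisson_log_likelihood(counts, rates)
    regularisation = gaussian_kl_to_standard_normal(mean, log_var)
    return reconstruction - regularisation


# Hypothetical toy values: three genes, a two-dimensional latent space.
toy_counts = [0, 3, 1]
toy_rates = [0.5, 2.5, 1.0]
toy_mean = [0.1, -0.2]
toy_log_var = [0.0, -0.5]
toy_elbo = elbo(toy_counts, toy_rates, toy_mean, toy_log_var)
```

Because the bound is computed on the raw counts themselves, models that use different count likelihoods can be compared on held-out cells by the value of this bound, which is the sense of likelihood-based comparison sketched here.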
Pages: 4415-4422
Number of pages: 8