VASC: Dimension Reduction and Visualization of Single-cell RNA-seq Data by Deep Variational Autoencoder

Cited by: 150
Authors
Wang, Dongfang
Gu, Jin [1]
Affiliations
[1] Tsinghua Univ, BNRIST Bioinformat Div, MOE Key Lab Bioinformat, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Single cell RNA sequencing; Deep variational autoencoder; Dimension reduction; Visualization; Dropout; GENE-EXPRESSION; HETEROGENEITY; NORMALIZATION; EMBRYOS; FATE
DOI
10.1016/j.gpb.2018.08.003
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Single-cell RNA sequencing (scRNA-seq) is a powerful technique to analyze transcriptomic heterogeneity at the single-cell level. An effective low-dimensional representation and visualization of the original scRNA-seq data is an important step for studying cell sub-populations and lineages. At the single-cell level, the transcriptional fluctuations are much larger than the average of a cell population, and the low amount of RNA transcripts increases the rate of technical dropout events. Therefore, scRNA-seq data are much noisier than traditional bulk RNA-seq data. In this study, we proposed the deep variational autoencoder for scRNA-seq data (VASC), a deep multi-layer generative model, for the unsupervised dimension reduction and visualization of scRNA-seq data. VASC can explicitly model the dropout events and find the nonlinear hierarchical feature representations of the original data. Tested on over 20 datasets, VASC shows superior performance in most cases and exhibits broader dataset compatibility compared to four state-of-the-art dimension reduction and visualization methods. In addition, VASC provides better representations for very rare cell populations in the 2D visualization. As a case study, VASC successfully re-establishes the cell dynamics in pre-implantation embryos and identifies several candidate marker genes associated with early embryo development. Moreover, VASC also performs well on a 10x Genomics dataset with more cells and a higher dropout rate.
Pages: 320-331
Number of pages: 12