Imputing single-cell RNA-seq data by considering cell heterogeneity and prior expression of dropouts

Cited by: 21
Authors
Zhang, Lihua [1 ,2 ]
Zhang, Shihua [1 ,2 ,3 ,4 ]
Affiliations
[1] Chinese Acad Sci, Acad Math & Syst Sci, RCSDS, NCMIS,CEMS, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Math Sci, Beijing 100049, Peoples R China
[3] Chinese Acad Sci, Ctr Excellence Anim Evolut & Genet, Kunming 650223, Yunnan, Peoples R China
[4] Chinese Acad Sci, Univ Chinese Acad Sci, Hangzhou Inst Adv Study, Key Lab Syst Biol, Hangzhou 310024, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
single-cell RNA-seq; dropout; imputation; low-rank; systems biology;
DOI
10.1093/jmcb/mjaa052
Chinese Library Classification number
Q2 [Cell Biology];
Discipline classification code
071009 ; 090102 ;
Abstract
Single-cell RNA sequencing (scRNA-seq) provides a powerful tool to determine expression patterns of thousands of individual cells. However, the analysis of scRNA-seq data remains a computational challenge due to the high technical noise such as the presence of dropout events that lead to a large proportion of zeros for expressed genes. Taking into account the cell heterogeneity and the relationship between dropout rate and expected expression level, we present a cell sub-population based bounded low-rank (PBLR) method to impute the dropouts of scRNA-seq data. Through application to both simulated and real scRNA-seq datasets, PBLR is shown to be effective in recovering dropout events, and it can dramatically improve the low-dimensional representation and the recovery of gene-gene relationships masked by dropout events compared to several state-of-the-art methods. Moreover, PBLR also detects accurate and robust cell sub-populations automatically, shedding light on its flexibility and generality for scRNA-seq data analysis.
Pages: 29-40
Number of pages: 12
Related papers
37 in total
[1]  
[Anonymous], 2015, J GLOBAL OPTIM, DOI 10.1007/s10898-014-0247-2
[2]  
[Anonymous], 2017, P 2017 SIAM INT C DA, DOI 10.1137/1.9781611974973.29
[3]   DeepImpute: an accurate, fast, and scalable deep neural network method to impute single-cell RNA-seq data [J].
Arisdakessian, Cedric ;
Poirion, Olivier ;
Yunits, Breck ;
Zhu, Xun ;
Garmire, Lana X. .
GENOME BIOLOGY, 2019, 20 (01)
[4]   Metagenes and molecular pattern discovery using matrix factorization [J].
Brunet, JP ;
Tamayo, P ;
Golub, TR ;
Mesirov, JP .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2004, 101 (12) :4164-4169
[5]   A SINGULAR VALUE THRESHOLDING ALGORITHM FOR MATRIX COMPLETION [J].
Cai, Jian-Feng ;
Candes, Emmanuel J. ;
Shen, Zuowei .
SIAM JOURNAL ON OPTIMIZATION, 2010, 20 (04) :1956-1982
[6]   A molecular census of arcuate hypothalamus and median eminence cell types [J].
Campbell, John N. ;
Macosko, Evan Z. ;
Fenselau, Henning ;
Pers, Tune H. ;
Lyubetskaya, Anna ;
Tenen, Danielle ;
Goldman, Melissa ;
Verstegen, Anne M. J. ;
Resch, Jon M. ;
McCarroll, Steven A. ;
Rosen, Evan D. ;
Lowell, Bradford B. ;
Tsai, Linus T. .
NATURE NEUROSCIENCE, 2017, 20 (03) :484-496
[7]   Exact Matrix Completion via Convex Optimization [J].
Candes, Emmanuel J. ;
Recht, Benjamin .
FOUNDATIONS OF COMPUTATIONAL MATHEMATICS, 2009, 9 (06) :717-772
[8]   Matrix completion via an alternating direction method [J].
Chen, Caihua ;
He, Bingsheng ;
Yuan, Xiaoming .
IMA JOURNAL OF NUMERICAL ANALYSIS, 2012, 32 (01) :227-245
[9]   Single-cell RNA-seq denoising using a deep count autoencoder [J].
Eraslan, Goekcen ;
Simon, Lukas M. ;
Mircea, Maria ;
Mueller, Nikola S. ;
Theis, Fabian J. .
NATURE COMMUNICATIONS, 2019, 10 (1)
[10]  
Gabay D., 1976, Computers & Mathematics with Applications, V2, P17, DOI 10.1016/0898-1221(76)90003-1