SMURF: embedding single-cell RNA-seq data with matrix factorization preserving self-consistency

Cited by: 2
Authors
Pu, Juhua [1, 2]
Wang, Bingchen [1, 2]
Liu, Xingwu [3]
Chen, Lingxi [4]
Li, Shuai Cheng [4]
Affiliations
[1] Beihang Univ, State Key Lab Software Dev Environm, Beijing, Peoples R China
[2] Beihang Hangzhou Innovat Inst Yuhang, Hangzhou 310023, Peoples R China
[3] Dalian Univ Technol, Sch Math Sci, Dalian, Liaoning, Peoples R China
[4] City Univ Hong Kong, Dept Comp Sci, Kowloon Tong, 83 Tat Chee Ave, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
scRNA-seq; matrix factorization; imputation; embedding; cell cycle; EXPRESSION; INFORMATION; REVEALS;
DOI
10.1093/bib/bbad026
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
The advance in single-cell RNA sequencing (scRNA-seq) sheds light on cell-specific transcriptomic studies of cell development, complex diseases and cancers. Nevertheless, scRNA-seq techniques suffer from 'dropout' events, and imputation tools have been proposed to address the resulting sparsity. Here, rather than imputing, we propose a tool, SMURF, that extracts low-dimensional embeddings of cells and genes using matrix factorization with a mixture of Poisson-Gamma divergence as the objective while preserving self-consistency. With the obtained cell embeddings, SMURF shows effective cell subpopulation discovery on replicated in silico datasets and eight wet-lab scRNA-seq datasets with ground-truth cell types. Furthermore, SMURF can reduce the cell embedding to a 1D oval space to recover the time course of the cell cycle. SMURF can also serve as an imputation tool; the in silico data assessment shows that SMURF exhibits the most robust gene expression recovery power, with low root mean square error and high Pearson correlation. Moreover, SMURF recovers the gene distribution for the WM989 Drop-seq data. SMURF is available at https://github.com/deepomicslab/SMURF.
Pages: 10
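The general idea in the abstract (factorizing a gene-by-cell count matrix into gene and cell embeddings, then scoring recovery with root mean square error and Pearson correlation) can be illustrated with a minimal sketch. It assumes a plain Poisson (generalized KL divergence) objective with standard multiplicative updates, not SMURF's mixture of Poisson-Gamma divergence or its self-consistency constraint. The names `kl_nmf`, `kl_divergence`, `rmse` and `pearson`, as well as the simulated data, are hypothetical and are not taken from the SMURF repository.

```python
import numpy as np


def kl_nmf(X, k, n_iter=200, seed=0, eps=1e-10):
    """Factorize a non-negative matrix X (genes x cells) as G @ C.

    Uses the classic multiplicative updates for the generalized KL
    divergence, which corresponds to a Poisson likelihood. This is a
    stand-in objective, NOT the one used by SMURF.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_cells = X.shape
    G = rng.random((n_genes, k)) + 0.1   # gene embeddings (genes x k)
    C = rng.random((k, n_cells)) + 0.1   # cell embeddings (k x cells)
    for _ in range(n_iter):
        R = X / (G @ C + eps)                          # ratio X / reconstruction
        G *= (R @ C.T) / (C.sum(axis=1) + eps)         # update gene factors
        R = X / (G @ C + eps)
        C *= (G.T @ R) / (G.sum(axis=0)[:, None] + eps)  # update cell factors
    return G, C


def kl_divergence(X, Y, eps=1e-10):
    """Generalized KL divergence D(X || Y) summed over all entries."""
    return float(np.sum(X * np.log((X + eps) / (Y + eps)) - X + Y))


def rmse(a, b):
    """Root mean square error between two arrays of the same shape."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


def pearson(a, b):
    """Pearson correlation between the flattened entries of two arrays."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])


# Toy data (an assumption for illustration): low-rank Poisson rates
# with random zeros standing in for dropout events.
rng = np.random.default_rng(42)
true_rates = rng.gamma(2.0, 1.0, size=(60, 3)) @ rng.gamma(2.0, 1.0, size=(3, 40))
counts = rng.poisson(true_rates).astype(float)
observed = counts * (rng.random(counts.shape) > 0.2)   # about 20% dropout

G, C = kl_nmf(observed, k=3)
recovered = G @ C
print("cell embedding shape:", C.T.shape)
print("RMSE vs. true rates:", rmse(recovered, true_rates))
print("Pearson vs. true rates:", pearson(recovered, true_rates))
```

In this sketch the columns of `C` play the role of per-cell embeddings that could be passed to a clustering method, and `G @ C` plays the role of an imputed expression matrix.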