EnTSSR: A Weighted Ensemble Learning Method to Impute Single-Cell RNA Sequencing Data

Cited by: 3
Authors
Lu, Fan [1, 2]
Lin, Yilong [1, 2]
Yuan, Chongbin [1, 2]
Zhang, Xiao-Fei [3, 4]
Ou-Yang, Le [1, 2]
Affiliations
[1] Shenzhen Univ, Coll Elect & Informat Engn, Shenzhen Key Lab Media Secur,Guangdong Key Lab In, Guangdong Lab Artificial Intelligence & Digital E, Shenzhen 518060, Peoples R China
[2] Shenzhen Inst Artificial Intelligence & Robot Soc, Shenzhen 518129, Peoples R China
[3] Cent China Normal Univ, Sch Math & Stat, Wuhan 430079, Peoples R China
[4] Cent China Normal Univ, Hubei Key Lab Math Sci, Wuhan 430079, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Sparse matrices; Sequential analysis; Data models; RNA; Mathematical model; Linear programming; Learning systems; Single-cell RNA sequencing; dropout events; ensemble learning; GENE-EXPRESSION; MOUSE; TRANSCRIPTOME;
DOI
10.1109/TCBB.2021.3110850
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Advances in single-cell RNA sequencing (scRNA-seq) technologies have provided unprecedented opportunities to characterize cellular states and investigate the mechanisms of complex diseases. Due to technical issues such as dropout events, scRNA-seq data contain an excess of false zero counts, which substantially affects downstream analyses. Although several computational approaches have been proposed to impute dropout events in scRNA-seq data, there is no strong consensus on which approach is best. In this study, we propose a novel weighted ensemble learning method, named EnTSSR, to impute dropout events in scRNA-seq data. By using a multi-view two-side sparse self-representation framework, our model can exploit the consensus similarities between genes and between cells based on the imputed results of various imputation methods. Moreover, we introduce a weighted ensemble strategy to leverage the information captured by the various imputation methods effectively. Down-sampling experiments, clustering analysis, differential expression analysis and cell trajectory inference are carried out to evaluate the performance of the proposed model. Experimental results demonstrate that EnTSSR can effectively recover the true expression pattern of scRNA-seq data.
Pages: 2781-2787
Page count: 7