Accurate Single-Cell Clustering through Ensemble Similarity Learning

Cited by: 1
Authors
Jeong, Hyundoo [1]
Shin, Sungtae [2]
Yeom, Hong-Gi [3]
Affiliations
[1] Incheon Natl Univ, Dept Mechatron Engn, Incheon 22012, South Korea
[2] Dong A Univ, Dept Mech Engn, Busan 49315, South Korea
[3] Chosun Univ, Dept Elect Engn, Gwangju 61452, South Korea
Funding
National Research Foundation, Singapore;
Keywords
single-cell RNA sequencing; zero-inflated noise reduction; ensemble similarity estimation; correspondence network; visualization and clustering; imputation; RNA-SEQ; IDENTIFICATION;
DOI
10.3390/genes12111670
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Single-cell sequencing provides a novel means of interpreting the transcriptomic profiles of individual cells. Because single-cell sequencing techniques provide only the transcriptomic profile of each cell, in-depth analysis requires effective computational methods that accurately predict single-cell clusters. An accurate estimate of the cell-to-cell similarity is an essential first step toward reliable single-cell clustering results, but it is challenging to obtain: the estimate depends strongly on which genes are selected for the similarity evaluation, and the optimal gene set is typically unknown. Moreover, due to technical limitations, single-cell sequencing data include a large number of artificial zeros, and this technical noise makes it difficult to develop effective single-cell clustering algorithms. Here, we describe a novel single-cell clustering algorithm that accurately predicts single-cell clusters in large-scale single-cell sequencing data by effectively reducing the zero-inflated noise and accurately estimating the cell-to-cell similarities. First, we construct an ensemble similarity network based on different similarity estimates and reduce the artificial noise using a random walk with restart framework. Finally, starting from a large number of small but highly consistent clusters, we iteratively merge the pair of clusters with the maximum similarity until the predicted number of clusters is reached. Extensive performance evaluation shows that the proposed single-cell clustering algorithm yields accurate single-cell clustering results and can help decipher the key messages underlying complex biological mechanisms.
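The pipeline outlined in the abstract (ensemble similarity, random walk with restart, iterative cluster merging) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the use of Pearson correlation on random gene subsets as the "different similarity estimates", the restart probability, the average-linkage merge score, and the use of single cells as the initial clusters are all assumptions made for illustration. The number of clusters k is passed in directly here, whereas the abstract says it is predicted.

```python
import numpy as np

def ensemble_similarity(X, n_subsets=10, frac=0.5, seed=0):
    """Average cell-to-cell similarity over random gene subsets (assumed scheme).

    X is a cells x genes expression matrix.
    """
    rng = np.random.default_rng(seed)
    n_cells, n_genes = X.shape
    m = max(2, int(frac * n_genes))
    S = np.zeros((n_cells, n_cells))
    for _ in range(n_subsets):
        genes = rng.choice(n_genes, size=m, replace=False)
        C = np.corrcoef(X[:, genes])             # Pearson correlation between cells
        S += np.clip(np.nan_to_num(C), 0.0, None)  # keep non-negative weights only
    return S / n_subsets

def rwr_smooth(S, restart=0.5):
    """Closed-form random walk with restart on the similarity network S."""
    col_sums = S.sum(axis=0, keepdims=True)
    col_sums[col_sums == 0] = 1.0                # avoid division by zero
    P = S / col_sums                             # column-stochastic transition matrix
    n = S.shape[0]
    # Steady state of r = (1 - c) P r + c e, solved for all start nodes at once
    R = restart * np.linalg.inv(np.eye(n) - (1.0 - restart) * P)
    return (R + R.T) / 2.0                       # symmetrize for clustering

def merge_clusters(S, k):
    """Greedily merge the most similar pair of clusters until k remain."""
    clusters = [[i] for i in range(S.shape[0])]  # assumption: start from single cells
    while len(clusters) > k:
        best, pair = -np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                score = S[np.ix_(clusters[a], clusters[b])].mean()
                if score > best:
                    best, pair = score, (a, b)
        a, b = pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = np.empty(S.shape[0], dtype=int)
    for idx, members in enumerate(clusters):
        labels[members] = idx
    return labels

def cluster_cells(X, k):
    """Run the three illustrative steps in sequence."""
    return merge_clusters(rwr_smooth(ensemble_similarity(X)), k)
```

Calling cluster_cells(X, k) on a cells x genes matrix returns one integer label per cell. The pairwise search in merge_clusters is quadratic in the number of clusters at every step, so this sketch is suitable only for small inputs; the large-scale setting described in the abstract would need a more efficient merge strategy.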
Pages: 21