scNAME: neighborhood contrastive clustering with ancillary mask estimation for scRNA-seq data

Cited by: 47
Authors
Wan, Hui [1]
Chen, Liang [1]
Deng, Minghua [1, 2, 3]
Affiliations
[1] Peking Univ, Sch Math Sci, Beijing 100871, Peoples R China
[2] Peking Univ, Ctr Quantitat Biol, Beijing 100871, Peoples R China
[3] Peking Univ, Ctr Stat Sci, Beijing 100871, Peoples R China
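Funding
National Natural Science Foundation of China;
Keywords
CELL-ADHESION;
DOI
10.1093/bioinformatics/btac011
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract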
Motivation: The rapid development of single-cell RNA sequencing (scRNA-seq) makes it possible to study heterogeneity at the level of individual cells. Cell clustering is a vital step in scRNA-seq analysis and provides insight into complex biological phenomena. However, the noisy, high-dimensional and large-scale nature of scRNA-seq data makes clustering challenging. Many deep learning-based methods have emerged that learn underlying feature representations while clustering. These methods, however, are inefficient at identifying rare cell types and rarely exploit gene dependencies and cell similarity jointly. As a result, they cannot recover the clear cell type structure that both clustering accuracy and downstream analysis require.
Results: Here, we propose a novel scRNA-seq clustering algorithm called scNAME, which incorporates a mask estimation task for mining gene pertinence and a neighborhood contrastive learning framework for exploiting the intrinsic structure of cells. The pattern learned through mask estimation helps reveal the uncorrupted data structure and denoise the original single-cell data. In addition, the randomly generated augmented data introduced in contrastive learning not only improves the robustness of clustering but also increases the sample size in each cluster for better data capacity. Beyond this, we introduce a neighborhood contrastive paradigm with an offline, global memory bank, which encourages discriminative feature representations and achieves both intra-cluster compactness and inter-cluster separation. The combination of the mask estimation task, neighborhood contrastive learning and the global memory bank in scNAME is conducive to rare cell type detection. Experimental results on both simulated and real data confirm that our method is accurate, robust and scalable. We also perform biological analyses, including marker gene identification, gene ontology and pathway enrichment analysis, to validate the biological significance of our method. To the best of our knowledge, we are among the first to introduce a gene relationship exploration strategy, as well as a global cellular similarity repository, in the single-cell field.
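The two ideas named in the abstract, mask estimation and neighborhood contrastive learning with a memory bank, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the corruption scheme, the loss form or any hyperparameters. The sketch assumes that entries are corrupted by swapping in the same gene's value from a random cell, and that the contrastive loss is an InfoNCE-style objective whose positives are the augmented view plus the k most similar memory-bank entries. The function names `corrupt_with_mask` and `neighborhood_contrastive_loss`, and the parameters `rate`, `k` and `temperature`, are hypothetical.

```python
import math
import random


def corrupt_with_mask(cells, rate, rng):
    """Assumed corruption scheme: replace a random subset of entries with the
    same gene's value from a randomly chosen cell.

    Returns the corrupted matrix and a 0/1 mask marking the sampled positions,
    which a mask estimation task would try to predict.
    """
    n_cells = len(cells)
    corrupted = [row[:] for row in cells]
    mask = [[0] * len(row) for row in cells]
    for i, row in enumerate(cells):
        for j in range(len(row)):
            if rng.random() < rate:
                donor = rng.randrange(n_cells)
                corrupted[i][j] = cells[donor][j]
                mask[i][j] = 1
    return corrupted, mask


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def neighborhood_contrastive_loss(anchor, augmented, bank, k=2, temperature=0.5):
    """Assumed InfoNCE-style loss for one cell embedding.

    Positives: the augmented view and the k bank entries nearest to the anchor.
    All other bank entries act as negatives.
    """
    bank_sims = [cosine(anchor, entry) for entry in bank]
    # Indices of the k bank entries most similar to the anchor.
    neighbors = sorted(range(len(bank)), key=lambda i: bank_sims[i], reverse=True)[:k]
    # Position 0 holds the augmented view; positions 1.. hold the bank entries.
    logits = [cosine(anchor, augmented) / temperature]
    logits += [s / temperature for s in bank_sims]
    log_denominator = math.log(sum(math.exp(value) for value in logits))
    positives = [0] + [i + 1 for i in neighbors]
    return -sum(logits[p] - log_denominator for p in positives) / len(positives)


if __name__ == "__main__":
    rng = random.Random(0)
    cells = [[1.0, 0.0, 3.0], [0.0, 2.0, 1.0], [4.0, 1.0, 0.0]]
    corrupted, mask = corrupt_with_mask(cells, rate=0.3, rng=rng)
    bank = [[1.0, 0.1], [0.9, 0.0], [-1.0, 0.0], [0.0, -1.0]]
    print(mask)
    print(neighborhood_contrastive_loss([1.0, 0.0], [1.0, 0.0], bank))
```

In this sketch the loss falls as the anchor moves closer to its augmented view and its nearest bank entries relative to the rest of the bank, which is one way to obtain the intra-cluster compactness and inter-cluster separation the abstract describes.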
Pages: 1575-1583
Number of pages: 9