scMAE: a masked autoencoder for single-cell RNA-seq clustering

Cited by: 7
Authors
Fang, Zhaoyu [1]
Zheng, Ruiqing [1]
Li, Min [1]
Affiliations
[1] Cent South Univ, Sch Comp Sci & Engn, 932 South Lushan Rd, Changsha 410083, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
HETEROGENEITY; MODEL
DOI
10.1093/bioinformatics/btae020
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Single-cell RNA sequencing has emerged as a powerful technology for studying gene expression at the individual cell level. Clustering individual cells into distinct subpopulations is fundamental in scRNA-seq data analysis, facilitating the identification of cell types and exploration of cellular heterogeneity. Despite the recent development of many deep learning-based single-cell clustering methods, few have effectively exploited the correlations among genes, resulting in suboptimal clustering outcomes.
Results: Here, we propose a novel masked autoencoder-based method, scMAE, for cell clustering. scMAE perturbs gene expression and employs a masked autoencoder to reconstruct the original data, learning robust and informative cell representations. The masked autoencoder introduces a masking predictor, which captures relationships among genes by predicting whether gene expression values are masked. By integrating this masking mechanism, scMAE effectively captures latent structures and dependencies in the data, enhancing clustering performance. We conducted extensive comparative experiments using various clustering evaluation metrics on 15 scRNA-seq datasets from different sequencing platforms. Experimental results indicate that scMAE outperforms other state-of-the-art methods on these datasets. In addition, scMAE accurately identifies rare cell types, which are challenging to detect due to their low abundance. Furthermore, biological analyses confirm the biological significance of the identified cell subpopulations.
Availability and implementation: The source code of scMAE is available at https://zenodo.org/records/10465991.
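The two training signals described in the abstract (reconstructing the original expression from a perturbed copy, and predicting which entries were masked) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the 0.4 masking probability, the within-gene shuffling used as the perturbation, and the equal weighting of the two loss terms are all assumptions made for illustration.

```python
import numpy as np


def corrupt_expression(x, mask_prob=0.4, seed=0):
    """Perturb a cells-by-genes matrix and return (corrupted, mask).

    mask[i, j] == 1 marks an entry that was selected for perturbation.
    Assumption: a selected entry is replaced by the value of the same
    gene taken from a randomly chosen cell (a per-gene shuffle).
    """
    rng = np.random.default_rng(seed)
    mask = (rng.random(x.shape) < mask_prob).astype(np.float32)
    shuffled = np.empty_like(x)
    for j in range(x.shape[1]):
        # Permute each gene (column) independently across cells.
        shuffled[:, j] = rng.permutation(x[:, j])
    corrupted = np.where(mask == 1, shuffled, x)
    return corrupted, mask


def masked_ae_loss(x, x_hat, mask, mask_logits, weight=1.0):
    """Combine reconstruction error with mask-prediction error.

    x_hat stands in for the decoder output and mask_logits for the
    output of the masking predictor; both are hypothetical inputs here.
    """
    # Mean squared error between the reconstruction and the clean data.
    recon = np.mean((x_hat - x) ** 2)
    # Binary cross-entropy for "was this entry masked?".
    p = 1.0 / (1.0 + np.exp(-mask_logits))
    p = np.clip(p, 1e-7, 1.0 - 1e-7)
    bce = -np.mean(mask * np.log(p) + (1.0 - mask) * np.log(1.0 - p))
    return recon + weight * bce, recon, bce
```

In an actual model, `x_hat` and `mask_logits` would come from a decoder and a predictor head applied to the encoder's embedding of the corrupted matrix, and the learned embedding would then be passed to a clustering algorithm.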
Pages: 10