scMAE: a masked autoencoder for single-cell RNA-seq clustering

Cited by: 7
Authors
Fang, Zhaoyu [1]
Zheng, Ruiqing [1]
Li, Min [1]
Affiliations
[1] Central South University, School of Computer Science and Engineering, 932 South Lushan Rd, Changsha 410083, People's Republic of China
Funding
National Natural Science Foundation of China;
Keywords
HETEROGENEITY; MODEL;
DOI
10.1093/bioinformatics/btae020
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Motivation: Single-cell RNA sequencing has emerged as a powerful technology for studying gene expression at the individual cell level. Clustering individual cells into distinct subpopulations is fundamental in scRNA-seq data analysis, facilitating the identification of cell types and exploration of cellular heterogeneity. Despite the recent development of many deep learning-based single-cell clustering methods, few have effectively exploited the correlations among genes, resulting in suboptimal clustering outcomes.
Results: Here, we propose a novel masked autoencoder-based method, scMAE, for cell clustering. scMAE perturbs gene expression and employs a masked autoencoder to reconstruct the original data, learning robust and informative cell representations. The masked autoencoder introduces a masking predictor, which captures relationships among genes by predicting whether gene expression values are masked. By integrating this masking mechanism, scMAE effectively captures latent structures and dependencies in the data, enhancing clustering performance. We conducted extensive comparative experiments using various clustering evaluation metrics on 15 scRNA-seq datasets from different sequencing platforms. Experimental results indicate that scMAE outperforms other state-of-the-art methods on these datasets. In addition, scMAE accurately identifies rare cell types, which are challenging to detect due to their low abundance. Furthermore, biological analyses confirm the biological significance of the identified cell subpopulations.
Availability and implementation: The source code of scMAE is available at: https://zenodo.org/records/10465991.
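The masking idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the Zenodo archive for that); it only shows the general shape of the two objectives: reconstructing the original expression matrix from a perturbed copy, and predicting which entries were perturbed. The function names, the column-wise shuffling used as the perturbation, the single-layer linear encoder and decoders, and the mask probability of 0.3 are all assumptions chosen for illustration.

```python
import numpy as np


def perturb_expression(x, mask_prob, rng):
    """Perturb a cells-by-genes matrix and return (corrupted, mask).

    Assumption: a selected entry is replaced by the same gene's value taken
    from a randomly chosen cell (column-wise shuffle). mask[i, j] == 1 marks
    entries that were selected, even if the swapped-in value happens to be equal.
    """
    selected = rng.random(x.shape) < mask_prob
    shuffled = np.empty_like(x)
    for j in range(x.shape[1]):
        shuffled[:, j] = rng.permutation(x[:, j])
    corrupted = np.where(selected, shuffled, x)
    return corrupted, selected.astype(np.float64)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward_losses(x, corrupted, mask, params):
    """One forward pass of a toy masked autoencoder (hypothetical architecture).

    Returns the cell embedding, the reconstruction loss (MSE against the
    original matrix) and the mask-prediction loss (binary cross-entropy).
    """
    z = np.tanh(corrupted @ params["W_enc"])      # cell embedding
    x_hat = z @ params["W_dec"]                   # reconstruction head
    mask_prob = sigmoid(z @ params["W_mask"])     # masking-predictor head
    mask_prob = np.clip(mask_prob, 1e-7, 1.0 - 1e-7)
    recon_loss = np.mean((x_hat - x) ** 2)
    mask_loss = -np.mean(
        mask * np.log(mask_prob) + (1.0 - mask) * np.log(1.0 - mask_prob)
    )
    return z, recon_loss, mask_loss


# Toy data: 8 cells, 5 genes, log-transformed Poisson counts (synthetic).
rng = np.random.default_rng(0)
x = np.log1p(rng.poisson(1.0, size=(8, 5)).astype(np.float64))
corrupted, mask = perturb_expression(x, mask_prob=0.3, rng=rng)

latent_dim = 3  # hypothetical embedding size
params = {
    "W_enc": rng.normal(scale=0.1, size=(5, latent_dim)),
    "W_dec": rng.normal(scale=0.1, size=(latent_dim, 5)),
    "W_mask": rng.normal(scale=0.1, size=(latent_dim, 5)),
}
z, recon_loss, mask_loss = forward_losses(x, corrupted, mask, params)
print("embedding shape:", z.shape)
print("reconstruction loss:", recon_loss, "mask loss:", mask_loss)
```

In a trained model the two losses would be combined and minimized by gradient descent, and the embedding `z` would then be passed to a clustering algorithm; how scMAE weights the losses and which clustering step it uses is not stated in this record.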
Pages: 10