LRSK: a low-rank self-representation K-means method for clustering single-cell RNA-sequencing data

Cited by: 7
Authors
Sun, Ye-Sen [1]
Ou-Yang, Le [2]
Dai, Dao-Qing [1]
Affiliations
[1] Sun Yat Sen Univ, Intelligent Data Ctr, Sch Math, Guangzhou, Peoples R China
[2] Shenzhen Univ, Shenzhen Key Lab Media Secur, Coll Elect & Informat Engn, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
GENE-EXPRESSION; HETEROGENEITY; FATE;
DOI
10.1039/d0mo00034e
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification codes
071010 ; 081704 ;
Abstract
The development of single-cell RNA-sequencing (scRNA-seq) technologies brings tremendous opportunities for quantitative research and analyses at the cellular level. In particular, as a crucial task of scRNA-seq analysis, single cell clustering shines a light on natural groupings of cells to give new insights into the biological mechanisms and disease studies. However, it remains a challenge to identify cell clusters from lots of cell mixtures effectively and accurately. In this paper, we propose a novel adaptive joint clustering framework, named the low-rank self-representation K-means method (LRSK), to learn the data representation matrix and cluster indicator matrix jointly from scRNA-seq data. Specifically, instead of calculating the similarities among cells from the original data, we seek a low-rank representation of the original data to better reflect the underlying relationships among cells. Moreover, an Augmented Lagrangian Multiplier (ALM) based optimization algorithm is adopted to solve this problem. Experimental results on various scRNA-seq datasets and case studies demonstrate that our method performs better than other state-of-the-art single cell clustering algorithms. The analysis of unlabeled large single-cell liver cancer sequencing data further shows that our prediction results are more reasonable and interpretable.
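The idea in the abstract, representing each cell as a combination of other cells through a low-rank matrix and then clustering with K-means, can be illustrated with a small sketch. This is not the authors' LRSK algorithm: LRSK learns the representation and the cluster indicator jointly with an ALM-based solver, whereas the sketch below is a simplified two-stage stand-in. It assumes a relaxed objective, ||Z||_* + (lam / 2) ||X - XZ||_F^2, which has a known closed-form solution via the SVD, followed by a spectral embedding and a plain K-means. All function names, the parameter `lam`, and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np


def low_rank_self_representation(X, lam=1.0):
    """Closed-form minimizer of ||Z||_* + (lam / 2) * ||X - X Z||_F^2.

    X has shape (genes, cells); the returned Z has shape (cells, cells).
    This relaxed objective is an assumption, not the LRSK objective.
    """
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    keep = s > 1.0 / np.sqrt(lam)               # directions that survive shrinkage
    weights = 1.0 - 1.0 / (lam * s[keep] ** 2)  # shrinkage factor per direction
    V = Vt[keep].T                              # (cells, retained rank)
    return (V * weights) @ V.T


def spectral_embedding(Z, k):
    """Embed cells using the top-k eigenvectors of the normalized affinity."""
    W = 0.5 * (np.abs(Z) + np.abs(Z).T)         # symmetric, non-negative affinity
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    M = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(M)                 # eigenvalues in ascending order
    E = vecs[:, -k:]
    norms = np.linalg.norm(E, axis=1, keepdims=True)
    return E / np.maximum(norms, 1e-12)         # unit-length rows


def kmeans(points, k, n_iter=100, n_init=10, seed=0):
    """Plain Lloyd's K-means with random restarts; returns the best labels."""
    rng = np.random.default_rng(seed)
    best_labels, best_cost = None, np.inf
    for _ in range(n_init):
        centers = points[rng.choice(len(points), size=k, replace=False)]
        for _ in range(n_iter):
            dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = dist.argmin(axis=1)
            new_centers = np.array([
                points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        cost = dist.min(axis=1).sum()
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels


# Simulated data (not real scRNA-seq): two groups of 20 cells, each group
# lying near its own 3-dimensional subspace of a 50-gene expression space.
rng = np.random.default_rng(0)
n_genes, n_per_group = 50, 20
groups = [
    rng.normal(size=(n_genes, 3)) @ rng.normal(size=(3, n_per_group))
    for _ in range(2)
]
X = np.hstack(groups) + 0.01 * rng.normal(size=(n_genes, 2 * n_per_group))

Z = low_rank_self_representation(X, lam=1.0)
labels = kmeans(spectral_embedding(Z, k=2), k=2)
print(labels)
```

On data like this, Z is close to block-diagonal, so cells from the same group end up close together in the embedding. A joint formulation such as the one the abstract describes would instead update the representation and the cluster assignments together rather than in two separate stages.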
Pages: 465 - 473
Number of pages: 9
Related papers
50 in total
  • [1] Clustering and classification methods for single-cell RNA-sequencing data
    Qi, Ren
    Ma, Anjun
    Ma, Qin
    Zou, Quan
    BRIEFINGS IN BIOINFORMATICS, 2020, 21 (04) : 1196 - 1208
  • [2] Single-Cell RNA Sequencing Data Clustering by Low-Rank Subspace Ensemble Framework
    Wang, ChuanYuan
    Gao, Ying-Lian
    Liu, Jin-Xing
    Kong, Xiong-Zhen
    Zheng, Chun-Hou
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2022, 19 (02) : 1154 - 1164
  • [3] A Data-Driven Clustering Recommendation Method for Single-Cell RNA-Sequencing Data
    Tian, Yu
    Zheng, Ruiqing
    Liang, Zhenlan
    Li, Suning
    Wu, Fang-Xiang
    Li, Min
    TSINGHUA SCIENCE AND TECHNOLOGY, 2021, 26 (05) : 772 - 789
  • [4] Non-negative low-rank representation based on dictionary learning for single-cell RNA-sequencing data analysis
    Wang, Juan
    Zhang, Nana
    Yuan, Shasha
    Shang, Junliang
    Dai, Lingyun
    Li, Feng
    Liu, Jinxing
    BMC GENOMICS, 2022, 23 (01)
  • [5] Consensus Nature Inspired Clustering of Single-Cell RNA-Sequencing Data
    Abou El-Naga, Amany H.
    Sayed, Sabah
    Salah, Akram
    Mohsen, Heba
    IEEE ACCESS, 2022, 10 : 98079 - 98094
  • [6] PanoView: An iterative clustering method for single-cell RNA sequencing data
    Hu, Ming-Wen
    Kim, Dong Won
    Liu, Sheng
    Zack, Donald J.
    Blackshaw, Seth
    Qian, Jiang
    PLOS COMPUTATIONAL BIOLOGY, 2019, 15 (08)
  • [7] A HIERARCHICAL BAYESIAN MODEL FOR SINGLE-CELL CLUSTERING USING RNA-SEQUENCING DATA
    Liu, Yiyi
    Warren, Joshua L.
    Zhao, Hongyu
    ANNALS OF APPLIED STATISTICS, 2019, 13 (03) : 1733 - 1752
  • [8] Joint learning dimension reduction and clustering of single-cell RNA-sequencing data
    Wu, Wenming
    Ma, Xiaoke
    BIOINFORMATICS, 2020, 36 (12) : 3825 - 3832
  • [9] scGAAC: A graph attention autoencoder for clustering single-cell RNA-sequencing data
    Zhang, Lin
    Xiang, Haiping
    Wang, Feng
    Chen, Zepeng
    Shen, Mo
    Ma, Jiani
    Liu, Hui
    Zheng, Hongdang
    METHODS, 2024, 229 : 115 - 124
  • [10] One-step spectral clustering of weighted variables on single-cell RNA-sequencing data
    Park, Min Young
    Park, Seyoung
    KOREAN JOURNAL OF APPLIED STATISTICS, 2020, 33 (04) : 511 - 526