Clustering single-cell RNA-seq data by rank constrained similarity learning

Cited by: 10
Authors
Mei, Qinglin [1, 2]
Li, Guojun [1, 2, 3]
Su, Zhengchang [4]
Affiliations
[1] Shandong Univ, Res Ctr Math & Interdisciplinary Sci, Jinan 250100, Peoples R China
[2] Shandong Univ, Sch Math, Jinan 250100, Peoples R China
[3] Liaocheng Univ, Sch Math Sci, Liaocheng 252000, Shandong, Peoples R China
[4] Univ North Carolina Charlotte, Dept Bioinformat & Genom, Charlotte, NC 28223 USA
Funding
US National Science Foundation;
Keywords
GENE-EXPRESSION; REVEALS; TRANSCRIPTOME; HETEROGENEITY; EMBRYOS; NUMBER; FATE;
DOI
10.1093/bioinformatics/btab276
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Recent breakthroughs in single-cell RNA sequencing (scRNA-seq) technologies offer an exciting opportunity to identify heterogeneous cell types in complex tissues. However, the unavoidable biological noise and technical artifacts in scRNA-seq data, as well as the high dimensionality of expression vectors, make the problem highly challenging. Consequently, although numerous tools have been developed, their accuracy remains to be improved.
Results: Here, we introduce a novel clustering algorithm and tool, RCSL (Rank Constrained Similarity Learning), to accurately identify various cell types using scRNA-seq data from a complex tissue. RCSL considers both local similarity and global similarity among the cells to discern the subtle differences among cells of the same type as well as larger differences among cells of different types. RCSL uses Spearman's rank correlations of a cell's expression vector with those of other cells to measure its global similarity, and adaptively learns the neighbor representation of a cell as its local similarity. The overall similarity of a cell to other cells is a linear combination of its global similarity and local similarity. RCSL automatically estimates the number of cell types defined in the similarity matrix, and identifies them by constructing a block-diagonal matrix such that its distance to the similarity matrix is minimized. Each block-diagonal submatrix is a cell cluster/type, corresponding to a connected component in the cognate similarity graph. When tested on 16 benchmark scRNA-seq datasets in which the cell types are well-annotated, RCSL substantially outperformed six state-of-the-art methods in accuracy and robustness as measured by three metrics.
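The pipeline described in the abstract (global Spearman similarity, a local neighbor similarity, a linear combination of the two, and clusters read off as connected components) can be illustrated with a minimal Python sketch. This is not the RCSL implementation: RCSL learns the neighbor representation adaptively and obtains the block-diagonal matrix by a rank-constrained optimization, whereas the sketch below substitutes a fixed k-nearest-neighbor graph and a simple threshold. The function names and the parameters `k`, `alpha`, and `threshold` are assumptions made for illustration only.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components
from scipy.stats import rankdata


def spearman_similarity(X):
    """Global similarity: Spearman correlation between cells (rows of X)."""
    ranks = np.apply_along_axis(rankdata, 1, X)  # rank genes within each cell
    return np.corrcoef(ranks)                    # Pearson on ranks = Spearman


def knn_local_similarity(X, k):
    """Local similarity stand-in: symmetric k-nearest-neighbor adjacency.

    Assumption: a fixed Euclidean kNN graph replaces the adaptively learned
    neighbor representation described in the abstract.
    """
    n = X.shape[0]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)               # a cell is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(A, A.T)                    # symmetrize


def cluster_cells(X, k=2, alpha=0.5, threshold=0.5):
    """Combine both similarities and label connected components.

    Assumption: thresholding replaces the rank-constrained step that makes
    the learned matrix block-diagonal.
    """
    S = alpha * spearman_similarity(X) + (1.0 - alpha) * knn_local_similarity(X, k)
    graph = csr_matrix((S > threshold).astype(int))
    n_clusters, labels = connected_components(graph, directed=False)
    return n_clusters, labels
```

Given a cells-by-genes matrix `X`, `cluster_cells(X)` returns the number of connected components and one integer label per cell. In this simplified form the number of clusters falls out of the thresholded graph rather than being estimated from the similarity matrix as the abstract describes.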
Pages: 3235-3242
Page count: 8
Related papers
50 in total
  • [1] A Global Similarity Learning for Clustering of Single-Cell RNA-Seq Data
    Zhu, Xiaoshu
    Guo, Lilu
    Xu, Yunpei
    Li, Hong-Dong
    Liao, Xingyu
    Wu, Fang-Xiang
    Peng, Xiaoqing
    2019 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2019, : 261 - 266
  • [2] Impact of similarity metrics on single-cell RNA-seq data clustering
    Kim, Taiyun
    Chen, Irene Rui
    Lin, Yingxin
    Wang, Andy Yi-Yang
    Yang, Jean Yee Hwa
    Yang, Pengyi
    BRIEFINGS IN BIOINFORMATICS, 2019, 20 (06) : 2316 - 2326
  • [3] Consensus clustering of single-cell RNA-seq data by enhancing network affinity
    Cui, Yaxuan
    Zhang, Shaoqiang
    Liang, Ying
    Wang, Xiangyun
    Ferraro, Thomas N.
    Chen, Yong
    BRIEFINGS IN BIOINFORMATICS, 2021, 22 (06)
  • [4] Single-cell RNA-seq data clustering: A survey with performance comparison study
    Li, Ruiyi
    Guan, Jihong
    Zhou, Shuigeng
    JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2020, 18 (04)
  • [5] Single-cell RNA-seq clustering: datasets, models, and algorithms
    Peng, Lihong
    Tian, Xiongfei
    Tian, Geng
    Xu, Junlin
    Huang, Xin
    Weng, Yanbin
    Yang, Jialiang
    Zhou, Liqian
    RNA BIOLOGY, 2020, 17 (06) : 765 - 783
  • [6] ScGSLC: An unsupervised graph similarity learning framework for single-cell RNA-seq data clustering
    Li, Junyi
    Jiang, Wei
    Han, Henry
    Liu, Jing
    Liu, Bo
    Wang, Yadong
    COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2021, 90
  • [7] Challenges in unsupervised clustering of single-cell RNA-seq data
    Kiselev, Vladimir Yu
    Andrews, Tallulah S.
    Hemberg, Martin
    NATURE REVIEWS GENETICS, 2019, 20 (05) : 273 - 282
  • [8] FEATS: feature selection-based clustering of single-cell RNA-seq data
    Vans, Edwin
    Patil, Ashwini
    Sharma, Alok
    BRIEFINGS IN BIOINFORMATICS, 2021, 22 (04)
  • [9] SC3: consensus clustering of single-cell RNA-seq data
    Kiselev, Vladimir Yu
    Kirschner, Kristina
    Schaub, Michael T.
    Andrews, Tallulah
    Yiu, Andrew
    Chandra, Tamir
    Natarajan, Kedar N.
    Reik, Wolf
    Barahona, Mauricio
    Green, Anthony R.
    Hemberg, Martin
    NATURE METHODS, 2017, 14 (05) : 483 - +
  • [10] Comparison of Gene Selection Methods for Clustering Single-cell RNA-seq Data
    Zhu, Xiaoshu
    Wang, Jianxin
    Li, Rongruan
    Peng, Xiaoqing
    CURRENT BIOINFORMATICS, 2023, 18 (01) : 1 - 11