Secuer: Ultrafast, scalable and accurate clustering of single-cell RNA-seq data

Cited by: 5
Authors
Wei, Nana [1]
Nie, Yating [1]
Liu, Lin [2]
Zheng, Xiaoqi [3, 4]
Wu, Hua-Jun [5, 6]
Affiliations
[1] Shanghai Normal Univ, Dept Math, Shanghai, Peoples R China
[2] Shanghai Jiao Tong Univ, SJTU Yale Joint Ctr Biostat & Data Sci, CMA Shanghai, Inst Nat Sci,MOE LSC,Sch Math Sci, Shanghai, Peoples R China
[3] Shanghai Artificial Intelligence Lab, Shanghai, Peoples R China
[4] Shanghai Jiao Tong Univ, Ctr Single Cell Omics, Sch Publ Hlth, Sch Med, Shanghai, Peoples R China
[5] Peking Univ Hlth Sci Ctr, Ctr Precis Med Multiom Res, Sch Basic Med Sci, Beijing, Peoples R China
[6] Peking Univ Canc Hosp & Inst, Beijing, Peoples R China
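Funding
National Natural Science Foundation of China; Natural Science Foundation of Shanghai; National Key Research and Development Program of China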
Keywords
HETEROGENEITY; TRANSCRIPTOMES; FATE
DOI
10.1371/journal.pcbi.1010753
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Identifying cell clusters is a critical step in single-cell transcriptomics studies. Despite the numerous clustering tools developed recently, the rapid growth of scRNA-seq data volumes calls for a more computationally efficient clustering method. Here, we introduce Secuer, a Scalable and Efficient speCtral clUstERing algorithm for scRNA-seq data. By employing an anchor-based bipartite graph representation algorithm, Secuer reduces runtime and memory usage by more than one order of magnitude for datasets with more than 1 million cells. Meanwhile, Secuer achieves accuracy better than or comparable to that of competing methods on small and moderate benchmark datasets. Furthermore, we show that Secuer can serve as a building block for a new consensus clustering method, Secuer-consensus, which again improves on the runtime and scalability of state-of-the-art consensus clustering methods while maintaining accuracy. Overall, Secuer is a versatile, accurate, and scalable clustering framework suitable for small to ultra-large single-cell clustering tasks.

Author summary
Recently, single-cell RNA sequencing (scRNA-seq) has enabled profiling of thousands to millions of cells, spurring the development of efficient clustering algorithms for large or ultra-large datasets. In this work, we developed an ultrafast clustering method, Secuer, for small to ultra-large scRNA-seq data. Using simulated and real datasets, we demonstrated that Secuer yields high accuracy while reducing runtime and memory usage by orders of magnitude, and that it can be efficiently scaled up to ultra-large datasets. Additionally, with Secuer as a subroutine, we proposed Secuer-consensus, a consensus clustering algorithm. Our results show that Secuer-consensus performs better in terms of clustering accuracy and runtime.
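To make the phrase "anchor-based bipartite graph representation" concrete, the following is a minimal NumPy sketch of generic anchor-based spectral clustering. It is not the Secuer implementation: the function names, the random choice of anchors, the Gaussian edge weights, and all default parameters are assumptions made for illustration only. The idea it shows is that each of the n cells is connected to only a few of p anchors (p much smaller than n), so the spectral embedding comes from an n x p matrix rather than an n x n cell-by-cell graph.

```python
import numpy as np


def kmeans(points, k, rng, n_iter=50):
    """Plain Lloyd's k-means with farthest-point initialization (illustrative helper)."""
    centers = [points[rng.integers(len(points))]]
    for _ in range(1, k):
        # Squared distance from every point to its closest already-chosen center.
        dist = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(d, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels


def anchor_spectral_clustering(X, n_clusters, n_anchors=30, k_nearest=5, seed=0):
    """Cluster rows of X through a cell-to-anchor bipartite graph (sketch only)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # 1. Anchors: a random subset of cells (assumption; other choices are possible).
    anchors = X[rng.choice(n, size=n_anchors, replace=False)]
    # 2. Bipartite affinity B (n x p): each cell links to its k nearest anchors.
    dist2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(dist2, axis=1)[:, :k_nearest]
    rows = np.arange(n)[:, None]
    sigma2 = dist2[rows, nearest].mean() + 1e-12
    B = np.zeros((n, n_anchors))
    B[rows, nearest] = np.exp(-dist2[rows, nearest] / (2.0 * sigma2))
    # 3. Degree normalization: B_hat = D_cells^{-1/2} B D_anchors^{-1/2}.
    deg_cells = B.sum(axis=1)
    deg_anchors = B.sum(axis=0) + 1e-12
    B_hat = B / np.sqrt(deg_cells)[:, None] / np.sqrt(deg_anchors)[None, :]
    # 4. SVD of the thin n x p matrix; top left singular vectors embed the cells.
    U, _, _ = np.linalg.svd(B_hat, full_matrices=False)
    emb = U[:, :n_clusters]
    emb = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12)
    # 5. k-means on the low-dimensional embedding.
    return kmeans(emb, n_clusters, rng)


# Toy usage: two well-separated synthetic "cell populations" in 2-D.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(100.0, 1.0, (50, 2))])
labels = anchor_spectral_clustering(X, n_clusters=2)
```

The cost of step 4 depends on the number of anchors rather than on the square of the number of cells, which is the general reason anchor-based methods scale to large datasets. For simplicity, this sketch still computes a dense distance matrix in step 2.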
Pages: 20