Secuer: Ultrafast, scalable and accurate clustering of single-cell RNA-seq data

Cited by: 5
Authors
Wei, Nana [1]
Nie, Yating [1]
Liu, Lin [2]
Zheng, Xiaoqi [3,4]
Wu, Hua-Jun [5,6]
Affiliations
[1] Shanghai Normal Univ, Dept Math, Shanghai, Peoples R China
[2] Shanghai Jiao Tong Univ, SJTU Yale Joint Ctr Biostat & Data Sci, CMA Shanghai, Inst Nat Sci, MOE LSC, Sch Math Sci, Shanghai, Peoples R China
[3] Shanghai Artificial Intelligence Lab, Shanghai, Peoples R China
[4] Shanghai Jiao Tong Univ, Ctr Single Cell Omics, Sch Publ Hlth, Sch Med, Shanghai, Peoples R China
[5] Peking Univ Hlth Sci Ctr, Ctr Precis Med Multiom Res, Sch Basic Med Sci, Beijing, Peoples R China
[6] Peking Univ Canc Hosp & Inst, Beijing, Peoples R China
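Funding
National Natural Science Foundation of China; Natural Science Foundation of Shanghai; National Key Research and Development Program of China
Keywords
HETEROGENEITY; TRANSCRIPTOMES; FATE
DOI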
10.1371/journal.pcbi.1010753
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Identifying cell clusters is a critical step in single-cell transcriptomics studies. Despite the numerous clustering tools developed recently, the rapid growth of scRNA-seq data volumes calls for a more computationally efficient clustering method. Here, we introduce Secuer, a Scalable and Efficient speCtral clUstERing algorithm for scRNA-seq data. By employing an anchor-based bipartite graph representation algorithm, Secuer reduces runtime and memory usage by more than one order of magnitude for datasets with more than 1 million cells. Meanwhile, Secuer achieves accuracy better than or comparable to that of competing methods on small and moderate benchmark datasets. Furthermore, we show that Secuer can serve as a building block for a new consensus clustering method, Secuer-consensus, which again improves on the runtime and scalability of state-of-the-art consensus clustering methods while maintaining accuracy. Overall, Secuer is a versatile, accurate, and scalable clustering framework suitable for small to ultra-large single-cell clustering tasks.
Author summary
Recently, single-cell RNA sequencing (scRNA-seq) has enabled the profiling of thousands to millions of cells, spurring the development of efficient clustering algorithms for large and ultra-large datasets. In this work, we developed an ultrafast clustering method, Secuer, for small to ultra-large scRNA-seq data. Using simulated and real datasets, we demonstrated that Secuer yields high accuracy while reducing runtime and memory usage by orders of magnitude, and that it scales efficiently to ultra-large datasets. Additionally, with Secuer as a subroutine, we proposed Secuer-consensus, a consensus clustering algorithm. Our results show that Secuer-consensus performs better in terms of clustering accuracy and runtime.
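The general idea behind anchor-based bipartite graph spectral clustering, which the abstract names as the source of Secuer's speed, can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function names, the random choice of anchors, the Gaussian edge weights, the dense matrices, and the default parameter values are all assumptions made for illustration. The point it shows is that the eigendecomposition is done on a small anchors-by-anchors matrix rather than on a cells-by-cells matrix.

```python
import numpy as np


def farthest_point_kmeans(X, k, n_iter=20):
    """Lloyd's k-means with a deterministic farthest-point initialisation."""
    centers = [X[0]]
    for _ in range(k - 1):
        # Squared distance from every point to its closest chosen center.
        d = np.min([((X - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(0)
    return labels


def anchor_spectral_clustering(X, n_clusters, n_anchors=40, n_neighbors=7, seed=0):
    """Hypothetical sketch: cluster rows of X through a cell-anchor bipartite graph."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # 1. Anchors: a random subset of cells (assumption; other schemes exist).
    anchors = X[rng.choice(n, size=n_anchors, replace=False)]

    # 2. Bipartite graph: link each cell to its nearest anchors with
    #    Gaussian weights. B is dense here only to keep the sketch short.
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)
    nn = np.argsort(d2, axis=1)[:, :n_neighbors]
    rows = np.arange(n)[:, None]
    nn_d2 = d2[rows, nn]
    sigma2 = max(np.sqrt(nn_d2).mean() ** 2, 1e-12)
    B = np.zeros_like(d2)
    B[rows, nn] = np.exp(-nn_d2 / (2.0 * sigma2))

    # 3. Degree-normalise both sides of the bipartite graph.
    d_cell = np.maximum(B.sum(1), 1e-12)
    d_anchor = np.maximum(B.sum(0), 1e-12)
    Z = B / np.sqrt(d_cell)[:, None] / np.sqrt(d_anchor)[None, :]

    # 4. Eigendecompose the small (anchors x anchors) matrix, then map the
    #    leading eigenvectors back to cells (left singular vectors of Z).
    evals, evecs = np.linalg.eigh(Z.T @ Z)
    top = np.argsort(evals)[::-1][:n_clusters]
    sing = np.sqrt(np.maximum(evals[top], 1e-12))
    U = Z @ evecs[:, top] / sing

    # 5. Row-normalise the embedding and run k-means on it.
    U /= np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    return farthest_point_kmeans(U, n_clusters)
```

Calling `anchor_spectral_clustering(X, n_clusters=3)` on a cells-by-features array returns one integer label per cell. In this sketch the number of clusters must be supplied by the caller, and a practical version would store the bipartite graph as a sparse matrix and use approximate nearest-neighbour search instead of the full distance matrix.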
Pages: 20