FlowGrid enables fast clustering of very large single-cell RNA-seq data

Cited by: 4
Authors
Fang, Xiunan [1]
Ho, Joshua W. K. [1,2]
Affiliations
[1] Univ Hong Kong, Li Ka Shing Fac Med, Sch Biomed Sci, Hong Kong, Peoples R China
[2] Lab Data Discovery Hlth Ltd D24H, Hong Kong Sci Pk, Hong Kong, Peoples R China
DOI
10.1093/bioinformatics/btab521
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Scalable clustering algorithms are needed to analyze millions of cells in single-cell RNA-seq (scRNA-seq) data. Results: Here, we present an open-source Python package called FlowGrid that integrates into the Scanpy workflow to perform clustering on very large scRNA-seq datasets. FlowGrid implements a fast density-based clustering algorithm originally designed for flow cytometry data analysis. We introduce a new automated parameter tuning procedure and show that FlowGrid achieves clustering accuracy comparable to that of state-of-the-art clustering algorithms, at a substantially reduced run time for very large scRNA-seq datasets. For example, a clustering task on one million cells that otherwise takes about one hour can be completed by FlowGrid in about five minutes.
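To illustrate the general idea behind grid-based density clustering mentioned in the abstract, the following is a minimal sketch, not FlowGrid's actual implementation or API. It assumes low-dimensional input (for example, a few principal components), and the function name `grid_density_cluster` and the parameters `n_bins` and `min_count` are hypothetical names chosen for this example. Points are assigned to equal-width grid bins, bins holding at least `min_count` points are treated as dense, adjacent dense bins are merged into clusters, and points in sparse bins are labeled `-1`.

```python
from collections import deque
from itertools import product

import numpy as np


def grid_density_cluster(points, n_bins=4, min_count=3):
    """Illustrative grid-based density clustering (hypothetical helper).

    Returns one integer label per point; -1 marks points in sparse bins.
    """
    points = np.asarray(points, dtype=float)

    # Rescale each dimension to [0, 1] and map it to a bin index in [0, n_bins - 1].
    lo = points.min(axis=0)
    span = points.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant dimensions
    bins = np.minimum(((points - lo) / span * n_bins).astype(int), n_bins - 1)
    keys = [tuple(row) for row in bins.tolist()]

    # Count how many points fall into each occupied bin.
    counts = {}
    for key in keys:
        counts[key] = counts.get(key, 0) + 1

    # A bin is "dense" if it holds at least min_count points.
    dense = {key for key, count in counts.items() if count >= min_count}

    # Neighbor offsets: every bin that differs by at most 1 in each dimension.
    # Note: this grows as 3**d, so the sketch only suits low-dimensional data.
    offsets = [o for o in product((-1, 0, 1), repeat=points.shape[1]) if any(o)]

    # Breadth-first search merges adjacent dense bins into one cluster.
    bin_label = {}
    next_label = 0
    for start in sorted(dense):
        if start in bin_label:
            continue
        bin_label[start] = next_label
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for offset in offsets:
                neighbor = tuple(c + o for c, o in zip(current, offset))
                if neighbor in dense and neighbor not in bin_label:
                    bin_label[neighbor] = next_label
                    queue.append(neighbor)
        next_label += 1

    # Each point inherits the label of its bin; sparse bins map to -1.
    return np.array([bin_label.get(key, -1) for key in keys])
```

Because the per-point work is a single bin lookup and the merging step runs over occupied bins rather than over cells, approaches of this kind tend to scale well with the number of cells, which is consistent with the run-time advantage the abstract reports.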
Pages: 282-283
Page count: 2
Related papers
  • [31] Comparison of transformations for single-cell RNA-seq data
    Constantin Ahlmann-Eltze
    Wolfgang Huber
    Nature Methods, 2023, 20 : 665 - 672
  • [32] An Efficient and Flexible Method for Deconvoluting Bulk RNA-Seq Data with Single-Cell RNA-Seq Data
    Sun, Xifang
    Sun, Shiquan
    Yang, Sheng
    CELLS, 2019, 8 (10)
  • [33] Comparison of transformations for single-cell RNA-seq data
    Ahlmann-Eltze, Constantin
    Huber, Wolfgang
    NATURE METHODS, 2023, 20 (05) : 665 - +
  • [34] scSemiAAE: a semi-supervised clustering model for single-cell RNA-seq data
    Zile Wang
    Haiyun Wang
    Jianping Zhao
    Chunhou Zheng
    BMC Bioinformatics, 24
  • [35] Impact of data preprocessing on cell-type clustering based on single-cell RNA-seq data
    Wang, Chunxiang
    Gao, Xin
    Liu, Juntao
    BMC BIOINFORMATICS, 2020, 21 (01)
  • [36] Impact of data preprocessing on cell-type clustering based on single-cell RNA-seq data
    Chunxiang Wang
    Xin Gao
    Juntao Liu
    BMC Bioinformatics, 21
  • [37] CIDR: Ultrafast and accurate clustering through imputation for single-cell RNA-seq data
    Lin, Peijie
    Troup, Michael
    Ho, Joshua W. K.
    GENOME BIOLOGY, 2017, 18
  • [38] Deep learning enables accurate clustering with batch effect removal in single-cell RNA-seq analysis
    Li, Xiangjie
    Wang, Kui
    Lyu, Yafei
    Pan, Huize
    Zhang, Jingxiao
    Stambolian, Dwight
    Susztak, Katalin
    Reilly, Muredach P.
    Hu, Gang
    Li, Mingyao
    NATURE COMMUNICATIONS, 2020, 11 (01)
  • [39] CIDR: Ultrafast and accurate clustering through imputation for single-cell RNA-seq data
    Peijie Lin
    Michael Troup
    Joshua W. K. Ho
    Genome Biology, 18
  • [40] scAIDE: clustering of large-scale single-cell RNA-seq data reveals putative and rare cell types
    Xie, Kaikun
    Huang, Yu
    Zeng, Feng
    Liu, Zehua
    Chen, Ting
    NAR GENOMICS AND BIOINFORMATICS, 2020, 2 (04)