SCCLRR: A Robust Computational Method for Accurate Clustering Single Cell RNA-Seq Data

Cited by: 21
Authors
Zhang, Wei [1]
Li, Yuanyuan [2]
Zou, Xiufen [3]
Affiliations
[1] East China Jiaotong Univ Nanchang, Sch Sci, Nanchang 330013, Jiangxi, Peoples R China
[2] Wuhan Inst Technol Wuhan, Sch Math & Phys, Wuhan 430072, Peoples R China
[3] Wuhan Univ, Sch Math & Stat, Wuhan 430072, Peoples R China
Keywords
Optimization; Euclidean distance; Clustering methods; Mathematical model; Informatics; Clustering algorithms; Simulation; scRNA-seq data; clustering; mathematical model; low rank representation; optimization; GENE-EXPRESSION; IDENTIFICATION; HETEROGENEITY; IMPUTATION;
DOI
10.1109/JBHI.2020.2991172
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Single-cell RNA transcriptome data present a tremendous opportunity for studying cellular heterogeneity. Identifying subpopulations from scRNA-seq data has become an active research topic in recent years. Although many researchers have focused on designing computational methods for identifying new cell types, the performance of these methods remains unsatisfactory because of the high dimensionality, sparsity, and noise of scRNA-seq data. In this study, we propose a new cell type detection method, named SCCLRR, that learns a robust and accurate similarity matrix. The method simultaneously captures both global and local intrinsic properties of the data within a mathematical model based on the low rank representation (LRR) framework. In the local regularization term, the normalized Euclidean distance and the cosine similarity are integrated to balance the intrinsic linear and nonlinear manifold structure of the data. To solve the non-convex optimization model, we present an iterative optimization procedure using the alternating direction method of multipliers (ADMM) algorithm. We evaluate the performance of SCCLRR on nine real scRNA-seq datasets and compare it with seven state-of-the-art methods. The results show that SCCLRR outperforms the other methods and is robust and effective for clustering scRNA-seq data. (The code of SCCLRR is freely available for academic use at https://github.com/wzhangwhu/SCCLRR.)
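The idea of integrating a normalized Euclidean distance with cosine similarity can be illustrated with a minimal sketch. This is not the SCCLRR implementation: the function name `blended_similarity`, the max-based distance normalization, and the mixing weight `alpha` are assumptions made for illustration only, and the abstract does not specify how the two measures are actually combined.

```python
import numpy as np

def blended_similarity(X, alpha=0.5):
    """Blend two cell-to-cell similarity measures (illustrative only).

    X     : array of shape (cells, genes)
    alpha : hypothetical mixing weight between the two measures
    """
    X = np.asarray(X, dtype=float)

    # Pairwise Euclidean distances, clamped at zero against round-off.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    d = np.sqrt(d2)

    # Assumed normalization: divide by the largest distance, then
    # turn the distance into a similarity in [0, 1].
    d_max = d.max()
    euclid_sim = 1.0 - (d / d_max if d_max > 0 else d)

    # Cosine similarity; all-zero rows are left as zero vectors.
    norms = np.linalg.norm(X, axis=1)
    norms[norms == 0] = 1.0
    Xn = X / norms[:, None]
    cos_sim = Xn @ Xn.T

    return alpha * euclid_sim + (1.0 - alpha) * cos_sim

# Toy expression matrix: cells 0 and 1 point in the same direction,
# cell 2 is orthogonal to both.
X_toy = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 1.0]])
S_toy = blended_similarity(X_toy)
```

In this toy case the first two cells receive a higher blended similarity than the first and third. Such a matrix could be passed to a standard spectral clustering routine, whereas the method described in the abstract learns its similarity matrix through the LRR model solved with ADMM.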
Pages: 247-256
Page count: 10