Truncated Robust Principal Component Analysis and Noise Reduction for Single Cell RNA Sequencing Data

Cited by: 7
Authors
Gogolewski, Krzysztof [1]
Sykulski, Maciej [2, 3]
Chung, Neo Christopher [1]
Gambin, Anna [1]
Affiliations
[1] Univ Warsaw, Inst Informat, Fac Math Informat & Mech, Banacha 2, PL-02097 Warsaw, Poland
[2] Warsaw Med Univ, Dept Med Genet, Warsaw, Poland
[3] GenXone Inc, Res & Dev Lab, Poznan, Poland
Keywords
matrix decomposition; principal component analysis; robust PCA; single cell RNA-seq; truncated singular value decomposition; unsupervised learning; GENE-EXPRESSION; DECOMPOSITION
DOI
10.1089/cmb.2018.0255
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
The development of single cell RNA sequencing (scRNA-seq) has enabled innovative approaches to investigating mRNA abundances. In our study, we are interested in extracting the systematic patterns of scRNA-seq data in an unsupervised manner; thus, we have developed two extensions of robust principal component analysis (RPCA). First, we present a truncated version of RPCA (tRPCA), which is much faster and more memory efficient. Second, we introduce noise reduction in tRPCA with L2 regularization. Unlike RPCA, which considers only a low-rank matrix L and a sparse matrix S, the proposed method can also extract a noise matrix E inherent in modern genomic data. We demonstrate the usefulness of our methods by applying them to peripheral blood mononuclear cell scRNA-seq data. In particular, clustering the low-rank matrix L shows better classification of unlabeled single cells. Overall, the proposed variants are well suited for high-dimensional and noisy data that are routinely generated in genomics.
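The three-part decomposition described in the abstract (low-rank L, sparse S, noise E) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a penalized objective of the form ||L||_* + lam * ||S||_1 + (mu / 2) * ||M - L - S||_F^2 with rank(L) <= rank, solved by simple alternating updates. The function names, the default `lam`, and the parameters `mu` and `n_iter` are hypothetical choices made for illustration. A full SVD is computed and then cut to `rank` for brevity; a practical implementation would likely use a partial SVD routine instead.

```python
import numpy as np


def soft_threshold(x, tau):
    """Elementwise shrinkage toward zero by tau (proximal map of the L1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def truncated_svt(x, rank, tau):
    """Keep the top `rank` singular values of x and shrink each by tau."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    s = np.maximum(s[:rank] - tau, 0.0)
    return (u[:, :rank] * s) @ vt[:rank]


def decompose(m, rank, lam=None, mu=1.0, n_iter=100):
    """Split m into low-rank l, sparse s, and dense residual e (m = l + s + e).

    Assumed objective (illustrative, not taken from the paper):
        ||l||_* + lam * ||s||_1 + (mu / 2) * ||m - l - s||_F^2,  rank(l) <= rank
    """
    m = np.asarray(m, dtype=float)
    if lam is None:
        # Common RPCA default; an assumption here, not a value from the source.
        lam = 1.0 / np.sqrt(max(m.shape))
    s = np.zeros_like(m)
    for _ in range(n_iter):
        l = truncated_svt(m - s, rank, 1.0 / mu)   # low-rank update
        s = soft_threshold(m - l, lam / mu)        # sparse update
    e = m - l - s                                  # what the L2 term absorbs
    return l, s, e
```

Under these assumptions, the quadratic (L2) term is what allows a nonzero residual E: entries of `m - l` smaller in magnitude than `lam / mu` stay in E rather than being forced into S.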
Pages: 782-793
Number of pages: 12