Truncated Robust Principal Component Analysis and Noise Reduction for Single Cell RNA Sequencing Data

Cited by: 7
Authors
Gogolewski, Krzysztof [1]
Sykulski, Maciej [2, 3]
Chung, Neo Christopher [1]
Gambin, Anna [1]
Affiliations
[1] Univ Warsaw, Inst Informat, Fac Math Informat & Mech, Banacha 2, PL-02097 Warsaw, Poland
[2] Warsaw Med Univ, Dept Med Genet, Warsaw, Poland
[3] GenXone Inc, Res & Dev Lab, Poznan, Poland
Keywords
matrix decomposition; principal component analysis; robust PCA; single cell RNA-seq; truncated singular value decomposition; unsupervised learning; GENE-EXPRESSION; DECOMPOSITION
DOI
10.1089/cmb.2018.0255
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The development of single cell RNA sequencing (scRNA-seq) has enabled innovative approaches to investigating mRNA abundances. In our study, we are interested in extracting the systematic patterns of scRNA-seq data in an unsupervised manner; thus, we have developed two extensions of robust principal component analysis (RPCA). First, we present a truncated version of RPCA (tRPCA), which is much faster and more memory efficient. Second, we introduce noise reduction in tRPCA with L2 regularization. Unlike RPCA, which considers only a low-rank matrix L and a sparse matrix S, the proposed method can also extract a noise matrix E inherent in modern genomic data. We demonstrate its usefulness by applying our methods to peripheral blood mononuclear cell scRNA-seq data. In particular, clustering the low-rank matrix L yields better classification of unlabeled single cells. Overall, the proposed variants are well suited for high-dimensional and noisy data that are routinely generated in genomics.
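The decomposition described in the abstract, a data matrix M split into a low-rank part L, a sparse part S, and a noise part E, can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a generic alternating scheme that assumes the relaxed objective 0.5*||M - L - S||_F^2 + lam_L*||L||_* + lam_S*||S||_1, with L restricted to rank at most k. The function names and the parameters `k`, `lam_L`, `lam_S`, and `n_iter` are hypothetical choices made for illustration only.

```python
import numpy as np

def soft_threshold(X, tau):
    # Entrywise shrinkage: proximal operator of tau * L1 norm.
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def truncated_svt(X, tau, k):
    # Singular value thresholding restricted to the top k components.
    # For brevity this computes a full SVD and keeps k components;
    # a partial SVD routine would be needed for real speed gains.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_k = np.maximum(s[:k] - tau, 0.0)
    return (U[:, :k] * s_k) @ Vt[:k]

def trpca_noise_sketch(M, k, lam_S, lam_L=1.0, n_iter=100):
    # Illustrative alternating minimization (assumed, not from the paper).
    M = np.asarray(M, dtype=float)
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(n_iter):
        L = truncated_svt(M - S, lam_L, k)   # low-rank update
        S = soft_threshold(M - L, lam_S)     # sparse update
    E = M - L - S                            # residual treated as noise
    return L, S, E
```

Under these assumptions, the squared Frobenius penalty on the residual plays the role of the L2 regularization on the noise matrix, and the rank cap `k` stands in for the truncation. A call such as `L, S, E = trpca_noise_sketch(M, k=10, lam_S=0.5)` returns three matrices that sum to `M`; the values of `k` and `lam_S` here are arbitrary.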
Pages: 782-793
Number of pages: 12