The idm Package: Incremental Decomposition Methods in R

Cited by: 6
Authors
D'Enza, Alfonso Iodice [1, 2]
Markos, Angelos [3]
Buttarazzi, Davide [1, 2]
Affiliations
[1] Univ Cassino, Cassino, Italy
[2] Univ Cassino & Lazio Meridionale, Dept Econ & Law, I-03043 Cassino, Italy
[3] Democritus Univ Thrace, Dept Primary Educ, Alexandroupolis 68100, Greece
Source
JOURNAL OF STATISTICAL SOFTWARE | 2018 / Vol. 86 / Issue CN4
Keywords
singular value decomposition; dimensionality reduction; principal component analysis; correspondence analysis; PRINCIPAL-COMPONENTS-ANALYSIS; COVARIANCE; MODELS; PCA;
DOI
10.18637/jss.v086.c04
Chinese Library Classification
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
In modern applications, large amounts of data are produced at a high rate and are characterized by relationship structures that change over time. Principal component analysis (PCA) and multiple correspondence analysis (MCA) are well-established dimension reduction methods for exploring relationships within a set of variables. A critical step of the PCA and MCA algorithms is a singular value decomposition (SVD) or an eigenvalue decomposition (EVD) of a suitably transformed matrix. The high computational and memory requirements of ordinary SVD and EVD make their application impractical on massive or sequential data sets. A series of incremental SVD/EVD approaches is available to address these issues. The idm R package, which implements two efficient incremental SVD approaches, is introduced. The procedures in question share desirable properties that ease their embedding in PCA and MCA. The package also provides functions for producing animated visualizations of the obtained solutions. A comparison of online MCA implementations in terms of accuracy is also included.
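To make the idea of decomposing sequential data concrete, the following is a minimal sketch of a chunk-wise computation. It is not the algorithm implemented in idm, which relies on incremental SVD updates. Instead, it assumes a simpler substitute technique: merging per-chunk summaries (count, mean, scatter matrix) and then extracting the leading principal component by power iteration on the resulting covariance matrix. All function names and the toy data are hypothetical.

```python
# Sketch only: chunk-wise covariance update followed by power iteration.
# NOT the idm package's method; every name below is hypothetical.

def chunk_stats(rows):
    """Return (n, mean, scatter) for a list of equal-length numeric rows."""
    n = len(rows)
    p = len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(p)]
    scatter = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows)
                for j in range(p)] for i in range(p)]
    return n, mean, scatter


def merge_stats(a, b):
    """Combine two (n, mean, scatter) summaries without revisiting raw rows."""
    na, ma, sa = a
    nb, mb, sb = b
    n = na + nb
    p = len(ma)
    delta = [mb[j] - ma[j] for j in range(p)]
    mean = [ma[j] + delta[j] * nb / n for j in range(p)]
    w = na * nb / n  # weight of the between-chunk correction term
    scatter = [[sa[i][j] + sb[i][j] + w * delta[i] * delta[j]
                for j in range(p)] for i in range(p)]
    return n, mean, scatter


def leading_component(scatter, n, iters=500):
    """Power iteration on the sample covariance matrix scatter / (n - 1)."""
    p = len(scatter)
    cov = [[scatter[i][j] / (n - 1) for j in range(p)] for i in range(p)]
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the eigenvalue estimate.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return eigenvalue, v


# Toy data arriving in two chunks of three rows each (made-up values).
data = [[2.0, 1.0], [3.0, 4.0], [5.0, 4.0],
        [6.0, 7.0], [8.0, 9.0], [9.0, 8.0]]
batch = chunk_stats(data)
online = merge_stats(chunk_stats(data[:3]), chunk_stats(data[3:]))
eigenvalue, direction = leading_component(online[2], online[0])
```

Here `online` is built from the two chunk summaries alone and should match `batch` up to floating-point error, so the raw rows of an earlier chunk need not be kept in memory. Incremental SVD approaches such as those the abstract describes pursue the same goal by updating the decomposition itself rather than the covariance matrix.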
Pages: 1 / 24
Number of pages: 24
Related papers
43 in total
  • [1] [Anonymous], 2010, BIPLOTS PRACTICE
  • [2] [Anonymous], 2013, ADV NEURAL INFORM PR
  • [3] Arora R, 2012, ANN ALLERTON CONF, P861, DOI 10.1109/Allerton.2012.6483308
  • [4] Augmented implicitly restarted Lanczos bidiagonalization methods
    Baglama, J
    Reichel, L
    [J]. SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2005, 27 (01) : 19 - 42
  • [5] Low-Rank Incremental methods for computing dominant singular subspaces
    Baker, C. G.
    Gallivan, K. A.
    Van Dooren, P.
    [J]. LINEAR ALGEBRA AND ITS APPLICATIONS, 2012, 436 (08) : 2866 - 2888
  • [6] Principal component analysis in sensory analysis: covariance or correlation matrix?
    Borgognone, MG
    Bussi, J
    Hough, G
    [J]. FOOD QUALITY AND PREFERENCE, 2001, 12 (5-7) : 323 - 326
  • [7] Online Principal Component Analysis in High Dimension: Which Algorithm to Choose?
    Cardot, Herve
    Degras, David
    [J]. INTERNATIONAL STATISTICAL REVIEW, 2018, 86 (01) : 29 - 50
  • [8] Low-dimensional tracking of association structures in categorical data
    D'Enza, Alfonso Iodice
    Markos, Angelos
    [J]. STATISTICS AND COMPUTING, 2015, 25 (05) : 1009 - 1022
  • [9] Gentry J., 2015, TWITTER R BASED TWIT
  • [10] High-content screening: A new approach to easing key bottlenecks in the drug discovery process
    Giuliano, KA
    DeBiasio, RL
    Dunlay, RT
    Gough, A
    Volosky, JM
    Zock, J
    Pavlakis, GN
    Taylor, DL
    [J]. JOURNAL OF BIOMOLECULAR SCREENING, 1997, 2 (04) : 249 - 259