Low-dimensional tracking of association structures in categorical data

Cited by: 4
Authors
D'Enza, Alfonso Iodice [1]
Markos, Angelos [2]
Affiliations
[1] Univ Cassino & Lazio Merid, Dept Econ & Law, Cassino, Italy
[2] Democritus Univ Thrace, Dept Primary Educ, Alexandroupolis, Greece
Keywords
Singular value decomposition; Correspondence analysis; Incremental methods; Dimensionality reduction; Visualization; Efficient; Algorithm
DOI
10.1007/s11222-014-9470-4
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
In modern applications such as text mining and signal processing, large amounts of categorical data are produced at a high rate, and their association structures change over time. Multiple correspondence analysis (MCA) is a well-established dimension reduction method for exploring the associations within a set of categorical variables. A critical step of the MCA algorithm is a singular value decomposition (SVD) or an eigenvalue decomposition (EVD) of a suitably transformed matrix. The high computational and memory requirements of ordinary SVD and EVD make them impractical for massive or sequential data sets. Several enhanced SVD/EVD approaches have recently been introduced to overcome these issues. The aim of the present contribution is twofold: (1) to extend MCA to a split-apply-combine framework, which leads to an exact and parallel MCA implementation; (2) to allow for incremental updates (downdates) of existing MCA solutions, which lead to an approximate yet highly accurate solution. For this purpose, two incremental EVD and SVD approaches with desirable properties are revised and embedded in the context of MCA.
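As a minimal illustration of why a split-apply-combine scheme can reproduce the full-data result exactly, the sketch below accumulates the Burt matrix (the cross-product of the indicator matrix) chunk by chunk and then eigen-decomposes its standardized residuals. This is not necessarily the procedure used in the paper: the Burt-matrix route, the helper names `indicator` and `burt_eigen`, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def indicator(codes, n_levels):
    """One-hot encode integer-coded categorical columns into an indicator matrix.

    n_levels is fixed up front so that every chunk yields the same number of
    columns, even if a chunk happens to miss a category.
    """
    blocks = [np.eye(k)[codes[:, j]] for j, k in enumerate(n_levels)]
    return np.hstack(blocks)

def burt_eigen(burt):
    """Eigen-decompose the standardized residuals of a Burt matrix."""
    P = burt / burt.sum()                  # correspondence matrix
    r = P.sum(axis=1)                      # category masses (must be > 0)
    S = (P - np.outer(r, r)) / np.sqrt(np.outer(r, r))
    evals, evecs = np.linalg.eigh(S)       # S is symmetric
    order = np.argsort(evals)[::-1]        # sort eigenvalues in descending order
    return evals[order], evecs[:, order]

# Simulated data (hypothetical): 600 observations on 3 categorical variables.
rng = np.random.default_rng(0)
n_levels = [3, 4, 2]
codes = np.column_stack([rng.integers(0, k, size=600) for k in n_levels])

# Reference: Burt matrix computed from the full indicator matrix at once.
Z = indicator(codes, n_levels)
evals_full, _ = burt_eigen(Z.T @ Z)

# Split-apply-combine: each chunk contributes its own cross-product,
# and the contributions are simply summed.
burt = np.zeros((sum(n_levels), sum(n_levels)))
for chunk in np.array_split(codes, 5):
    Zc = indicator(chunk, n_levels)
    burt += Zc.T @ Zc
evals_split, _ = burt_eigen(burt)

print(np.allclose(evals_full, evals_split))
```

Because the cross-product of the indicator matrix is a sum over rows, the chunk-wise contributions can be computed independently (and in parallel) and then added, so only the small category-by-category matrix has to be held in memory for the decomposition.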
Pages: 1009-1022
Number of pages: 14