Real Time Principal Component Analysis

Cited by: 0
Authors
Chowdhury, Ranak Roy [1]
Adnan, Muhammad Abdullah [1]
Gupta, Rajesh K. [2]
Affiliations
[1] BUET, Dhaka, Bangladesh
[2] Univ Calif San Diego, San Diego, CA 92103 USA
Source
2019 IEEE 35TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2019) | 2019
Keywords
Big Data; Real Time; Dimensionality Reduction; PCA;
DOI
10.1109/ICDE.2019.00171
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
By processing data in motion, real-time data processing lets us extract instantaneous results from online input data, which ensures timely responsiveness to events and a much greater capacity to process large data sets. This is especially important when decision loops include querying and processing data on the web, where size and latency considerations make it impossible to process raw data in real time. This makes dimensionality reduction techniques, such as principal component analysis (PCA), an important data preprocessing tool for gaining insight into data. In this paper, we propose a variant of PCA that is suited to real-time applications. In the real-time version of the PCA problem, we maintain a window over the most recent data and project every incoming row of data into a lower-dimensional subspace, which we generate as the output of the model. The goal is to minimize the reconstruction error of the output from the input. We use the reconstruction error as the termination criterion for updating the eigenspace as new data arrives. To verify whether our proposed model can capture the essence of the changing distribution of large datasets in real time, we have implemented the algorithm and evaluated its performance against carefully designed simulations that change the distributions of data sources over time in a controllable manner. Furthermore, we have demonstrated that our algorithm can capture the changing distributions of real-life datasets by running simulations on datasets from a variety of real-time applications, e.g., localization and customer expenditure. We propose algorithmic enhancements that rely upon spectral analysis to improve dimensionality reduction. Results show that our method can successfully capture the changing distribution of data in a real-time scenario, thus enabling real-time PCA.
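The windowed scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not specify how the eigenspace is updated, so the sketch simply refits a batch SVD on the current window whenever the reconstruction error of an incoming row exceeds a threshold. The class name `WindowedPCA`, its parameters (`n_components`, `window_size`, `error_threshold`), and the synthetic two-phase data stream are all hypothetical and chosen only for illustration; the spectral-analysis enhancements mentioned in the abstract are not covered.

```python
from collections import deque

import numpy as np


class WindowedPCA:
    """Illustrative sliding-window PCA (hypothetical, not the paper's method).

    Keeps the most recent rows, projects each incoming row onto the current
    subspace, and refits when the row's reconstruction error is too large.
    """

    def __init__(self, n_components, window_size, error_threshold):
        self.k = n_components
        self.window_size = window_size
        self.error_threshold = error_threshold
        self.window = deque(maxlen=window_size)  # oldest rows drop out
        self.mean = None
        self.components = None  # shape (k, d) once fitted
        self.n_refits = 0

    def _refit(self):
        # Assumption: a full batch SVD on the window stands in for the
        # eigenspace update, which the abstract leaves unspecified.
        X = np.array(list(self.window), dtype=float)
        self.mean = X.mean(axis=0)
        _, _, vt = np.linalg.svd(X - self.mean, full_matrices=False)
        self.components = vt[: self.k]
        self.n_refits += 1

    def _project(self, row):
        return (row - self.mean) @ self.components.T

    def _error(self, row, z):
        # Squared distance between the row and its reconstruction.
        recon = z @ self.components + self.mean
        return float(np.sum((row - recon) ** 2))

    def update(self, row):
        """Add one row; return its projection, or None while warming up."""
        row = np.asarray(row, dtype=float)
        self.window.append(row)
        if self.components is None:
            if len(self.window) < self.window_size:
                return None
            self._refit()
        z = self._project(row)
        if self._error(row, z) > self.error_threshold:
            self._refit()
            z = self._project(row)
        return z


# Synthetic stream: points on the x-axis, then on the y-axis (a distribution shift).
rng = np.random.default_rng(0)
model = WindowedPCA(n_components=1, window_size=20, error_threshold=1e-6)
for direction in (np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])):
    for t in rng.normal(size=40):
        model.update(t * direction)
```

In this sketch the subspace is left untouched while incoming rows are reconstructed well, and it is recomputed only after the stream drifts, which is one simple way to read "reconstruction error as the criterion to update the eigenspace". An incremental eigen-update would avoid the cost of a full SVD on each refit.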
Pages: 1678-1681
Number of pages: 4