Big Data Analysis with Signal Processing on Graphs

被引:567
作者
Sandryhaila, Aliaksei [1 ]
Moura, Jose M. F. [1 ]
机构
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
关键词
TUKEY-TYPE ALGORITHMS; DIMENSIONALITY REDUCTION; TRANSFORMS; DECOMPOSITIONS; EIGENMAPS;
D O I
10.1109/MSP.2014.2329213
中图分类号
TM [电工技术]; TN [电子技术、通信技术];
学科分类号
0808 ; 0809 ;
摘要
Analysis and processing of very large data sets, or big data, poses a significant challenge. Massive data sets are collected and studied in numerous domains, from engineering sciences to social networks, biomolecular research, commerce, and security. Extracting valuable information from big data requires innovative approaches that efficiently process large amounts of data as well as handle and, moreover, utilize their structure. This article discusses a paradigm for large-scale data analysis based on the discrete signal processing (DSP) on graphs (DSPG). DSPG extends signal processing concepts and methodologies from the classical signal processing theory to data indexed by general graphs. Big data analysis presents several challenges to DSPG, in particular, in filtering and frequency analysis of very large data sets. We review fundamental concepts of DSPG, including graph signals and graph filters, graph Fourier transform, graph frequency, and spectrum ordering, and compare them with their counterparts from the classical signal processing theory. We then consider product graphs as a graph model that helps extend the application of DSPG methods to large data sets through efficient implementation based on parallelization and vectorization. We relate the presented framework to existing methods for large-scale data processing and illustrate it with an application to data compression. © 2014 IEEE.
引用
收藏
页码:80 / 90
页数:11
相关论文
共 45 条
[1]  
Acar E., 2009, NSF WORKSH ARL VA FE
[2]   Unsupervised Multiway Data Analysis: A Literature Survey [J].
Acar, Evrim ;
Yener, Buelent .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2009, 21 (01) :6-20
[3]   Structure-seeking multilinear methods for the analysis of MRI data [J].
Andersen, AH ;
Rayens, WS .
NEUROIMAGE, 2004, 22 (02) :728-739
[4]  
[Anonymous], LINEAR ALGEBRA LARGE
[5]  
[Anonymous], WAVELET TOUR SIGNAL
[6]  
[Anonymous], 1977, DISCRETE TIME SIGNAL
[7]   Laplacian eigenmaps for dimensionality reduction and data representation [J].
Belkin, M ;
Niyogi, P .
NEURAL COMPUTATION, 2003, 15 (06) :1373-1396
[8]   AN ALGORITHM FOR MACHINE CALCULATION OF COMPLEX FOURIER SERIES [J].
COOLEY, JW ;
TUKEY, JW .
MATHEMATICS OF COMPUTATION, 1965, 19 (90) :297-&
[9]   Blind identification of underdetermined mixtures by simultaneous matrix diagonalization [J].
De lathauwer, Lieven ;
Castaing, Josephine .
IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2008, 56 (03) :1096-1105
[10]   Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data [J].
Donoho, DL ;
Grimes, C .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2003, 100 (10) :5591-5596