Graph Laplacian for Heterogeneous Data Clustering in Sensor-Based Internet of Things

Times cited: 0
Authors
Singh, Vishal Krishna [1]
Tripathi, Gaurav [2]
Ojha, Aman [2, 3]
Bhardwaj, Rajat [4]
Raza, Haider [5]
Affiliations
[1] Indian Inst Informat Technol, Head Wireless Commun & Analyt Res Lab, Lucknow 226002, Uttar Pradesh, India
[2] Indian Inst Informat Technol, Wireless Commun & Analyt Res Lab, Lucknow 226002, Uttar Pradesh, India
[3] Intel Corp, Bengaluru 560103, Karnataka, India
[4] H&M AI, Lucknow 226002, Uttar Pradesh, India
[5] Essex Univ, Wivenhoe Pk, Colchester CO4 3SQ, England
Keywords
Heterogeneous data; Internet of things; Machine learning; Spectral clustering; Sensor-Based IoT; CATEGORICAL-DATA; ALGORITHM
DOI
10.1080/03772063.2023.2173673
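Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract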
Traditional clustering algorithms are not suited to the heterogeneous data of the Sensor-Based Internet of Things. In such applications, the accuracy of real-time data processing is further compromised by noise and missing values in the data. To address the need for accurate clustering, this work proposes a graph Laplacian-based heterogeneous data clustering method. Exploiting the correlation structure of the data, weight graphs are used to generate a graph Laplacian matrix from which correlated data points are obtained. Eigenvalues are then used to obtain accurate, distance-based clusters. The proposed algorithm is validated on five real-world data sets and outperforms most of the existing algorithms. A detailed mathematical analysis, followed by extensive simulation on real-world data sets, demonstrates the effectiveness of the proposed method: the performance gap with respect to state-of-the-art methods, in terms of accuracy and purity, is as high as 30%.
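The pipeline outlined in the abstract (weight graph, graph Laplacian, spectral decomposition, distance-based clusters) resembles textbook spectral clustering. The following is a minimal sketch of that generic technique, not the authors' algorithm: it assumes purely numeric features, a Gaussian similarity kernel, the unnormalized Laplacian L = D - W, and a small deterministic k-means on the leading eigenvectors. The function name `spectral_clusters` and the parameters `sigma` and `n_iter` are hypothetical, and handling of categorical attributes, noise, and missing values is not covered.

```python
import numpy as np


def spectral_clusters(X, k, sigma=1.0, n_iter=50):
    """Generic spectral clustering sketch (assumed setup, not the paper's method)."""
    X = np.asarray(X, dtype=float)

    # Weight graph: Gaussian similarity between every pair of points.
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops

    # Unnormalized graph Laplacian L = D - W, with D the diagonal degree matrix.
    L = np.diag(W.sum(axis=1)) - W

    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(L)
    U = eigvecs[:, :k]  # spectral embedding: one row per data point

    # Deterministic farthest-point initialisation for k-means on the embedding.
    centers = [U[0]]
    for _ in range(1, k):
        d = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)

    # Plain Lloyd iterations: assign to nearest centre, then recompute centres.
    for _ in range(n_iter):
        dist = np.sum((U[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
        labels = np.argmin(dist, axis=1)
        new_centers = np.array([
            U[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    return labels, eigvals


if __name__ == "__main__":
    # Two well-separated toy groups of sensor-like readings (made-up values).
    data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
            [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]]
    labels, eigvals = spectral_clusters(data, k=2)
    print(labels)
```

In this sketch the number of near-zero eigenvalues of L indicates how many weakly connected groups the weight graph contains, while the corresponding eigenvectors supply the coordinates in which distance-based clustering is carried out.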
Pages: 2615-2627
Page count: 13
Related papers
19 in total
[1]  
Adamyan L., 2020, NAT SUSTAIN, V2, P169, DOI 10.1007/s42521-020-00017-z
[2]  
Basbug B., 2015, ARXIV
[3]   MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM [J].
DEMPSTER, AP ;
LAIRD, NM ;
RUBIN, DB .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01) :1-38
[4]   Mixed-order spectral clustering for complex networks [J].
Ge, Yan ;
Peng, Pan ;
Lu, Haiping .
PATTERN RECOGNITION, 2021, 117
[5]   Clustering categorical data: an approach based on dynamical systems [J].
Gibson, D ;
Kleinberg, J ;
Raghavan, P .
VLDB JOURNAL, 2000, 8 (3-4) :222-236
[6]   Rock: A robust clustering algorithm for categorical attributes [J].
Guha, S ;
Rastogi, R ;
Shim, K .
INFORMATION SYSTEMS, 2000, 25 (05) :345-366
[7]   Extensions to the k-means algorithm for clustering large data sets with categorical values [J].
Huang, ZX .
DATA MINING AND KNOWLEDGE DISCOVERY, 1998, 2 (03) :283-304
[8]   Approximate Graph Laplacians for Multimodal Data Clustering [J].
Khan, Aparajita ;
Maji, Pradipta .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (03) :798-813
[9]  
Kumar Prakash, 2009, International Journal of Rapid Manufacturing, V1, P189, DOI 10.1504/IJRAPIDM.2009.029382
[10]   Fast spectral clustering method based on graph similarity matrix completion [J].
Ma, Xu ;
Zhang, Shengen ;
Pena-Pena, Karelia ;
Arce, Gonzalo R. .
SIGNAL PROCESSING, 2021, 189