Anomaly detection method for sensor network data streams based on sliding window sampling and optimized clustering

Citations: 12
Authors
Lin, Ling [1, 2]
Su, Jinshan [1]
Affiliations
[1] Yili Normal Univ, Elect & Informat Engn Coll, Yining 835000, Xinjiang, Peoples R China
[2] Nanjing Univ, Collaborat Innovat Ctr Novel Software Technol & I, State Key Lab Novel Software Technol, Nanjing 210025, Jiangsu, Peoples R China
Keywords
Data stream sampling; Dimension cluster; Maximum entropy principle; Clustering; Anomaly detection
DOI
10.1016/j.ssci.2019.04.047
Chinese Library Classification (CLC) code
T [Industrial Technology]
Discipline classification code
08
Abstract
When detecting abnormal data in a sensor network data stream, the source of the abnormal data must be identified accurately. Traditional data stream clustering algorithms lose a large amount of clustering information and have low accuracy. This paper therefore proposes a sensor network data stream anomaly detection method based on optimized clustering. First, the proposed sampling algorithm samples the data stream, and the sampling result serves as the sample set. A dynamic data histogram then divides the data dimensions into dimension groups, the maximum-entropy partition of each dimension is computed to form dimension-space clusters, and data belonging to the same dimension cluster are aggregated into micro-clusters. Anomalies in the data stream are detected by comparing the information entropy of the micro-clusters and their distribution characteristics. Experimental results show that the proposed algorithm improves the accuracy and effectiveness of data stream anomaly detection.
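The entropy comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm. It assumes a plain sliding window in place of the proposed sampling step, fixed-width bins in place of the dynamic histogram, and a simple entropy-jump rule in place of the micro-cluster comparison. The names `shannon_entropy` and `flag_windows` and the parameters `window`, `jump`, and `bin_width` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, deque


def shannon_entropy(values, bin_width=1.0):
    """Shannon entropy (in bits) of values bucketed into fixed-width bins.

    Assumption: fixed-width bins stand in for the dynamic histogram
    mentioned in the abstract.
    """
    if not values:
        return 0.0
    counts = Counter(math.floor(v / bin_width) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def flag_windows(stream, window=4, jump=0.7, bin_width=1.0):
    """Return stream indices at which the window entropy changes by more than `jump`.

    Assumption: a change in entropy between consecutive full windows is
    used as a crude anomaly signal; the paper compares micro-cluster
    entropies and distributions instead.
    """
    buf = deque(maxlen=window)  # keeps only the most recent `window` readings
    prev = None
    flagged = []
    for i, x in enumerate(stream):
        buf.append(x)
        if len(buf) < window:
            continue  # wait until the first full window is available
        h = shannon_entropy(list(buf), bin_width)
        if prev is not None and abs(h - prev) > jump:
            flagged.append(i)
        prev = h
    return flagged


# Hypothetical readings: a steady sensor followed by an outlying value.
readings = [5.0] * 6 + [0.5, 9.5, 3.5]
flagged = flag_windows(readings)
```

A constant stream keeps the window entropy at zero, so nothing is flagged. The first outlying reading raises the entropy of its window sharply, and the index of that reading is reported.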
Pages: 70-75
Page count: 6