The Application of a Double CUSUM Algorithm in Industrial Data Stream Anomaly Detection

Cited by: 9
Authors
Li, Guang [1 ,2 ]
Wang, Jie [1 ]
Liang, Jing [1 ]
Yue, Caitong [1 ]
Affiliations
[1] Zhengzhou Univ, Sch Elect Engn, Zhengzhou 450001, Henan, Peoples R China
[2] China Elect Technol Grp Corp, Res Inst 22, Xinxiang 453003, Peoples R China
Source
SYMMETRY-BASEL | 2018 / Vol. 10 / Issue 07
Funding
National Natural Science Foundation of China;
Keywords
concept drift; machine learning; anomaly detection; nested sliding window; data stream;
DOI
10.3390/sym10070264
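Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code
07; 0710; 09;
Abstract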
The performance of machine learning on data streams is affected by concept drift, drift deviation, and noise interference. This paper proposes a data stream anomaly detection algorithm that combines control chart and sliding window methods. The algorithm is named DCUSUM-DS (Double CUSUM Based on Data Stream) because it uses a cumulative sum of a double mean value. DCUSUM-DS is built on nested sliding windows to address concept drift: it averages the data within the window twice, extracts new features, and then computes the cumulative sum and control chart so that interference points do not mislead the detector. The algorithm is evaluated in simulation on industrial data from drilling engineering. Compared with automatic outlier detection for data streams (A-ODDS) and with sliding nest window chart anomaly detection based on data streams (SNWCAD-DS), DCUSUM-DS can account for concept drift and shield a small amount of interference that deviates from the overall data. Although the running time increased from 0.1 s to 0.19 s, the classification accuracy, measured by the receiver operating characteristic (ROC), increased from 0.89 to 0.95. This meets the needs of oil drilling industry data streams with a sampling frequency of 1 Hz while improving classification accuracy.
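The idea summarized in the abstract (average twice over nested sliding windows, then run a cumulative sum against a control limit) can be illustrated with a minimal Python sketch. This is not the authors' DCUSUM-DS implementation: the abstract gives no formulas, so the function names (`double_mean`, `cusum_alarms`, `detect`), the window sizes, and the `target`, `slack`, and `threshold` parameters are all assumptions chosen for illustration, and the CUSUM shown is the standard two-sided tabular form.

```python
from collections import deque
from statistics import mean


def double_mean(stream, inner=5, outer=5):
    """Yield the mean of the last `outer` inner-window means.

    Assumed feature extraction: a nested sliding window that averages twice.
    """
    inner_win = deque(maxlen=inner)
    outer_win = deque(maxlen=outer)
    for x in stream:
        inner_win.append(x)
        if len(inner_win) < inner:
            continue  # inner window not yet full
        outer_win.append(mean(inner_win))
        if len(outer_win) < outer:
            continue  # outer window not yet full
        yield mean(outer_win)


def cusum_alarms(features, target, slack, threshold):
    """Standard two-sided tabular CUSUM; return indices that raise an alarm."""
    hi = lo = 0.0
    alarms = []
    for i, f in enumerate(features):
        hi = max(0.0, hi + (f - target - slack))  # upward drift statistic
        lo = max(0.0, lo + (target - f - slack))  # downward drift statistic
        if hi > threshold or lo > threshold:
            alarms.append(i)
            hi = lo = 0.0  # restart accumulation after an alarm
    return alarms


def detect(stream, inner=5, outer=5, target=0.0, slack=0.5, threshold=4.0):
    """Run CUSUM on the double-mean features (all parameters are hypothetical)."""
    features = double_mean(stream, inner, outer)
    return cusum_alarms(features, target, slack, threshold)
```

With these assumed parameters, a single interference point of height 5 in an otherwise flat stream triggers a plain CUSUM on the raw samples but not `detect`, because the double averaging spreads the spike below the slack-adjusted limit. A sustained shift to 2 still raises alarms, since the smoothed feature stays above the slack long enough for the sum to cross the threshold.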
Pages: 14