Real-time Outlier Detection over Streaming Data

Cited by: 7
Authors
Yu, Kangqing [1]
Shi, Wei [2]
Santoro, Nicola [1]
Ma, Xiangyu [2]
Affiliations
[1] Carleton Univ, Sch Comp Sci, Ottawa, ON, Canada
[2] Carleton Univ, Sch Informat Technol, Ottawa, ON, Canada
Keywords
outlier detections; streaming data; parallel processing; sliding-window; CUDA;
DOI
10.1109/SmartWorld-UIC-ATC-SCALCOM-IOP-SCI.2019.00063
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Designing outlier detection algorithms for streaming data involves several issues, such as concept drift, temporal context, transience, and uncertainty. Moreover, to produce results in real time with limited memory resources, such data must be processed in an online fashion. Real-time detection of outliers on streaming data therefore faces more challenges than the same task on batches of data. Several methods have been proposed to detect outliers over streaming data, among which the sliding-window technique is frequently used. In this technique, only a chunk of data is kept in memory at each point in time and used to build predictive models. The amount of data held in memory at once is referred to as the size of the sliding window. The correctness of the outlier detection results depends largely on the choice of window size. Other similar techniques exist, but most of them fail to address the properties of streaming data and thus produce results with poor accuracy. In this paper, we present an online outlier detection algorithm that addresses the aforementioned challenges. The proposed algorithm adopts the sliding-window technique but also efficiently mines, in memory, a statistical summary of previously observed data, which contributes to the prediction of incoming data. It further addresses the concept drift problem that exists in streaming data. We evaluated the accuracy of our algorithm on both synthetic and real-world datasets. Results show that the proposed method detects outliers over streaming data with higher accuracy than the SOD GPU algorithm proposed in [9], even when concept drift occurs. The algorithm does not require secondary memory for processing and is further accelerated on a GPU using CUDA.
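The sliding-window idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the z-score rule, the class name `SlidingWindowDetector`, and the parameters `window_size` and `threshold` are all assumptions chosen for illustration, and the running sums here summarize only the current window rather than all previously observed data.

```python
from collections import deque
import math


class SlidingWindowDetector:
    """Hypothetical detector: flags a value that lies more than `threshold`
    standard deviations from the mean of the current window."""

    def __init__(self, window_size, threshold=3.0):
        self.window = deque()
        self.window_size = window_size
        self.threshold = threshold
        self.total = 0.0     # running sum of the values in the window
        self.total_sq = 0.0  # running sum of their squares

    def update(self, x):
        """Score x against the current window, then slide the window."""
        is_outlier = False
        n = len(self.window)
        if n >= 2:
            mean = self.total / n
            # Population variance from the running sums, clamped at zero
            # to guard against floating-point round-off.
            var = max(self.total_sq / n - mean * mean, 0.0)
            std = math.sqrt(var)
            if std > 0:
                is_outlier = abs(x - mean) > self.threshold * std

        # Add the new value and evict the oldest once the window is full.
        self.window.append(x)
        self.total += x
        self.total_sq += x * x
        if len(self.window) > self.window_size:
            old = self.window.popleft()
            self.total -= old
            self.total_sq -= old * old
        return is_outlier


stream = [10.0, 10.4, 9.8, 10.1, 9.9, 50.0, 10.0]
detector = SlidingWindowDetector(window_size=5)
flags = [detector.update(x) for x in stream]
print(flags)
```

Only `window_size` values are ever held in memory, and each update costs constant time. Because the flagged value still enters the window in this sketch, it inflates the standard deviation for later points, which is one simple way to see why the choice of window size affects the correctness of the results.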
Pages: 125-132
Number of pages: 8