Real-time Outlier Detection over Streaming Data

Cited by: 7
Authors
Yu, Kangqing [1]
Shi, Wei [2]
Santoro, Nicola [1]
Ma, Xiangyu [2]
Affiliations
[1] Carleton Univ, Sch Comp Sci, Ottawa, ON, Canada
[2] Carleton Univ, Sch Informat Technol, Ottawa, ON, Canada
Keywords
outlier detections; streaming data; parallel processing; sliding-window; CUDA
DOI
10.1109/SmartWorld-UIC-ATC-SCALCOM-IOP-SCI.2019.00063
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Designing outlier detection algorithms for streaming data involves several issues, such as concept drift, temporal context, transience, and uncertainty. Moreover, to produce results in real time with limited memory resources, such data must be processed in an online fashion. Real-time detection of outliers in streaming data is therefore more challenging than performing the same task on batches of data. Several methods have been proposed to detect outliers over streaming data, among which the sliding window technique is frequently used. In this technique, only a chunk of the data is kept in memory at each point in time and used to build predictive models. The amount of data held in memory at any one time is referred to as the size of the sliding window. The correctness of the outlier detection results depends largely on the choice of window size. Other similar techniques exist, but most of them fail to address the properties of streaming data and thus produce results with poor accuracy. In this paper, we present an online outlier detection algorithm that addresses the aforementioned challenges. The proposed algorithm adopts the sliding window technique, but it also efficiently maintains in memory a statistical summary of previously observed data, which contributes to the prediction of incoming data. It further addresses the concept drift problem that exists in streaming data. We evaluated the accuracy of our algorithm on both synthetic and real-world datasets. The results show that the proposed method detects outliers over streaming data with higher accuracy than the SOD GPU algorithm proposed in [9], even when concept drift occurs. The algorithm does not require secondary memory for processing and is further accelerated on a GPU using CUDA.
Pages: 125-132
Number of pages: 8
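
The sliding window technique described in the abstract can be illustrated with a minimal Python sketch. This is not the algorithm proposed in the paper, which this record does not specify in detail. It is a generic z-score detector written under stated assumptions: the names `SlidingWindowDetector`, `detect`, `window_size`, and `threshold` are hypothetical, and the z-score rule is a stand-in for whatever predictive model a real detector would build from the window.

```python
from collections import deque
from statistics import mean, pstdev


class SlidingWindowDetector:
    """Illustrative z-score detector over a fixed-size sliding window.

    Assumption: a value is an outlier when it lies more than
    `threshold` standard deviations from the mean of the window.
    """

    def __init__(self, window_size=5, threshold=3.0):
        # Only the most recent `window_size` values are kept in memory.
        self.window = deque(maxlen=window_size)
        self.threshold = threshold

    def update(self, x):
        """Score x against the current window, then slide the window."""
        is_outlier = False
        # No decision is made until the window has filled up.
        if len(self.window) == self.window.maxlen:
            mu = mean(self.window)
            sigma = pstdev(self.window)
            if sigma > 0:
                is_outlier = abs(x - mu) > self.threshold * sigma
            else:
                # A constant window: any different value is an outlier.
                is_outlier = x != mu
        # Every value enters the window, so the window follows gradual
        # drift, at the cost of letting outliers distort later statistics.
        self.window.append(x)
        return is_outlier


def detect(stream, window_size=5, threshold=3.0):
    """Return one boolean outlier flag per element of the stream."""
    detector = SlidingWindowDetector(window_size, threshold)
    return [detector.update(x) for x in stream]


if __name__ == "__main__":
    print(detect([10.0, 10.1, 9.9, 10.2, 9.8, 50.0, 10.0]))
    # prints [False, False, False, False, False, True, False]
```

The sketch makes the window-size dependence noted in the abstract visible: a small window adapts quickly to drift but gives noisy statistics, while a large window is more stable but slower to follow changes in the data.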