MWFP-outlier: Maximal weighted frequent-pattern-based approach for detecting outliers from uncertain weighted data streams

Cited by: 9
Authors
Cai, Saihua [1, 2]
Li, Li [3 ]
Chen, Jinfu [1 ]
Zhao, Kaiyi [3 ]
Yuan, Gang [3 ]
Sun, Ruizhi [3 ]
Huang, Longxia [1, 4]
Sosu, Rexford Nii Ayitey [1 ]
Affiliations
[1] Jiangsu Univ, Sch Comp Sci & Commun Engn, Zhenjiang 212013, Jiangsu, Peoples R China
[2] Jiangsu Univ, Jiangsu Key Lab Secur Technol Ind Cyberspace, Zhenjiang 212013, Jiangsu, Peoples R China
[3] China Agr Univ, Coll Informat & Elect Engn, Beijing 100083, Peoples R China
[4] Ghana Commun Technol Univ, Fac Comp & Informat Syst, Accra, Ghana
Funding
National Key Research and Development Program of China;
Keywords
Outlier detection; Maximal weighted frequent patterns; Uncertain weighted data streams; Deviation indices; Data mining; DETECTION STRATEGY; ANOMALY DETECTION;
DOI
10.1016/j.ins.2022.01.028
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Many outlier detection approaches have been proposed for identifying previously unknown outliers, thereby improving the credibility of data. However, previous approaches have two problems. First, most were designed for static, precise datasets, so their detection accuracy is very low when processing uncertain data streams. Second, they treat every pattern as equally important (that is, as having the same weight), which does not accurately reflect some real-life situations. To solve these problems, we propose an efficient maximal weighted frequent-pattern-based outlier detection approach, called MWFP-Outlier, which accurately detects potential outliers from uncertain data streams in two phases: a pattern mining phase and an outlier detection phase. In the pattern mining phase, by fully considering the existential probability and weight of each pattern, we propose the MWFP-Mine approach, which accurately and efficiently mines maximal weighted frequent patterns using a purpose-designed tree structure, list structure, and pruning strategies. In the outlier detection phase, we design four deviation indices to accurately measure the deviation degree of each transaction, and the top-k ranked transactions are then identified as potential outliers. Extensive experimental results demonstrate that the MWFP-Outlier approach accurately detects outliers from uncertain weighted data streams while consuming less time. © 2022 Elsevier Inc. All rights reserved.
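The two phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' MWFP-Mine or MWFP-Outlier implementation: the abstract does not define the tree structure, list structure, pruning strategies, or the four deviation indices, so this sketch assumes independent item probabilities, uses the average item weight as the pattern weight, enumerates patterns by brute force, and substitutes a single hypothetical coverage score for the deviation indices. All function names and the sample data are illustrative assumptions.

```python
from itertools import combinations

# Assumed data layout: each transaction maps item -> existential probability.
transactions = [
    {"a": 0.9, "b": 0.8, "c": 0.7},
    {"a": 0.8, "b": 0.9},
    {"a": 0.9, "b": 0.7, "c": 0.8},
    {"d": 0.6},
]
# Assumed per-item weights (importance).
weights = {"a": 0.9, "b": 0.8, "c": 0.6, "d": 0.3}


def pattern_probability(pattern, transaction):
    """Probability that every item of the pattern exists in the transaction,
    assuming item probabilities are independent (an assumption of this sketch)."""
    p = 1.0
    for item in pattern:
        p *= transaction.get(item, 0.0)
    return p


def expected_weighted_support(pattern, transactions, weights):
    """Average item weight times expected support (assumed definition)."""
    weight = sum(weights[i] for i in pattern) / len(pattern)
    expected = sum(pattern_probability(pattern, t) for t in transactions)
    return weight * expected


def mine_maximal_patterns(transactions, weights, min_support):
    """Brute-force stand-in for the mining phase: enumerate every itemset,
    keep the frequent ones, then keep those with no frequent proper superset."""
    items = sorted({i for t in transactions for i in t})
    frequent = [
        frozenset(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
        if expected_weighted_support(combo, transactions, weights) >= min_support
    ]
    return [p for p in frequent if not any(p < q for q in frequent)]


def top_k_outliers(transactions, weights, min_support, k):
    """Hypothetical deviation measure: a transaction that is poorly covered by
    the maximal weighted frequent patterns is ranked as more deviant."""
    maximal = mine_maximal_patterns(transactions, weights, min_support)
    coverage = [
        sum(
            expected_weighted_support(p, transactions, weights)
            * pattern_probability(p, t)
            for p in maximal
        )
        for t in transactions
    ]
    ranked = sorted(range(len(transactions)), key=lambda i: coverage[i])
    return ranked[:k]


if __name__ == "__main__":
    print(mine_maximal_patterns(transactions, weights, min_support=1.0))
    print(top_k_outliers(transactions, weights, min_support=1.0, k=1))
```

Exhaustive enumeration is exponential in the number of items and is used here only to keep the example short; it also sidesteps the fact that a weighted support measure of this kind need not be anti-monotone, which is one reason dedicated pruning strategies are needed in practice.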
Pages: 195-225
Page count: 31
Related papers
41 in total
  • [1] WMFP-Outlier: An Efficient Maximal Frequent-Pattern-Based Outlier Detection Approach for Weighted Data Streams
    Cai, Saihua
    Li, Qian
    Li, Sicong
    Yuan, Gang
    Sun, Ruizhi
    INFORMATION TECHNOLOGY AND CONTROL, 2019, 48 (04): : 505 - 521
  • [2] UWFP-Outlier: an efficient frequent-pattern-based outlier detection method for uncertain weighted data streams
    Cai, Saihua
    Li, Li
    Li, Qian
    Li, Sicong
    Hao, Shangbo
    Sun, Ruizhi
    APPLIED INTELLIGENCE, 2020, 50 (10) : 3452 - 3470
  • [3] UWFP-Outlier: an efficient frequent-pattern-based outlier detection method for uncertain weighted data streams
    Saihua Cai
    Li Li
    Qian Li
    Sicong Li
    Shangbo Hao
    Ruizhi Sun
    Applied Intelligence, 2020, 50 : 3452 - 3470
  • [4] An efficient approach for outlier detection from uncertain data streams based on maximal frequent patterns
    Cai, Saihua
    Li, Li
    Li, Sicong
    Sun, Ruizhi
    Yuan, Gang
    EXPERT SYSTEMS WITH APPLICATIONS, 2020, 160
  • [5] Sliding window based weighted maximal frequent pattern mining over data streams
    Lee, Gangin
    Yun, Unil
    Ryu, Keun Ho
    EXPERT SYSTEMS WITH APPLICATIONS, 2014, 41 (02) : 694 - 708
  • [6] Mining Weighted Frequent Patterns from Uncertain Data Streams
    Ovi, Jesan Ahammed
    Ahmed, Chowdhury Farhan
    Leung, Carson K.
    Pazdor, Adam G. M.
    PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON UBIQUITOUS INFORMATION MANAGEMENT AND COMMUNICATION (IMCOM) 2019, 2019, 935 : 917 - 936
  • [7] A Novel Weighted Frequent Pattern-Based Outlier Detection Method Applied to Data Stream
    Yuan, Gang
    Cai, Saihua
    Hao, Shangbo
    2019 IEEE 4TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYSIS (ICCCBDA), 2019, : 503 - 510
  • [8] Time-weighted counting for recently frequent pattern mining in data streams
    Yongsub Lim
    U. Kang
    Knowledge and Information Systems, 2017, 53 : 391 - 422
  • [9] Time-weighted counting for recently frequent pattern mining in data streams
    Lim, Yongsub
    Kang, U.
    KNOWLEDGE AND INFORMATION SYSTEMS, 2017, 53 (02) : 391 - 422
  • [10] An Efficient Algorithm for Sliding Window-Based Weighted Frequent Pattern Mining over Data Streams
    Ahmed, Chowdhury Farhan
    Tanbeer, Syed Khairuzzaman
    Jeong, Byeong-Soo
    Lee, Young-Koo
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2009, E92D (07): : 1369 - 1381