Minimal weighted infrequent itemset mining-based outlier detection approach on uncertain data stream

Cited by: 0
Authors
Saihua Cai
Ruizhi Sun
Shangbo Hao
Sicong Li
Gang Yuan
Affiliations
[1] China Agricultural University, College of Information and Electrical Engineering
[2] Ministry of Agriculture, Key Laboratory of Agricultural Information Acquisition Technology
Source
Neural Computing and Applications | 2020 / Vol. 32
Keywords
Minimal infrequent itemset mining; Outlier detection; Uncertain weighted data stream; Deviation index
DOI
Not available
CLC classification number
Subject classification number
Abstract
Outliers are a critical factor affecting the accuracy of data-based predictions and other data-driven processing; they must therefore be detected effectively and as early as possible to improve the credibility of the data. In recent years, numerous outlier detection approaches have been proposed for static and precise data; however, this prior work did not consider the uncertainty and weight information of each item. Moreover, traditional outlier detection approaches use only the deviation degree of each data element as the criterion for determining outliers, so the detected outliers do not fit the definition of an outlier (i.e., data that appear rarely and differ from most of the other data). To address these problems, this paper proposes a minimal weighted infrequent itemset mining-based outlier detection approach for uncertain data streams, called MWIFIM–OD–UDS. It detects implicit outliers effectively by taking into account the low occurrence frequency, the uncertainty, and the weight of each itemset, as well as the characteristics of the data stream. Specifically, a matrix structure-based approach called MWIFIM–UDS is first proposed to mine the minimal weighted infrequent itemsets (MWiFIs) from an uncertain data stream; the MWIFIM–OD–UDS method then detects outliers based on the mined MWiFIs and the designed deviation indexes. Experimental results show that MWIFIM–OD–UDS outperforms the frequent itemset mining-based outlier detection methods FindFPOF and LFP in both runtime and detection accuracy.
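
The notion of a minimal weighted infrequent itemset over uncertain data can be illustrated with a small brute-force sketch. It is not the matrix-based MWIFIM–UDS algorithm from the paper, and the abstract does not define the support measure, so the following are assumptions made only for illustration: each transaction maps items to existence probabilities, the weighted expected support of an itemset is its mean item weight times the sum of the products of its items' probabilities, and the names `weighted_expected_support`, `minimal_infrequent_itemsets`, and `min_support` are hypothetical. The deviation indexes are not modeled.

```python
from itertools import combinations
from math import prod


def weighted_expected_support(itemset, transactions, weights):
    """Assumed measure: mean item weight times summed existence probability."""
    mean_weight = sum(weights[i] for i in itemset) / len(itemset)
    expected = sum(
        prod(t[i] for i in itemset)          # probability that all items co-occur
        for t in transactions
        if all(i in t for i in itemset)      # transactions missing an item contribute 0
    )
    return mean_weight * expected


def minimal_infrequent_itemsets(transactions, weights, min_support):
    """Brute force: keep infrequent itemsets whose proper subsets are all frequent."""
    items = sorted({i for t in transactions for i in t})

    def infrequent(itemset):
        return weighted_expected_support(itemset, transactions, weights) < min_support

    result = []
    for k in range(1, len(items) + 1):
        for cand in combinations(items, k):
            if not infrequent(cand):
                continue
            # Check every non-empty proper subset, because a weighted measure
            # need not be anti-monotone.
            proper = (sub for r in range(1, k) for sub in combinations(cand, r))
            if all(not infrequent(sub) for sub in proper):
                result.append(cand)
    return result


# Hypothetical window of an uncertain stream: item -> existence probability.
window = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.8, "b": 0.7, "c": 0.2},
    {"a": 1.0, "c": 0.1},
]
item_weights = {"a": 1.0, "b": 1.0, "c": 0.5}
mwifis = minimal_infrequent_itemsets(window, item_weights, min_support=1.3)
print(mwifis)
```

In this toy window, `("c",)` is rare by itself, while `("a", "b")` is infrequent only as a combination; itemsets of the second kind are the "implicit" rare patterns the abstract refers to. The exhaustive enumeration is exponential in the number of items and is meant only to make the definition concrete.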
Pages: 6619-6639
Number of pages: 20