Mining significant association rules from uncertain data

Cited by: 0
Authors
Anshu Zhang
Wenzhong Shi
Geoffrey I. Webb
Affiliations
[1] The Hong Kong Polytechnic University,Department of Land Surveying and Geo
[2] Monash University,Informatics
Source
Data Mining and Knowledge Discovery | 2016 / Vol. 30
Keywords
Pattern discovery; Association rules; Statistical evaluation; Uncertain data
DOI
Not available
Abstract
In association rule mining, the trade-off between avoiding harmful spurious rules and preserving authentic ones is a persistent barrier to obtaining reliable and useful results. The statistically sound technique for evaluating the statistical significance of association rules is superior at preventing spurious rules, yet it can also cause severe loss of true rules in the presence of data error. This study presents a new and improved method for statistical testing of association rules on uncertain, erroneous data. An original mathematical model was established to describe how data error propagates through the computational procedures of the statistical test. Based on this error model, a scheme combining analytic and simulative processes was designed to correct the statistical test for distortions caused by data error. Experiments on both synthetic and real-world data show that the method significantly recovers the true rules that the original statistically sound method loses to data error (that is, it reduces type-2 error). Meanwhile, the new method maintains effective control over the familywise error rate, which is the distinctive advantage of the original statistically sound technique. Furthermore, the method is robust to inaccurate information about data error probabilities and to situations that violate the commonly accepted assumption that the error probabilities of different data items are independent. The method is particularly effective for rules that are the most practically meaningful yet sensitive to data error. The method shows promise for enhancing the value of association rule mining results and helping users make correct decisions.
Pages: 928-963
Page count: 35