Determining context of association rules by using machine learning

Cited by: 1
Authors
Nisar, Kanwal [1]
Shaheen, Muhammad [1]
Affiliations
[1] Fdn Univ Islamabad, Fac Engn & Informat Technol, Islamabad, Pakistan
Keywords
CBPNARM; context; box plot; diversity; association; rule mining; data mining; SYSTEMS;
DOI
10.1080/0952813X.2021.1955980
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Association rule mining is typically used to uncover interesting interdependencies among a set of variables and to reveal hidden patterns within a dataset. Associations are identified from variables that co-occur with high frequency. They can be positive (A -> B) or negative (A -> ¬B). In larger databases the number of association rules is considerably higher, which restricts the extraction of valuable insights from the dataset. Rule pruning strategies reduce the number of rules, but because they do not consider the context of a rule, they can sometimes drop an important rule or admit an unimportant one into the final rule set. Context-based positive and negative association rule mining (CBPNARM) was the first approach to include a context variable in association rule mining algorithms for the selection or de-selection of such rules. In CBPNARM, the context variable and its range of values are chosen by the user or an expert of the system, which demands unwanted user interaction and may add bias to the results. This paper proposes a method to automate the selection of the context variable and of its value range. The context variable is chosen using a diversity index and the chi-square test, and its range of values is set using box plot analysis. On top of this, the proposed method adds the conditional-probability increment ratio (CPIR) to further prune uninteresting rules. Experiments show that the system can select the context variable automatically and set an appropriate range for it. The performance of the proposed method is compared with CBPNARM and other state-of-the-art methods.
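As a rough illustration of two ingredients named in the abstract, the sketch below computes CPIR in its commonly cited piecewise form and derives a value range from box plot statistics. It is not the paper's implementation: the function names `cpir` and `box_plot_range`, the use of the 1.5 × IQR whisker rule as the "range", and the inclusive quartile method are all assumptions made for this example.

```python
from statistics import quantiles


def cpir(p_y_given_x: float, p_y: float) -> float:
    """Conditional-probability increment ratio, commonly cited piecewise form.

    Positive values suggest X raises the probability of Y (candidate X -> Y),
    negative values suggest X lowers it (candidate X -> not Y).
    """
    if p_y_given_x >= p_y:
        if p_y == 1.0:
            return 0.0  # Y always occurs, so X cannot raise its probability
        return (p_y_given_x - p_y) / (1.0 - p_y)
    # Here p_y_given_x < p_y, which implies p_y > 0.
    return (p_y_given_x - p_y) / p_y


def box_plot_range(values, k: float = 1.5):
    """Return (lower, upper) whisker fences of a box plot.

    Assumption: the value range of a context variable is taken as
    [Q1 - k*IQR, Q3 + k*IQR]; the paper may define the range differently.
    """
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


if __name__ == "__main__":
    # Hypothetical probabilities for a rule X -> Y.
    print(cpir(p_y_given_x=0.8, p_y=0.5))
    # Hypothetical observations of a numeric context variable.
    print(box_plot_range([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```

A rule would then be kept only if the absolute CPIR exceeds a chosen threshold and the record's context value falls inside the computed range; both the threshold and that filtering step are assumptions here, not details taken from the abstract.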
Pages: 59-76
Number of pages: 18