A Theory of Evidence-based method for assessing frequent patterns

Cited by: 10
Authors
Guil, Francisco [1]
Marin, Roque [2]
Affiliations
[1] Univ Almeria, High Sch Engn, Dept Languages & Comp Sci, Almeria, Spain
[2] Univ Murcia, Fac Comp Sci, Dept Informat & Commun Engn, E-30001 Murcia, Spain
Keywords
Frequent itemset mining; Theory of Evidence; Information measures; Uncertainty management; MATHEMATICAL-THEORY; SPECIFICITY; ENTROPY; SYSTEM;
DOI
10.1016/j.eswa.2012.12.030
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Frequent itemset (or frequent pattern) mining is a very important issue within the data mining field. Syntactic simplicity and descriptive potential are the key features of the itemset-based pattern that have led to its widespread use in a growing number of real-life domains. Some of the most representative algorithms for mining this kind of pattern are Apriori-like algorithms and, as a result, the number of patterns obtained under normal conditions is very large, which makes evaluation and interpretation quite difficult. This problem is compounded if we consider that knowledge discovery is an iterative process, and that a change in the parameters of the preprocessing techniques or of the mining algorithm can lead to significant changes in the result. In this paper, we propose a method based on Shafer's Theory of Evidence which uses two information measures to evaluate the quality of the set of frequent patterns. From a practical point of view, the main goal is to select, for a given database, the preprocessing technique that best leads to the discovery of useful knowledge. Nevertheless, the underlying idea is to propose a formal method to assess, objectively, sets of frequent patterns, seen as belief structures, in terms of the certainty of the information they represent. © 2012 Elsevier Ltd. All rights reserved.
Pages: 3121-3127
Number of pages: 7