A survey of interestingness measures for knowledge discovery

Cited by: 156
Authors
McGarry, K [1]
Affiliations
[1] Univ Sunderland, Sch Comp & Technol, Sunderland SR6 0DD, Durham, England
DOI
10.1017/S0269888905000408
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
It is a well-known fact that the data mining process can generate many hundreds and often thousands of patterns from data. The task for the data miner then becomes one of determining the most useful patterns from those that are trivial or are already well known to the organization. It is therefore necessary to filter out those patterns through the use of some measure of the patterns' actual worth. This article presents a review of the available literature on the various measures devised for evaluating and ranking the discovered patterns produced by the data mining process. These so-called interestingness measures are generally divided into two categories: objective measures based on the statistical strengths or properties of the discovered patterns and subjective measures that are derived from the user's beliefs or expectations of their particular problem domain. We evaluate the strengths and weaknesses of the various interestingness measures with respect to the level of user integration within the discovery process.
Pages: 39-61
Page count: 23