Classification by clustering using an extended saliency measure

Cited by: 2
Authors
Barak, A. [1]
Gelbard, R. [1]
Affiliation
[1] Bar Ilan Univ, Grad Sch Business Adm, Informat Syst Program, IL-52900 Ramat Gan, Israel
Keywords
data mining; cluster analysis; classification; decision trees; bounded-rationality; saliency; classification by clustering (CBC); DESIGN SCIENCE; DECISION; REPRESENTATION; METHODOLOGY; SIMILARITY; SYSTEM; MODEL
DOI
10.1111/exsy.12121
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In many data mining tasks, the goal is to classify entities into a set of pre-defined groups (classes). A second and equally important goal is interpretation, i.e., understanding the nature of the population aggregated in each class. These tasks become even more complex when there is no a priori information regarding the right classification. The current paper is based on two concepts: (1) bounded-rationality theory, which implements an S-shaped function that represents human logic as a saliency measure to determine the substantial features that characterize each potential group, and (2) classification by clustering (CBC), which applies decision-tree-like classification to unsupervised clustering problems, where neither an a priori classification nor target attributes are known in advance. In the context of these two concepts, the current research makes two contributions: (1) it expands the saliency measure to all possible types of variables (nominal as well as numerical), and (2) it evaluates, using five datasets, a composite model that combines the CBC method and the saliency concept. The findings show that using clustering algorithms for classification tasks (the CBC method) yields results as accurate as those obtained by conventional decision trees, but with a better saliency factor.
Pages: 46-59
Page count: 14