An interpretable classification rule mining algorithm

Citations: 55
Authors
Cano, Alberto [1]
Zafra, Amelia [1]
Ventura, Sebastian [1]
Affiliation
[1] Univ Cordoba, Dept Comp Sci & Numer Anal, E-14071 Cordoba, Spain
Keywords
Classification; Evolutionary programming; Interpretability; Rule mining; COEVOLUTIONARY ALGORITHM; STATISTICAL TECHNIQUES; PROGRAMMING ALGORITHM; MULTIPLE COMPARISONS; SOFTWARE TOOL; CLASSIFIERS; SELECTION; MODELS; OPTIMIZATION; PERFORMANCE
DOI
10.1016/j.ins.2013.03.038
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Obtaining comprehensible classifiers may be as important as achieving high accuracy in many real-life applications such as knowledge discovery tools and decision support systems. This paper introduces an efficient Evolutionary Programming algorithm for solving classification problems by means of highly interpretable and comprehensible IF-THEN classification rules. This algorithm, called the Interpretable Classification Rule Mining (ICRM) algorithm, is designed to maximize the comprehensibility of the classifier by minimizing the number of rules and the number of conditions. The evolutionary process constructs classification rules using only relevant attributes, avoiding noisy and redundant information. The algorithm is evaluated and compared with nine other well-known classification techniques in 35 varied application domains. Experimental results are validated using several non-parametric statistical tests applied to multiple classification and interpretability metrics. The experiments show that the proposal obtains good results, significantly improving the interpretability measures over the rest of the algorithms while achieving competitive accuracy. This is a significant advantage over other algorithms, as it allows an accurate and very comprehensible classifier to be obtained quickly. © 2013 Elsevier Inc. All rights reserved.
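To make the abstract's notion of comprehensibility concrete, the following is a minimal Python sketch of an ordered IF-THEN rule classifier together with the two complexity counts the abstract mentions (number of rules and number of conditions). It is not the ICRM algorithm and contains no evolutionary search. The class names, the attribute names, the thresholds, and the choice to leave the default class out of the rule count are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


# Hypothetical representation: names and operators are illustrative,
# not taken from the paper.
@dataclass
class Condition:
    attribute: str
    op: str       # one of "<=", ">", "=="
    value: float

    def holds(self, instance: Dict[str, float]) -> bool:
        x = instance[self.attribute]
        if self.op == "<=":
            return x <= self.value
        if self.op == ">":
            return x > self.value
        return x == self.value


@dataclass
class Rule:
    conditions: List[Condition]   # IF all of these hold ...
    label: str                    # ... THEN predict this class

    def matches(self, instance: Dict[str, float]) -> bool:
        return all(c.holds(instance) for c in self.conditions)


def classify(rules: List[Rule], default_label: str,
             instance: Dict[str, float]) -> str:
    """Return the label of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(instance):
            return rule.label
    return default_label


def complexity(rules: List[Rule]) -> Tuple[int, int]:
    """Return (number of rules, total number of conditions).

    Assumption: the default class is not counted as a rule here.
    """
    return len(rules), sum(len(r.conditions) for r in rules)


# Toy rule list with made-up thresholds on made-up attributes.
rules = [
    Rule([Condition("petal_length", "<=", 2.5)], "setosa"),
    Rule([Condition("petal_width", "<=", 1.7)], "versicolor"),
]
default_label = "virginica"
```

Under these assumptions, `classify(rules, default_label, {"petal_length": 1.4, "petal_width": 0.2})` returns the label of the first rule, and `complexity(rules)` reports two rules with two conditions in total. A classifier with fewer rules and fewer conditions scores as more comprehensible on these counts.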
Pages: 1-20
Page count: 20