To tune or not to tune: rule evaluation for metaheuristic-based sequential covering algorithms

Cited by: 10
Authors
Minnaert, Bart [1 ,2 ]
Martens, David [2 ]
De Backer, Manu [1 ,2 ,3 ]
Baesens, Bart [3 ]
Affiliations
[1] Univ Ghent, Fac Econ & Business Adm, B-9000 Ghent, Belgium
[2] Univ Antwerp, Fac Appl Econ, B-2020 Antwerp, Belgium
[3] Katholieke Univ Leuven, Dept Decis Sci Informat Management, Leuven, Belgium
Keywords
Classification; Rule induction; Heuristics; Rule evaluation; Sequential covering; OF-THE-ART; EVOLUTIONARY ALGORITHMS; STATISTICAL COMPARISONS; SOFTWARE TOOL; CLASSIFICATION; INDUCTION; DISCOVERY; MODELS; COLONY; OPTIMIZATION;
DOI
10.1007/s10618-013-0339-5
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
While many papers propose innovative methods for constructing individual rules in separate-and-conquer rule learning algorithms, comparatively few study the heuristic rule evaluation functions used in these algorithms to ensure that the selected rules combine into a good rule set. Underestimating the impact of this component has led to suboptimal design choices in many algorithms. The main goal of this paper is to demonstrate the importance of heuristic rule evaluation functions by improving existing rule induction techniques and to provide guidelines for algorithm designers. We first select optimal heuristic rule learning functions for several metaheuristic-based algorithms and empirically compare the resulting heuristics across algorithms. This results in large and significant improvements of the predictive accuracy for two techniques. We find that despite the absence of a global optimal choice for all algorithms, good default choices can be shared across algorithms with similar search biases. A near-optimal selection can thus be found for new algorithms with minor experimental tuning. Lastly, a major contribution is made towards balancing a model's predictive accuracy with its comprehensibility. We construct a Pareto front of optimal solutions for this trade-off and show that gains in comprehensibility and/or accuracy are possible for the techniques studied. The parametrized heuristics enable users to select the desired balance as they offer a high flexibility when it comes to selecting the desired accuracy and comprehensibility in rule miners.
Pages: 237-272
Page count: 36
Related papers
63 in total
  • [11] Benchmarking state-of-the-art classification algorithms for credit scoring
    Baesens, B
    Van Gestel, T
    Viaene, S
    Stepanova, M
    Suykens, J
    Vanthienen, J
    [J]. JOURNAL OF THE OPERATIONAL RESEARCH SOCIETY, 2003, 54 (06) : 627 - 635
  • [12] Using neural network rule extraction and decision tables for credit-risk evaluation
    Baesens, B
    Setiono, R
    Mues, C
    Vanthienen, J
    [J]. MANAGEMENT SCIENCE, 2003, 49 (03) : 312 - 329
  • [13] CLASSIFIER SYSTEMS AND GENETIC ALGORITHMS
    BOOKER, LB
    GOLDBERG, DE
    HOLLAND, JH
    [J]. ARTIFICIAL INTELLIGENCE, 1989, 40 (1-3) : 235 - 282
  • [14] Cestnik B., 1990, ECAI 90. Proceedings of the 9th European Conference on Artificial Intelligence, P147
  • [15] Clark P., 1989, Machine Learning, V3, P261, DOI 10.1023/A:1022641700528
  • [16] Cohen W. W., 1995, Machine Learning. Proceedings of the Twelfth International Conference on Machine Learning, P115
  • [17] Demsar J, 2006, J MACH LEARN RES, V7, P1
  • [18] Ant system: Optimization by a colony of cooperating agents
    Dorigo, M
    Maniezzo, V
    Colorni, A
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 1996, 26 (01) : 29 - 41
  • [19] ON THE HANDLING OF CONTINUOUS-VALUED ATTRIBUTES IN DECISION TREE GENERATION
    FAYYAD, UM
    IRANI, KB
    [J]. MACHINE LEARNING, 1992, 8 (01) : 87 - 102
  • [20] Genetics-Based Machine Learning for Rule Induction: State of the Art, Taxonomy, and Comparative Study
    Fernandez, Alberto
    Garcia, Salvador
    Luengo, Julian
    Bernado-Mansilla, Ester
    Herrera, Francisco
    [J]. IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2010, 14 (06) : 913 - 941