To tune or not to tune: rule evaluation for metaheuristic-based sequential covering algorithms

Cited by: 10
Authors
Minnaert, Bart [1, 2]
Martens, David [2]
De Backer, Manu [1, 2, 3]
Baesens, Bart [3]
Affiliations
[1] Univ Ghent, Fac Econ & Business Adm, B-9000 Ghent, Belgium
[2] Univ Antwerp, Fac Appl Econ, B-2020 Antwerp, Belgium
[3] Katholieke Univ Leuven, Dept Decis Sci Informat Management, Leuven, Belgium
Keywords
Classification; Rule induction; Heuristics; Rule evaluation; Sequential covering; OF-THE-ART; EVOLUTIONARY ALGORITHMS; STATISTICAL COMPARISONS; SOFTWARE TOOL; CLASSIFICATION; INDUCTION; DISCOVERY; MODELS; COLONY; OPTIMIZATION;
DOI
10.1007/s10618-013-0339-5
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
While many papers propose innovative methods for constructing individual rules in separate-and-conquer rule learning algorithms, comparatively few study the heuristic rule evaluation functions used in these algorithms to ensure that the selected rules combine into a good rule set. Underestimating the impact of this component has led to suboptimal design choices in many algorithms. The main goal of this paper is to demonstrate the importance of heuristic rule evaluation functions by improving existing rule induction techniques and to provide guidelines for algorithm designers. We first select optimal heuristic rule learning functions for several metaheuristic-based algorithms and empirically compare the resulting heuristics across algorithms. This results in large and significant improvements in predictive accuracy for two techniques. We find that, despite the absence of a globally optimal choice for all algorithms, good default choices can be shared across algorithms with similar search biases. A near-optimal selection can thus be found for new algorithms with minor experimental tuning. Lastly, a major contribution is made towards balancing a model's predictive accuracy with its comprehensibility. We construct a Pareto front of optimal solutions for this trade-off and show that gains in comprehensibility and/or accuracy are possible for the techniques studied. The parametrized heuristics give users high flexibility in selecting the desired balance between accuracy and comprehensibility in rule miners.
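The accuracy/comprehensibility trade-off described in the abstract can be illustrated with a minimal Pareto-front sketch. This is not the authors' procedure: the `Candidate` type, the use of rule count as a proxy for comprehensibility, and all numbers below are hypothetical, chosen only to show how dominated settings are filtered out.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """One hypothetical heuristic setting and its (made-up) evaluation."""
    name: str
    accuracy: float  # predictive accuracy, higher is better
    n_rules: int     # assumed proxy for comprehensibility, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both criteria and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.n_rules <= b.n_rules
    strictly_better = a.accuracy > b.accuracy or a.n_rules < b.n_rules
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates (input order preserved)."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Illustrative values only; these are not results from the paper.
candidates = [
    Candidate("setting_a", 0.80, 12),
    Candidate("setting_b", 0.78, 6),
    Candidate("setting_c", 0.75, 9),   # dominated by setting_b
    Candidate("setting_d", 0.82, 20),
]

front = pareto_front(candidates)
for c in front:
    print(c.name, c.accuracy, c.n_rules)
```

A user who prefers a compact rule set would pick a point on the front with few rules, while a user who prioritizes accuracy would pick the opposite end; no point on the front can be improved on one criterion without giving up the other.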
Pages: 237-272
Page count: 36