Forest-ORE: Mining an optimal rule ensemble to interpret random forest models

Cited by: 0
Authors
Haddouchi, Maissae [1 ]
Berrado, Abdelaziz [1 ]
Affiliations
[1] Mohammed V Univ Rabat, Ecole Mohammadia Ingn EMI, AMIPS Res Team, Rabat, Morocco
Keywords
Interpretability; Optimization; Tree ensemble; Random forest; Rule ensemble; CLASSIFICATION; SET; NUMBER;
DOI
10.1016/j.engappai.2024.109997
Chinese Library Classification (CLC)
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Random Forest (RF) is well known as an efficient ensemble learning method with strong predictive performance. However, it is often regarded as a "black box" due to its reliance on hundreds of deep decision trees. This lack of interpretability can be a real drawback for the acceptance of RF models in several real-world applications, especially those affecting individuals' lives. In this work, we present Forest-ORE, a method that makes RF interpretable via an optimized rule ensemble (ORE) for local and global interpretation. Unlike other rule-based approaches aimed at interpreting the RF model, this method simultaneously considers several parameters that influence the choice of an interpretable rule ensemble. Existing methods often prioritize predictive performance over interpretability coverage and do not account for overlaps or interactions between rules. Forest-ORE uses a mixed-integer optimization program to build an ORE that considers the trade-off between predictive performance, interpretability coverage, and model size (ensemble size, rule length, and overlap). In addition to producing an ORE competitive with RF in predictive performance, this method enriches the ORE with additional rules that provide complementary information. The framework is illustrated through an example, and its robustness is evaluated across 36 benchmark datasets. A comparative analysis with well-known methods shows that Forest-ORE achieves an excellent trade-off between predictive performance, interpretability coverage, and model size.
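The abstract frames rule-ensemble selection as a mixed-integer optimization that balances predictive performance, interpretability coverage, and model size. The sketch below is only a minimal illustration of that general idea, not the Forest-ORE formulation from the paper: the toy rule data, the coverage constraint, and the trade-off weights alpha, beta, and gamma are all hypothetical assumptions, and the model is solved with the PuLP library.

```python
# Minimal illustrative sketch: selecting a small rule ensemble from
# pre-extracted candidate rules with a mixed-integer program.
# NOT the Forest-ORE formulation; data and weights are hypothetical.
from pulp import LpProblem, LpMaximize, LpVariable, LpBinary, lpSum

# Toy candidate rules (hypothetical): each rule records the sample
# indices it covers, how many of those it classifies correctly, and
# its length (number of conditions).
rules = [
    {"covers": {0, 1, 2, 3}, "correct": 4, "length": 2},
    {"covers": {2, 3, 4, 5}, "correct": 3, "length": 3},
    {"covers": {5, 6, 7},    "correct": 3, "length": 1},
    {"covers": {0, 4, 6, 7}, "correct": 2, "length": 4},
]
n_samples = 8
alpha, beta, gamma = 1.0, 0.5, 0.1  # assumed trade-off weights

prob = LpProblem("rule_ensemble_selection", LpMaximize)

# z[j] = 1 if rule j is kept in the ensemble
z = [LpVariable(f"z_{j}", cat=LpBinary) for j in range(len(rules))]
# y[i] = 1 if sample i is covered by at least one selected rule
y = [LpVariable(f"y_{i}", cat=LpBinary) for i in range(n_samples)]

# Objective: reward coverage and per-rule accuracy, penalize ensemble
# size and total rule length (a proxy for interpretability cost).
prob += (
    lpSum(y)
    + alpha * lpSum(r["correct"] * z[j] for j, r in enumerate(rules))
    - beta * lpSum(z)
    - gamma * lpSum(r["length"] * z[j] for j, r in enumerate(rules))
)

# A sample counts as covered only if some selected rule covers it.
for i in range(n_samples):
    prob += y[i] <= lpSum(z[j] for j, r in enumerate(rules)
                          if i in r["covers"])

prob.solve()
selected = [j for j in range(len(rules)) if z[j].value() > 0.5]
print("selected rules:", selected)
```

In a realistic pipeline, the candidate rules would be extracted from the trained forest's decision paths and scored on held-out data, and the objective and constraints would encode the paper's richer notions of coverage and overlap; the weights here simply make the trade-off explicit in miniature.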
Pages: 14