Forest-ORE: Mining an optimal rule ensemble to interpret random forest models

Cited by: 0
Authors
Haddouchi, Maissae [1 ]
Berrado, Abdelaziz [1 ]
Affiliations
[1] Mohammed V Univ Rabat, Ecole Mohammadia Ingn EMI, AMIPS Res Team, Rabat, Morocco
Keywords
Interpretability; Optimization; Tree ensemble; Random forest; Rule ensemble; CLASSIFICATION; SET; NUMBER;
DOI
10.1016/j.engappai.2024.109997
Chinese Library Classification (CLC)
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Random Forest (RF) is well known as an efficient ensemble learning method with strong predictive performance. However, it is often regarded as a "black box" due to its reliance on hundreds of deep decision trees. This lack of interpretability can be a real drawback for the acceptance of RF models in several real-world applications, especially those affecting individuals' lives. In this work, we present Forest-ORE, a method that makes RF interpretable via an optimized rule ensemble (ORE) for local and global interpretation. Unlike other rule-based approaches aimed at interpreting the RF model, this method simultaneously considers several parameters that influence the choice of an interpretable rule ensemble. Existing methods often prioritize predictive performance over interpretability coverage and do not account for overlaps or interactions between rules. Forest-ORE uses a mixed-integer optimization program to build an ORE that considers the trade-off between predictive performance, interpretability coverage, and model size (ensemble size, rule length, and overlap). In addition to producing an ORE competitive with RF in predictive performance, this method enriches the ORE with additional rules that provide complementary information. The framework is illustrated through an example, and its robustness is evaluated across 36 benchmark datasets. A comparative analysis with well-known methods shows that Forest-ORE achieves an excellent trade-off between predictive performance, interpretability coverage, and model size.
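The abstract frames rule-ensemble selection as a mixed-integer optimization that balances predictive performance, interpretability coverage, and model size. The sketch below is only a minimal illustration of that general idea, not the Forest-ORE formulation from the paper: the toy rule data, the coverage constraint, and the trade-off weights alpha, beta, and gamma are all hypothetical assumptions, and the model is solved with the PuLP library.

```python
# Minimal illustrative sketch: selecting a small rule ensemble from
# pre-extracted candidate rules with a mixed-integer program.
# NOT the Forest-ORE formulation; data and weights are hypothetical.
from pulp import LpProblem, LpMaximize, LpVariable, LpBinary, lpSum

# Toy candidate rules (hypothetical): each rule records the sample
# indices it covers, how many of those it classifies correctly, and
# its length (number of conditions).
rules = [
    {"covers": {0, 1, 2, 3}, "correct": 4, "length": 2},
    {"covers": {2, 3, 4, 5}, "correct": 3, "length": 3},
    {"covers": {5, 6, 7},    "correct": 3, "length": 1},
    {"covers": {0, 4, 6, 7}, "correct": 2, "length": 4},
]
n_samples = 8
alpha, beta, gamma = 1.0, 0.5, 0.1  # assumed trade-off weights

prob = LpProblem("rule_ensemble_selection", LpMaximize)

# z[j] = 1 if rule j is kept in the ensemble
z = [LpVariable(f"z_{j}", cat=LpBinary) for j in range(len(rules))]
# y[i] = 1 if sample i is covered by at least one selected rule
y = [LpVariable(f"y_{i}", cat=LpBinary) for i in range(n_samples)]

# Objective: reward coverage and per-rule accuracy, penalize ensemble
# size and total rule length (a proxy for interpretability cost).
prob += (
    lpSum(y)
    + alpha * lpSum(r["correct"] * z[j] for j, r in enumerate(rules))
    - beta * lpSum(z)
    - gamma * lpSum(r["length"] * z[j] for j, r in enumerate(rules))
)

# A sample counts as covered only if some selected rule covers it.
for i in range(n_samples):
    prob += y[i] <= lpSum(z[j] for j, r in enumerate(rules)
                          if i in r["covers"])

prob.solve()
selected = [j for j in range(len(rules)) if z[j].value() > 0.5]
print("selected rules:", selected)
```

In a realistic pipeline, the candidate rules would be extracted from the trained forest's decision paths and scored on held-out data, and the objective and constraints would encode the paper's richer notions of coverage and overlap; the weights here simply make the trade-off explicit in miniature.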
Pages: 14