Multi-Objective Optimization of Performance and Interpretability of Tabular Supervised Machine Learning Models

Cited by: 3
Authors
Schneider, Lennart [1,2]
Bischl, Bernd [1,2]
Thomas, Janek [1,2]
Affiliations
[1] Ludwig-Maximilians-Universität München, Munich, Germany
[2] Munich Center for Machine Learning (MCML), Munich, Germany
Source
Proceedings of the 2023 Genetic and Evolutionary Computation Conference (GECCO 2023) | 2023
Keywords
supervised learning; performance; interpretability; tabular data; multi-objective; evolutionary computation; group structure
DOI
10.1145/3583131.3590380
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We present a model-agnostic framework for jointly optimizing the predictive performance and interpretability of supervised machine learning models for tabular data. Interpretability is quantified via three measures: feature sparsity, interaction sparsity of features, and sparsity of non-monotone feature effects. By treating hyperparameter optimization of a machine learning algorithm as a multi-objective optimization problem, our framework allows for generating diverse models that trade off high performance and ease of interpretability in a single optimization run. Efficient optimization is achieved via augmentation of the search space of the learning algorithm by incorporating feature selection, interaction and monotonicity constraints into the hyperparameter search space. We demonstrate that the optimization problem effectively translates to finding the Pareto optimal set of groups of selected features that are allowed to interact in a model, along with finding their optimal monotonicity constraints and optimal hyperparameters of the learning algorithm itself. We then introduce a novel evolutionary algorithm that can operate efficiently on this augmented search space. In benchmark experiments, we show that our framework is capable of finding diverse models that are highly competitive with or outperform state-of-the-art XGBoost or Explainable Boosting Machine models, with respect to both performance and interpretability.
Pages: 538-547
Page count: 10
相关论文
共 58 条
[1]  
[Anonymous], 2006, INT J COMPUTATIONAL
[2]   Visualizing the effects of predictor variables in black box supervised learning models [J].
Apley, Daniel W. ;
Zhu, Jingyu .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2020, 82 (04) :1059-1086
[3]  
Belgrave D., 2022, ADV NEURAL INFORM PR, V35
[4]   Multi-Objective Hyperparameter Tuning and Feature Selection using Filter Ensembles [J].
Binder, Martin ;
Moosbauer, Julia ;
Thomas, Janek ;
Bischl, Bernd .
GECCO'20: PROCEEDINGS OF THE 2020 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, 2020, :471-479
[5]  
Bischl B., 2021, P NEURAL INFORM PROC, V1
[6]   Hyperparameter optimization: Foundations, algorithms, best practices, and open challenges [J].
Bischl, Bernd ;
Binder, Martin ;
Lang, Michel ;
Pielok, Tobias ;
Richter, Jakob ;
Coors, Stefan ;
Thomas, Janek ;
Ullmann, Theresa ;
Becker, Marc ;
Boulesteix, Anne-Laure ;
Deng, Difan ;
Lindauer, Marius .
WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2023, 13 (02)
[7]  
Bischl B, 2010, LECT NOTES COMPUT SC, V6238, P314, DOI 10.1007/978-3-642-15844-5_32
[8]   Random forests [J].
Breiman, L .
MACHINE LEARNING, 2001, 45 (01) :5-32
[9]  
Chang C.-H., 2022, INT C LEARN REPR
[10]   XGBoost: A Scalable Tree Boosting System [J].
Chen, Tianqi ;
Guestrin, Carlos .
KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2016, :785-794