Generating collective counterfactual explanations in score-based classification via mathematical optimization

Cited by: 9
Authors
Carrizosa, Emilio [1]
Ramirez-Ayerbe, Jasone [1]
Morales, Dolores Romero [2]
Affiliations
[1] Univ Seville, Inst Matemat, Seville, Spain
[2] Copenhagen Business Sch, Dept Econ, Frederiksberg, Denmark
Keywords
Collective counterfactual explanations; Mathematical optimization; Explainable machine learning; Linear models; Random forests
DOI
10.1016/j.eswa.2023.121954
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Due to the increasing use of Machine Learning models in high-stakes decision-making settings, it has become increasingly important to have tools to understand how models arrive at decisions. Assuming an already trained Supervised Classification model, post-hoc explanations can be obtained via so-called counterfactual analysis: a counterfactual explanation of an instance indicates how this instance should be minimally modified so that the perturbed instance is classified in the desired class by the given Machine Learning classification model. Most of the Counterfactual Analysis literature focuses on the single-instance single-counterfactual setting, in which the analysis is done for a single instance to provide a single counterfactual explanation. Taking a stakeholder's perspective, in this paper we introduce so-called collective counterfactual explanations. By means of novel Mathematical Optimization models, we provide a counterfactual explanation for each instance in a group of interest, so that the total cost of the perturbations is minimized under some linking constraints. Constructing counterfactuals collectively rather than individually enables us to detect the features that are critical, across the entire dataset, for having the individuals classified in the desired class. Our methodology allows some instances to be treated individually, as in the single-instance single-counterfactual case, so that the collective counterfactual analysis is performed for only a fraction of the records in the group of interest. This way, outliers are identified and handled appropriately. Under some assumptions on the classifier and the space in which counterfactuals are sought, finding collective counterfactual explanations reduces to solving a convex quadratic, linearly constrained mixed-integer optimization problem, which, for datasets of moderate size, can be solved to optimality using existing solvers. The performance of our approach is illustrated on real-world datasets, demonstrating its usefulness.
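To make the idea of a linking constraint concrete, the following is a minimal sketch and not the formulation used in the paper. It assumes a linear score w·x + b with the desired class at score >= 0, a squared Euclidean perturbation cost, and one particular linking constraint chosen for illustration: every instance in the group may change only the same shared subset of at most `max_features` features. The function name `collective_counterfactuals` and all parameters are hypothetical. Instead of a mixed-integer solver, the sketch enumerates feature subsets by brute force, which is only viable for a handful of features.

```python
from itertools import combinations


def collective_counterfactuals(X, w, b, max_features):
    """Brute-force sketch (assumed setting, not the paper's model).

    Chooses one shared feature subset of size <= max_features and moves
    every instance in X to score w.x + b >= 0 by changing only those
    features, minimizing the total squared Euclidean perturbation.
    Returns (total_cost, subset, counterfactuals), or None if no subset
    can change the score.
    """
    best = None
    for k in range(1, max_features + 1):
        for subset in combinations(range(len(w)), k):
            norm_sq = sum(w[j] ** 2 for j in subset)
            if norm_sq == 0:
                continue  # these features cannot change the score
            total, moved = 0.0, []
            for x in X:
                score = sum(wj * xj for wj, xj in zip(w, x)) + b
                gap = max(0.0, -score)  # shortfall below the boundary
                x_new = list(x)
                for j in subset:
                    # Closed-form projection onto the half-space,
                    # restricted to the shared features.
                    x_new[j] += gap * w[j] / norm_sq
                total += gap ** 2 / norm_sq
                moved.append(x_new)
            if best is None or total < best[0]:
                best = (total, subset, moved)
    return best


# Toy group of two instances, both currently scored below zero.
X = [[0.0, 0.0], [1.0, 0.0]]
w, b = [1.0, 2.0], -4.0
total, subset, moved = collective_counterfactuals(X, w, b, max_features=1)
print(total, subset, moved)
```

In this toy group, restricting all instances to a single shared feature selects the feature with the larger weight, which is the kind of group-level "critical feature" the abstract refers to. A realistic instance would replace the enumeration with a mixed-integer quadratic model passed to a solver.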
Pages: 9