Mathematical optimization in classification and regression trees

Cited by: 94
Authors
Carrizosa, Emilio [1]
Molero-Rio, Cristina [1]
Romero Morales, Dolores [2]
Affiliations
[1] Univ Seville, Inst Matemat, Seville, Spain
[2] Copenhagen Business Sch, Dept Econ, Frederiksberg, Denmark
Funding
European Union Horizon 2020
Keywords
Classification and regression trees; Tree ensembles; Mixed-integer linear optimization; Continuous nonlinear optimization; Sparsity; Explainability; DECISION TREES; RANDOM FORESTS; GLOBAL OPTIMIZATION; RULE EXTRACTION; MACHINE; TIME; MODELS; INDUCTION; SELECTION; ALGORITHMS;
DOI
10.1007/s11750-021-00594-1
Chinese Library Classification (CLC) number
C93 [Management Science]; O22 [Operations Research]
Discipline classification codes
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
Classification and regression trees, as well as their variants, are off-the-shelf methods in Machine Learning. In this paper, we review recent contributions within the Continuous Optimization and the Mixed-Integer Linear Optimization paradigms to develop novel formulations in this research area. We compare those in terms of the nature of the decision variables and the constraints required, as well as the optimization algorithms proposed. We illustrate how these powerful formulations enhance the flexibility of tree models, being better suited to incorporate desirable properties such as cost-sensitivity, explainability, and fairness, and to deal with complex data, such as functional data.
Pages: 5-33
Page count: 29
Related papers
196 in total
[1]  
Aghaei S., 2020, ARXIV200209142
[2]  
Aghaei S, 2019, AAAI CONF ARTIF INTE, P1418
[3]  
Aglin G, 2020, AAAI CONF ARTIF INTE, V34, P3146
[4]  
Ahuja RK, 1993, Network Flows: Theory, Algorithms, and Applications
[5]   Permutation importance: a corrected feature importance measure [J].
Altmann, Andre ;
Tolosi, Laura ;
Sander, Oliver ;
Lengauer, Thomas .
BIOINFORMATICS, 2010, 26 (10) :1340-1347
[6]  
[Anonymous], 2019, ARXIV190607177
[7]  
[Anonymous], 1980, Journal of the Royal Statistical Society. Series C (Applied Statistics), DOI 10.2307/2986296
[8]  
Aouad, 2019, ARXIV190601174
[9]   A review of machine learning kernel methods in statistical process monitoring [J].
Apsemidis, Anastasios ;
Psarakis, Stelios ;
Moguerza, Javier M. .
COMPUTERS & INDUSTRIAL ENGINEERING, 2020, 142
[10]   Forecasting with temporal hierarchies [J].
Athanasopoulos, George ;
Hyndman, Rob J. ;
Kourentzes, Nikolaos ;
Petropoulos, Fotios .
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2017, 262 (01) :60-74