Optimal Action Extraction for Random Forests and Boosted Trees

被引：75

作者：

Cui, Zhicheng ^{[1
]}

Chen, Wenlin ^{[1
]}

He, Yujie ^{[1
]}

Chen, Yixin ^{[1
]}

机构：

[1] Washington Univ, St Louis, MO 63110 USA

来源：

KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING | 2015年

基金：

美国国家科学基金会;

关键词：

Actionable knowledge; additive tree model; decision tree; random forest; integer linear programming;

D O I：

10.1145/2783258.2783281

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

Additive tree models (ATMs) are widely used for data mining and machine learning. Important examples of ATMs include random forest, adaboost (with decision trees as weak learners), and gradient boosted trees, and they are often referred to as the best off-the-shelf classifiers. Though capable of attaining high accuracy, ATMs are not well interpretable in the sense that they do not provide actionable knowledge for a given instance. This greatly limits the potential of ATMs on many applications such as medical prediction and business intelligence, where practitioners need suggestions on actions that can lead to desirable outcomes with minimum costs. To address this problem, we present a novel framework to post-process any ATM classifier to extract an optimal actionable plan that can change a given input to a desired class with a minimum cost. In particular, we prove the NP-hardness of the optimal action extraction problem for ATMs and formulate this problem in an integer linear programming formulation which can be efficiently solved by existing packages. We also empirically demonstrate the effectiveness of the proposed framework by conducting comprehensive experiments on challenging real-world datasets.

引用

页码：179 / 188

页数：10

共 31 条

[1]

[Anonymous], 2009, International Business Machines Corporation, V46, P157

[2] A trial of a real-time Alert for clinical deterioration in Patients hospitalized on general medical wards [J].

Bailey, Thomas C. ;

Chen, Yixin ;

Mao, Yi ;

Lu, Chenyang ;

Hackmann, Gregory ;

Micek, Scott T. ;

Heard, Kevin M. ;

Faulkner, Kelly M. ;

Kollef, Marin H. .

JOURNAL OF HOSPITAL MEDICINE, 2013, 8 (05) :236-242

[3] Using negotiable features for prescription problems [J].

Bella, Antonio ;

Ferri, Cesar ;

Hernandez-Orallo, Jose ;

Jose Ramirez-Quintana, Maria .

COMPUTING, 2011, 91 (02) :135-168

[4] Random forests [J].

Breiman, L .

MACHINE LEARNING, 2001, 45 (01) :5-32

[5] Random forests [J].

Breiman, L .

MACHINE LEARNING, 2001, 45 (01) :5-32

[6]

Cao LB, 2007, IEEE INTELL SYST, V22, P78, DOI 10.1109/MIS.2007.67

[7] Flexible Frameworks for Actionable Knowledge Discovery [J].

Cao, Longbing ;

Zhao, Yanchang ;

Zhang, Huaifeng ;

Luo, Dan ;

Zhang, Chengqi ;

Park, E. K. .

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2010, 22 (09) :1299-1312

[8]

Chen W., 2015, P 18 INT C ART INT S

[9]

Chen WL, 2013, 19TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'13), P140

[10]

Domingos P., 1998, Intelligent data Analysis, V2, P187, DOI [DOI 10.1016/S1088-467X(98)00023-7ALA, DOI 10.3233/IDA-1998-2303]

← 1 2 3 4 →