Inducing non-orthogonal and non-linear decision boundaries in decision trees via interactive basis functions

Cited by: 15
Authors
Paez, Antonio [1]
Lopez, Fernando [2]
Ruiz, Manuel [2]
Camacho, Maximo [3]
Affiliations
[1] Sch Geog & Earth Sci, 1280 Main St West, Hamilton, ON L8S 4K1, Canada
[2] Fac Ciencias Empresa, Dept Metodos Cuantitat & Informat, Calle Real 3, Murcia 30201, Spain
[3] Fac Econ & Empresa, Dept Metodos Cuantitat Econ & Empresa, Murcia 30100, Spain
Keywords
Decision trees; Oblique partitions; Non-linear partitions; Interactive basis functions; Classification; Regression; REGRESSION TREE; CLASSIFICATION; ENSEMBLE; SEGMENTATION; CLASSIFIERS; DEMAND; SYSTEM;
DOI
10.1016/j.eswa.2018.12.041
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Decision Trees (DTs) are a machine learning technique widely used for regression and classification purposes. Conventionally, the decision boundaries of Decision Trees are orthogonal to the features under consideration. A well-known limitation of this is that the algorithm may fail to find optimal partitions, or in some cases any partitions at all, depending on the underlying distribution of the data. To remedy this limitation, several modifications have been proposed that allow for oblique decision boundaries. The objective of this paper is to propose a new strategy for generating flexible decision boundaries by means of interactive basis functions (IBFs). We show how oblique decision boundaries can be obtained as a particular case of IBFs, and in addition how non-linear decision boundaries can be induced. One attractive aspect of the strategy proposed in this paper is that training Decision Trees with IBFs does not require custom software, since the functions can be precalculated for use in any existing implementation of the algorithm. Since the underlying mechanisms remain unchanged, there is no substantial computational overhead compared to conventional trees. Furthermore, this also means that IBFs can be used in any extensions of the Decision Tree algorithm, such as evolutionary trees, boosting, and bagging. We conduct a benchmarking exercise to understand under which conditions the use of IBFs can improve model performance. In addition, we present three empirical applications that illustrate the approach in classification and regression. As part of discussing the empirical applications, we introduce a device called decision charts to facilitate the interpretation of DTs with IBFs. Finally, we conclude the paper by outlining some directions for future research. (C) 2018 Elsevier Ltd. All rights reserved.
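The idea of precalculating derived features so that an unmodified tree learner can place oblique or non-linear boundaries can be sketched as follows. This is a minimal illustration only: the abstract does not give the functional form of the IBFs, so the two derived features used here (a sum and a product of two inputs) are assumptions, as are the toy grid data and the helper name `best_stump_accuracy`. A single-threshold split (a depth-one tree) stands in for a full Decision Tree implementation.

```python
from itertools import product


def best_stump_accuracy(values, labels):
    """Best training accuracy of one axis-aligned split on a single feature.

    Hypothetical helper: it tries every midpoint between consecutive
    distinct feature values and allows either side to be the positive class.
    """
    n = len(values)
    distinct = sorted(set(values))
    thresholds = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best = 0.0
    for t in thresholds:
        correct = sum((v > t) == y for v, y in zip(values, labels))
        best = max(best, correct / n, 1 - correct / n)
    return best


# Toy data (assumed): a 4 x 4 grid of points in two features.
grid = [-2, -1, 1, 2]
points = list(product(grid, grid))
x1 = [p[0] for p in points]

# Case 1: the true boundary is oblique (x1 + x2 > 0).
y_oblique = [a + b > 0 for a, b in points]
sum_feature = [a + b for a, b in points]  # assumed derived feature
acc_x1_oblique = best_stump_accuracy(x1, y_oblique)
acc_sum = best_stump_accuracy(sum_feature, y_oblique)

# Case 2: the true boundary is non-linear (x1 * x2 > 0, an XOR-like pattern).
y_xor = [a * b > 0 for a, b in points]
prod_feature = [a * b for a, b in points]  # assumed derived feature
acc_x1_xor = best_stump_accuracy(x1, y_xor)
acc_prod = best_stump_accuracy(prod_feature, y_xor)

print("oblique target: split on x1 =", acc_x1_oblique, "| split on x1 + x2 =", acc_sum)
print("XOR-like target: split on x1 =", acc_x1_xor, "| split on x1 * x2 =", acc_prod)
```

In this sketch the split itself remains an ordinary threshold on one column. Only the column is new, which is why the derived features could in principle be passed to any existing tree, boosting, or bagging implementation without changes to the learner.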
Pages: 183-206
Number of pages: 24