Inducing non-orthogonal and non-linear decision boundaries in decision trees via interactive basis functions

Cited by: 15
Authors
Paez, Antonio [1]
Lopez, Fernando [2]
Ruiz, Manuel [2]
Camacho, Maximo [3]
Affiliations
[1] Sch Geog & Earth Sci, 1280 Main St West, Hamilton, ON L8S 4K1, Canada
[2] Fac Ciencias Empresa, Dept Metodos Cuantitat & Informat, Calle Real 3, Murcia 30201, Spain
[3] Fac Econ & Empresa, Dept Metodos Cuantitat Econ & Empresa, Murcia 30100, Spain
Keywords
Decision trees; Oblique partitions; Non-linear partitions; Interactive basis functions; Classification; Regression; REGRESSION TREE; CLASSIFICATION; ENSEMBLE; SEGMENTATION; CLASSIFIERS; DEMAND; SYSTEM;
DOI
10.1016/j.eswa.2018.12.041
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Decision Trees (DTs) are a machine learning technique widely used for regression and classification. Conventionally, the decision boundaries of Decision Trees are orthogonal to the features under consideration. A well-known limitation of this is that the algorithm may fail to find optimal partitions, or in some cases any partitions at all, depending on the underlying distribution of the data. To remedy this limitation, several modifications have been proposed that allow for oblique decision boundaries. The objective of this paper is to propose a new strategy for generating flexible decision boundaries by means of interactive basis functions (IBFs). We show how oblique decision boundaries can be obtained as a particular case of IBFs, and in addition how non-linear decision boundaries can be induced. One attractive aspect of the strategy proposed in this paper is that training Decision Trees with IBFs does not require custom software, since the functions can be precalculated for use in any existing implementation of the algorithm. Since the underlying mechanisms remain unchanged, there is no substantial computational overhead compared to conventional trees. Furthermore, this also means that IBFs can be used in any extensions of the Decision Tree algorithm, such as evolutionary trees, boosting, and bagging. We conduct a benchmarking exercise to understand under which conditions the use of IBFs can improve model performance. In addition, we present three empirical applications that illustrate the approach in classification and regression. As part of discussing the empirical applications, we introduce a device called decision charts to facilitate the interpretation of DTs with IBFs. Finally, we conclude the paper by outlining some directions for future research. (C) 2018 Elsevier Ltd. All rights reserved.
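The idea of precalculating a function of two features and handing it to an unmodified tree learner can be sketched in a few lines. The example below is only an illustration under stated assumptions: the abstract does not give the functional form of an IBF, so the difference `x1 - x2` is a hypothetical stand-in for a linear combination that yields an oblique boundary, and the exhaustive one-split search `best_stump` is a hypothetical stand-in for an existing Decision Tree implementation. It is not the authors' code.

```python
import random


def best_stump(rows, labels):
    """Search every axis-aligned split; return (accuracy, feature_index, threshold).

    Hypothetical stand-in for a one-level decision tree on binary labels.
    """
    n = len(rows)
    best = (0.0, None, None)
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            hits = sum((r[j] > t) == bool(y) for r, y in zip(rows, labels))
            # Allow either side of the split to be the positive class.
            acc = max(hits, n - hits) / n
            if acc > best[0]:
                best = (acc, j, t)
    return best


random.seed(0)
X = [(random.random(), random.random()) for _ in range(200)]
# The true boundary is the diagonal x1 = x2, which is oblique to both features.
y = [int(x1 > x2) for x1, x2 in X]

# Assumed illustrative basis function: precompute x1 - x2 as an extra column.
X_ibf = [(x1, x2, x1 - x2) for x1, x2 in X]

acc_raw, _, _ = best_stump(X, y)
acc_ibf, feat_ibf, thr_ibf = best_stump(X_ibf, y)

print("single orthogonal split on raw features:", acc_raw)
print("single split with the precomputed column:", acc_ibf, "on feature", feat_ibf)
```

Because the extra column is computed before training, the split search itself is untouched; an orthogonal cut on the new column corresponds to an oblique cut in the original feature space. A non-linear combination of the two features could be precomputed the same way.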
Pages: 183-206
Number of pages: 24