Symbolic Regression Enhanced Decision Trees for Classification Tasks

Cited by: 0

Authors:

Sen Fong, Kei [1]

Motani, Mehul [1,2]

Affiliations:

[1] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore

[2] Natl Univ Singapore, Inst Hlth N 1, Inst Digital Med WisDM, Inst Data Sci, Singapore, Singapore

Source:

THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 11 | 2024

Funding:

National Research Foundation, Singapore

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We introduce a conceptually simple yet effective method to create small, compact decision trees by using splits found via Symbolic Regression (SR). Traditional decision tree (DT) algorithms partition a dataset on axis-parallel splits. When the true boundaries are not along the feature axes, DT is likely to have a complicated structure and a dense decision boundary. In this paper, we introduce SR-Enhanced DT (SREDT), a method that uses SR to increase the richness of the class of possible DT splits. We evaluate SREDT on both synthetic and real-world datasets. Despite its simplicity, our method produces surprisingly small trees that outperform both DT and oblique DT (ODT) on supervised classification tasks in terms of accuracy and F-score. We show empirically that SREDTs decrease inference time (compared to DT and ODT) and argue that they allow us to obtain more explainable descriptions of the decision process. SREDT also performs competitively against state-of-the-art tabular classification methods, including tree ensembles and deep models. Finally, we introduce a local search mechanism to improve SREDT and evaluate it on 56 PMLB datasets. This mechanism shows improved performance on 77.2% of the datasets, outperforming DT and ODT. In terms of F-score, local SREDT outperforms DT and ODT in 82.5% and 73.7% of the datasets respectively, and in terms of inference time, local SREDT requires 25.8% and 26.6% less inference time than DT and ODT respectively.
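The contrast the abstract draws between axis-parallel splits and richer expression-based splits can be illustrated with a small sketch. This is not the authors' SREDT algorithm: the circular synthetic dataset, the helper `best_split_accuracy`, and the fixed `candidates` list are all assumptions made for illustration, and the fixed list only stands in for the expression search that a real SR engine would perform. The sketch compares the best single root split on a raw feature with the best single split on a candidate expression.

```python
import random

def best_split_accuracy(values, labels):
    """Best training accuracy of one threshold split on `values` (labels are 0/1)."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    total_pos = sum(label for _, label in pairs)
    best = max(total_pos, n - total_pos) / n  # baseline: no split, majority class
    left_pos = 0
    for i in range(n - 1):
        left_pos += pairs[i][1]
        if pairs[i][0] == pairs[i + 1][0]:
            continue  # no threshold fits between equal values
        left_n = i + 1
        right_neg = (n - left_n) - (total_pos - left_pos)
        correct = left_pos + right_neg  # predict 1 on the left, 0 on the right
        best = max(best, correct / n, (n - correct) / n)  # or the reverse labelling
    return best

# Synthetic data (assumption): class 1 lies inside a circle, so the true
# boundary is not aligned with either feature axis.
random.seed(0)
points = [(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(400)]
labels = [1 if x1 * x1 + x2 * x2 < 1.5 else 0 for x1, x2 in points]

# Hypothetical candidate split expressions; a real SR engine would search
# this space instead of enumerating a fixed list.
candidates = {
    "x1": lambda x1, x2: x1,
    "x2": lambda x1, x2: x2,
    "x1 + x2": lambda x1, x2: x1 + x2,
    "x1 * x2": lambda x1, x2: x1 * x2,
    "x1*x1 + x2*x2": lambda x1, x2: x1 * x1 + x2 * x2,
}

scores = {
    name: best_split_accuracy([f(x1, x2) for x1, x2 in points], labels)
    for name, f in candidates.items()
}
axis_acc = max(scores["x1"], scores["x2"])  # best axis-parallel root split
best_name = max(scores, key=scores.get)     # best split over all expressions
expr_acc = scores[best_name]

print(f"best axis-parallel split accuracy: {axis_acc:.3f}")
print(f"best expression split: {best_name} (accuracy {expr_acc:.3f})")
```

On this toy data a single split on the quadratic expression separates the classes at the root, whereas a tree restricted to thresholds on `x1` or `x2` would need several levels to approximate the circle. Real datasets rarely admit such a clean expression, so the size of the gap here should not be read as representative of the paper's results.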

Pages: 12033-12042

Page count: 10
