MAPTree: Beating "Optimal" Decision Trees with Bayesian Decision Trees

被引:0
作者
Sullivan, Colin [1 ]
Tiwari, Mo [1 ]
Thrun, Sebastian [1 ]
机构
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
来源
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 8 | 2024年
关键词
HEURISTIC-SEARCH;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Decision trees remain one of the most popular machine learning models today, largely due to their out-of-the-box performance and interpretability. In this work, we present a Bayesian approach to decision tree induction via maximum a posteriori inference of a posterior distribution over trees. We first demonstrate a connection between maximum a posteriori inference of decision trees and AND/OR search. Using this connection, we propose an AND/OR search algorithm, dubbed MAPTree, which is able to recover the maximum a posteriori tree. Lastly, we demonstrate the empirical performance of the maximum a posteriori tree both on synthetic data and in real world settings. On 16 real world datasets, MAPTree either outperforms baselines or demonstrates comparable performance but with much smaller trees. On a synthetic dataset, MAPTree also demonstrates greater robustness to noise and better generalization than existing approaches. Finally, MAPTree recovers the maxiumum a posteriori tree faster than existing sampling approaches and, in contrast with those algorithms, is able to provide a certificate of optimality. The code for our experiments is available at https://github.com/ThrunGroup/maptree.
引用
收藏
页码:9019 / 9026
页数:8
相关论文
共 25 条
  • [1] Aglin G, 2020, AAAI CONF ARTIF INTE, V34, P3146
  • [2] ADMISSIBLE HEURISTIC-SEARCH IN AND OR GRAPHS
    BAGCHI, A
    MAHANTI, A
    [J]. THEORETICAL COMPUTER SCIENCE, 1983, 24 (02) : 207 - 219
  • [3] Optimal classification trees
    Bertsimas, Dimitris
    Dunn, Jack
    [J]. MACHINE LEARNING, 2017, 106 (07) : 1039 - 1082
  • [4] Breiman L., 2001, MACH LEARN, V45, P5
  • [5] XGBoost: A Scalable Tree Boosting System
    Chen, Tianqi
    Guestrin, Carlos
    [J]. KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2016, : 785 - 794
  • [6] Bayesian CART model search
    Chipman, HA
    George, EI
    McCulloch, RE
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1998, 93 (443) : 935 - 948
  • [7] Denison DGT, 1998, BIOMETRIKA, V85, P363
  • [8] Devroye L., 1995, INT S GRAPH DRAW NET
  • [9] Geels V., 2022, Journal of Statistical Computation and Simulation, V1, P22
  • [10] Grinsztajn L., 2022, WHY TREE BASED MODEL