Investigating the Evolution of Tree Boosting Models with Visual Analytics

Cited by: 10
Authors
Wang, Junpeng [1 ]
Zhang, Wei [1 ]
Wang, Liang [1 ]
Yang, Hao [1 ]
Affiliation
[1] Visa Res, Palo Alto, CA 94306 USA
Source
2021 IEEE 14TH PACIFIC VISUALIZATION SYMPOSIUM (PACIFICVIS 2021) | 2021
Keywords
NEURAL-NETWORKS;
DOI
10.1109/PacificVis52677.2021.00032
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Tree boosting models are widely adopted predictive models and have demonstrated performance superior to other conventional models and even deep learning models, especially since the recent release of their parallel and distributed implementations, e.g., XGBoost, LightGBM, and CatBoost. Tree boosting uses a group of sequentially generated weak learners (i.e., decision trees), each of which learns from the mistakes of its predecessor, to push the model's decision boundary towards the true boundary. As the number of trees keeps increasing over training, it is important to reveal how the newly added trees change the predictions of individual data instances, and how the impacts of different data features evolve. To accomplish these goals, in this paper, we introduce a new design of the temporal confusion matrix, providing users with an effective interface to track data instances' predictions across the tree boosting process. Also, we present an improved visualization to better illustrate and compare the impacts of individual data features (based on their SHAP values) across training iterations. Integrating these components with a tree structure visualization component, we propose a visual analytics system for tree boosting models. Through case studies with domain experts using real-world datasets, we validated the system's effectiveness.
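The idea of tracking how each newly added tree changes individual predictions can be illustrated with a minimal sketch. This is not the paper's system or its temporal confusion matrix design. It assumes a toy one-feature binary task, depth-1 regression trees (stumps) fitted to log-loss residuals, and hypothetical helper names (`fit_stump`, `boost_and_track`). It records one set of confusion counts per boosting iteration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree on one feature by least squares."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue  # a split must leave data on both sides
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def boost_and_track(xs, ys, n_trees=3, lr=1.0):
    """Add stumps one at a time; after each, record confusion counts and labels."""
    scores = [0.0] * len(xs)           # raw (log-odds) score per instance
    history, labels_per_iter = [], []
    for _ in range(n_trees):
        # Residuals = negative gradient of the log loss w.r.t. the raw score.
        residuals = [y - sigmoid(s) for y, s in zip(ys, scores)]
        stump = fit_stump(xs, residuals)
        scores = [s + lr * stump(x) for s, x in zip(scores, xs)]
        preds = [1 if sigmoid(s) >= 0.5 else 0 for s in scores]
        counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
        for y, p in zip(ys, preds):
            counts[("T" if y == p else "F") + ("P" if p == 1 else "N")] += 1
        history.append(counts)
        labels_per_iter.append(preds)
    return history, labels_per_iter

# Toy data (assumed for illustration): the class flips between x = 3 and x = 4.
history, labels_per_iter = boost_and_track([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1])
for i, counts in enumerate(history, start=1):
    print(i, counts)
```

Comparing consecutive rows of `labels_per_iter` shows which instances change cell (for example from FN to TP) when a tree is added, which is the per-instance movement a temporal confusion matrix is meant to expose.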
Pages: 186-195
Page count: 10
Related papers
35 in total
  • [11] Hinterreiter A., 2020, IEEE T VIS COMPUT GR
  • [12] GBRTVis: online analysis of gradient boosting regression tree
    Huang, Yifei
    Liu, Yuhua
    Li, Chenhui
    Wang, Changbo
    [J]. JOURNAL OF VISUALIZATION, 2019, 22 (01) : 125 - 140
  • [13] ActiVis: Visual Exploration of Industry-Scale Deep Neural Network Models
    Kahng, Minsuk
    Andrews, Pierre Y.
    Kalro, Aditya
    Chau, Duen Horng
    [J]. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2018, 24 (01) : 88 - 97
  • [14] Ke GL, 2017, ADV NEUR IN, V30
  • [15] CNNPruner: Pruning Convolutional Neural Networks with Visual Analytics
    Li, Guan
    Wang, Junpeng
    Shen, Han-Wei
    Chen, Kaixin
    Shan, Guihua
    Lu, Zhonghua
    [J]. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2021, 27 (02) : 1364 - 1373
  • [16] Visual Diagnosis of Tree Boosting Methods
    Liu, Shixia
    Xiao, Jiannan
    Liu, Junlin
    Wang, Xiting
    Wu, Jing
    Zhu, Jun
    [J]. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2018, 24 (01) : 163 - 173
  • [17] Liu XT, 2016, IEEE CONF VIS ANAL, P71, DOI 10.1109/VAST.2016.7883513
  • [18] Lundberg SM, 2017, ADV NEUR IN, V30
  • [19] Explainable machine-learning predictions for the prevention of hypoxaemia during surgery
    Lundberg, Scott M.
    Nair, Bala
    Vavilala, Monica S.
    Horibe, Mayumi
    Eisses, Michael J.
    Adams, Trevor
    Liston, David E.
    Low, Daniel King-Wai
    Newman, Shu-Fang
    Kim, Jerry
    Lee, Su-In
    [J]. NATURE BIOMEDICAL ENGINEERING, 2018, 2 (10) : 749 - 760
  • [20] Lundberg SM, 2018, Consistent Individualized Feature Attribution for Tree Ensembles