Investigating the Evolution of Tree Boosting Models with Visual Analytics

Cited by: 10
Authors
Wang, Junpeng [1]
Zhang, Wei [1]
Wang, Liang [1]
Yang, Hao [1]
Affiliations
[1] Visa Res, Palo Alto, CA 94306 USA
Source
2021 IEEE 14TH PACIFIC VISUALIZATION SYMPOSIUM (PACIFICVIS 2021) | 2021
Keywords
NEURAL-NETWORKS;
DOI
10.1109/PacificVis52677.2021.00032
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Tree boosting models are widely adopted predictive models and have demonstrated better performance than other conventional models and even deep learning models, especially since the recent release of their parallel and distributed implementations, e.g., XGBoost, LightGBM, and CatBoost. Tree boosting uses a group of sequentially generated weak learners (i.e., decision trees), each of which learns from the mistakes of its predecessor, to push the model's decision boundary towards the true boundary. As the number of trees keeps increasing over training, it is important to reveal how the newly added trees change the predictions of individual data instances, and how the impacts of different data features evolve. To accomplish these goals, in this paper, we introduce a new design of the temporal confusion matrix, providing users with an effective interface to track data instances' predictions across the tree boosting process. Also, we present an improved visualization to better illustrate and compare the impacts of individual data features (based on their SHAP values) across training iterations. Integrating these components with a tree structure visualization component, we propose a visual analytics system for tree boosting models. Through case studies with domain experts using real-world datasets, we validated the system's effectiveness.
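The boosting idea summarized in the abstract can be illustrated with a small sketch. This is not the paper's system, and it does not use XGBoost, LightGBM, or CatBoost: it assumes a toy one-dimensional dataset, squared-error residual fitting, and depth-one trees (stumps). All function names (`fit_stump`, `confusion`, `boost`) and the data are hypothetical. The per-iteration confusion counts only loosely mirror the idea of tracking predictions across boosting iterations; they are not the temporal confusion matrix design the paper introduces.

```python
from typing import Dict, List, Tuple

Stump = Tuple[float, float, float]  # (threshold, left_value, right_value)


def fit_stump(xs: List[float], residuals: List[float]) -> Stump:
    """Choose the threshold whose two leaf means best fit the residuals."""
    best: Stump = (xs[0], 0.0, 0.0)
    best_err = float("inf")
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv = sum(left) / len(left) if left else 0.0
        rv = sum(right) / len(right) if right else 0.0
        err = sum((r - (lv if x <= t else rv)) ** 2
                  for x, r in zip(xs, residuals))
        if err < best_err:
            best_err, best = err, (t, lv, rv)
    return best


def predict_stump(stump: Stump, x: float) -> float:
    t, lv, rv = stump
    return lv if x <= t else rv


def confusion(scores: List[float], ys: List[int],
              cutoff: float = 0.5) -> Dict[str, int]:
    """Count TP/FP/TN/FN after thresholding the current scores."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for s, y in zip(scores, ys):
        pred = 1 if s >= cutoff else 0
        key = ("T" if pred == y else "F") + ("P" if pred == 1 else "N")
        counts[key] += 1
    return counts


def boost(xs: List[float], ys: List[int], n_trees: int = 5,
          lr: float = 0.5) -> Tuple[List[float], List[Dict[str, int]]]:
    """Add stumps one at a time, each fitted to the current residuals."""
    scores = [sum(ys) / len(ys)] * len(xs)  # constant initial prediction
    history = [confusion(scores, ys)]
    for _ in range(n_trees):
        residuals = [y - s for y, s in zip(ys, scores)]  # current mistakes
        stump = fit_stump(xs, residuals)
        scores = [s + lr * predict_stump(stump, x)
                  for s, x in zip(scores, xs)]
        history.append(confusion(scores, ys))  # snapshot per iteration
    return scores, history


# Toy data (made up for illustration).
xs = [0.1, 0.4, 0.35, 0.8, 0.9, 0.6, 0.2, 0.7]
ys = [0, 0, 1, 1, 1, 0, 0, 1]
scores, history = boost(xs, ys)
for i, counts in enumerate(history):
    print(i, counts)
```

Each entry in `history` is a snapshot of how the instances are classified after one more tree is added, so comparing consecutive entries shows how a newly added tree shifts predictions between the four cells.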
Pages: 186-195
Number of pages: 10