Interactive Visualization of Ensemble Decision Trees based on the Relations among Weak Learners

Cited by: 2
Authors
Kashiyama, Miyu [1]
Hirokawa, Masakazu [2]
Matsuno, Ryuta [2]
Sakuma, Keita [2]
Itoh, Takayuki [1]
Affiliations
[1] Ochanomizu Univ, Tokyo, Japan
[2] NEC Corp Ltd, Tokyo, Japan
Source
2024 28TH INTERNATIONAL CONFERENCE INFORMATION VISUALISATION, IV 2024 | 2024
Keywords
Visualization; Machine learning; Ensemble model;
DOI
10.1109/IV64223.2024.00028
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Ensemble learning, which combines multiple weak learners for enhanced performance, is widely used but suffers from low interpretability and explainability. This leads to challenges not only in operational aspects such as model maintenance and quality assurance but also in addressing societal needs such as fairness and privacy. To tackle this, we propose a new visualization method that focuses on the relationship among weak learners in ensemble models to improve understanding of the model structure and its learning processes. In this paper, we define the relation between weak learners in gradient-boosting decision trees based on a "common sample" and propose a visualization method that represents this relation as a three-dimensional graph structure. We visualized ensemble models trained on synthetic data sets that include typical distribution shifts and on real-world open data sets. The results demonstrate that this approach makes the behavior and structure of ensemble models comprising multiple weak learners easier to understand, and that visualizing changes during the training and validation processes facilitates the identification of overfitting and underfitting.
Pages: 105-110
Number of pages: 6