Visualisation of Random Forest classification

Cited by: 1
Authors
Macas, Catarina [1]
Campos, Joao R. [1]
Lourenco, Nuno [1]
Machado, Penousal [1]
Affiliation
[1] Univ Coimbra, Ctr Informat & Syst, Coimbra 3030290, Portugal
Keywords
Visual analytics; information visualisation; Random Forests; Decision Trees
DOI
10.1177/14738716241260745
Chinese Library Classification number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
Decision Trees (DTs) stand out as a prevalent choice among supervised Machine Learning algorithms. These algorithms form binary structures, effectively dividing data into smaller segments based on distinct rules. Consequently, DTs serve as a learning mechanism to identify optimal rules for the separation and classification of all elements within a dataset. Due to their resemblance to rule-based decisions, DTs are easy to interpret. Additionally, their minimal need for data pre-processing and versatility in handling various data types make DTs highly practical and user-friendly across diverse domains. Nevertheless, when confronted with extensive datasets or ensembles involving multiple trees, such as Random Forests, their analysis can become challenging. To facilitate the examination and validation of these models, we have developed a visual tool that incorporates a range of visualisations providing both an overview and detailed insights into a set of DTs. Our tool is designed to offer diverse perspectives on the same data, enabling a deeper understanding of the decision-making process. This article outlines our design approach, introduces various visualisation models, and details the iterative validation process. We validate our methodology through a telecommunications use case, specifically employing the visual tool to decipher how a DT-based model determines the optimal communication channel (i.e. phone call, email, SMS) for a telecommunication operator to use when contacting a client.
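
As an illustration of the ideas in the abstract (binary rule-based splits in a single DT, and several trees voting in an ensemble), here is a minimal Python sketch. It is not the authors' model or tool: the three trees, the feature names (`age`, `prior_calls`, `monthly_spend`), and the thresholds are hypothetical and hand-written, and only the class labels (phone call, email, SMS) come from the abstract's use case.

```python
from collections import Counter

# Hypothetical hand-written trees; features and thresholds are illustrative only.
# An internal node is (feature, threshold, left_subtree, right_subtree);
# a leaf is simply the class label as a string.
TREE_A = ("age", 40, "SMS", ("prior_calls", 2, "email", "phone call"))
TREE_B = ("prior_calls", 1, ("age", 30, "SMS", "email"), "phone call")
TREE_C = ("monthly_spend", 50, "email", "phone call")
FOREST = [TREE_A, TREE_B, TREE_C]


def predict_tree(node, sample, path=None):
    """Follow binary rules from the root to a leaf.

    Returns the predicted label and the list of rules applied on the way,
    which is the kind of decision path a reader would want to inspect.
    """
    path = [] if path is None else path
    if isinstance(node, str):  # reached a leaf
        return node, path
    feature, threshold, left, right = node
    if sample[feature] <= threshold:
        return predict_tree(left, sample, path + [f"{feature} <= {threshold}"])
    return predict_tree(right, sample, path + [f"{feature} > {threshold}"])


def predict_forest(forest, sample):
    """Majority vote over the labels predicted by each tree."""
    votes = Counter(predict_tree(tree, sample)[0] for tree in forest)
    return votes.most_common(1)[0][0], dict(votes)


# A hypothetical client record.
client = {"age": 52, "prior_calls": 3, "monthly_spend": 35}
label, votes = predict_forest(FOREST, client)
print(label, votes)
```

Even with three small trees, explaining the vote means reading three separate rule paths; with the hundreds of trees typical of a Random Forest, this is the analysis burden that the abstract's visual tool is meant to ease.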
Pages: 312-327
Page count: 16