A Hierarchical Tree Distance Measure for Classification

Cited by: 2
Authors
Caspersen, Kent Munthe [1]
Madsen, Martin Bjeldbak [1]
Eriksen, Andreas Berre [1]
Thiesson, Bo [1]
Affiliations
[1] Aalborg Univ, Dept Comp Sci, Aalborg, Denmark
Source
ICPRAM: PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION APPLICATIONS AND METHODS | 2017
Keywords
Machine Learning; Multi-class Classification; Hierarchical Classification; Tree Distance Measures; Multi-output Regression; Multidimensional Scaling; Process Automation; UNSPSC;
DOI
10.5220/0006198505020509
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we explore the problem of classification where class labels exhibit a hierarchical tree structure. Many multiclass classification algorithms assume a flat label space, where hierarchical structures are ignored. We take advantage of hierarchical structures and the interdependencies between labels. In our setting, labels are structured in a product and service hierarchy, with a focus on spend analysis. We define a novel distance measure between classes in a hierarchical label tree. This measure penalizes paths through high levels in the hierarchy. We use a known classification algorithm that aims to minimize distance between labels, given any symmetric distance measure. The approach is global in that it constructs a single classifier for an entire hierarchy by embedding hierarchical distances into a lower-dimensional space. Results show that combining our novel distance measure with the classifier induces a trade-off between accuracy and lower hierarchical distances on misclassifications. This is useful in a setting where erroneous predictions vastly change the context of a label.
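The abstract does not state the formula for the distance measure, so the following is only an illustrative sketch of the general idea: a symmetric distance between labels in a tree in which edges closer to the root cost more. The function name `tree_distance`, the `base` parameter, the exponential edge weighting, and the label paths are all assumptions made for this example and are not taken from the paper.

```python
def tree_distance(a, b, base=2.0):
    """Weighted path length between two labels given as root-to-node paths.

    Assumption (not from the paper): the edge leading down to depth d
    costs base ** (max_depth - d), so edges nearer the root cost more.
    """
    max_depth = max(len(a), len(b))
    # Length of the shared prefix equals the depth of the lowest common ancestor.
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    # Sum the cost of every edge from the common ancestor down to each label.
    cost = 0.0
    for path in (a, b):
        for depth in range(common + 1, len(path) + 1):
            cost += base ** (max_depth - depth)
    return cost


# Hypothetical three-level label paths (not real product or service codes).
l1 = ("A", "A1", "A1a")
l2 = ("A", "A1", "A1b")   # shares two levels with l1
l3 = ("A", "A2", "A2a")   # shares only the top level with l1
l4 = ("B", "B1", "B1a")   # shares nothing with l1

for other in (l2, l3, l4):
    print(l1, other, tree_distance(l1, other))
```

Under this assumed weighting, a misclassification that crosses the top level of the hierarchy is far more costly than one that stays within the same lowest-level parent, which is the behaviour the abstract describes qualitatively.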
Pages: 502-509
Page count: 8