Pitfalls of assessing extracted hierarchies for multi-class classification

Cited by: 3
Authors
del Moral, Pablo [1 ]
Nowaczyk, Slawomir [1 ]
Sant'Anna, Anita [1 ]
Pashami, Sepideh [1 ]
Affiliations
[1] Halmstad Univ, CAISR, Kristian IVs Vag 3, S-30118 Halmstad, Sweden
Keywords
Hierarchical multi-class classification; Multi-class classification; Class hierarchies; Support vector machines
DOI
10.1016/j.patcog.2022.109225
Chinese Library Classification (CLC) number
TP18 [Theory of artificial intelligence];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Using hierarchies of classes is one of the standard methods for solving multi-class classification problems. In the literature, selecting the right hierarchy is considered to play a key role in improving classification performance. Although different methods have been proposed, there is still a lack of understanding of what makes a hierarchy good and what makes a hierarchy-extraction method perform better or worse. To this end, we analyze and compare some of the most popular approaches to extracting hierarchies. We identify common pitfalls that may lead practitioners to draw misleading conclusions about their methods. To address some of these problems, we demonstrate that random hierarchies are an appropriate benchmark for assessing how a hierarchy's quality affects classification performance. In particular, we show how the hierarchy's quality can become irrelevant depending on the experimental setup: with sufficiently powerful classifiers, the final performance is not affected by the quality of the hierarchy. We also show how comparing the effect of hierarchies against non-hierarchical approaches might incorrectly indicate their superiority. Our results confirm that datasets with a high number of classes generally present complex structures in how these classes relate to each other. In such datasets, the right hierarchy can dramatically improve classification performance. (c) 2022 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Pages: 13
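The random-hierarchy benchmark described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it builds a random binary tree over the class labels (a random nested dichotomy), trains a binary classifier at each internal node, and routes test samples down the tree. All function names are hypothetical, and `LogisticRegression` stands in for whatever base classifier one would actually benchmark.

```python
import random
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

def build_random_hierarchy(classes, rng):
    """Recursively split the class set into two random halves (a random dichotomy)."""
    classes = list(classes)
    if len(classes) == 1:
        return classes[0]  # leaf: a single class label
    rng.shuffle(classes)
    k = len(classes) // 2
    return (build_random_hierarchy(classes[:k], rng),
            build_random_hierarchy(classes[k:], rng))

def leaves(node):
    """Collect the class labels under a (sub)tree."""
    if not isinstance(node, tuple):
        return {node}
    return leaves(node[0]) | leaves(node[1])

def train_node(node, X, y):
    """Fit a binary classifier at each internal node (left subtree vs right subtree)."""
    if not isinstance(node, tuple):
        return node
    left, right = leaves(node[0]), leaves(node[1])
    mask = np.isin(y, list(left | right))  # only samples reaching this node
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X[mask], np.isin(y[mask], list(right)).astype(int))
    return (train_node(node[0], X, y), train_node(node[1], X, y), clf)

def predict_one(model, x):
    """Route one sample down the tree until a leaf class is reached."""
    while isinstance(model, tuple):
        left, right, clf = model
        model = right if clf.predict(x.reshape(1, -1))[0] else left
    return model

X, y = load_iris(return_X_y=True)
rng = random.Random(0)
tree = train_node(build_random_hierarchy(np.unique(y), rng), X, y)
preds = np.array([predict_one(tree, x) for x in X])
print("training accuracy:", (preds == y).mean())
```

Averaging this pipeline over many random seeds gives a baseline against which extracted (data-driven) hierarchies can be compared, which is the role random hierarchies play in the paper's experimental setup.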