Prototype selection for multi-label data based on label correlation

Cited by: 3

Authors:

Li, Haikun [1]

Fang, Min [1]

Li, Hang [1]

Wang, Peng [1]

Affiliations:

[1] Xidian Univ, Sch Comp Sci & Technol, Xian, Shaanxi, Peoples R China

Source:

NEURAL COMPUTING & APPLICATIONS | 2024, Vol. 36, No. 05

Funding:

National Natural Science Foundation of China;

Keywords:

Multi-label learning; Multi-label data; Instance reduction; Prototype selection; Label correlation; CLASSIFICATION; KNN;

DOI:

10.1007/s00521-023-08617-7

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In multi-label learning, the training data is typically large-scale and contains numerous noisy and redundant instances. Directly inducing a classifier with raw data can result in higher memory overhead and lower classification performance. One effective method to alleviate these problems is prototype selection, which reduces the number of instances. However, most existing multi-label prototype selection algorithms transform the multi-label data set into a single-label one using problem transformation methods, which may ignore the label correlation and lead to suboptimal prototype selection. To overcome this limitation, we propose a new method called CO-GCNN, i.e., multi-label prototype selection with Co-Occurrence and Generalized Condensed Nearest Neighbor. The CO-GCNN represents label correlation by calculating the co-occurrence rate of pairwise labels and dividing the original data into positive and negative classes. Then, the prototype selection process is performed using the generalized condensed nearest neighbor rule to obtain a reduced set of instances. Experiments on six multi-label benchmark datasets show that the classifier derived from the reduced set outperforms the classifier derived from the original data, confirming the effectiveness of the proposed method.
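
The abstract mentions a co-occurrence rate for pairwise labels but does not give its formula. As a minimal sketch only, the Python below assumes the rate for a label pair is the number of instances carrying both labels divided by the number carrying at least one of them; this definition, the function name `cooccurrence_rates`, and the toy label matrix `Y` are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def cooccurrence_rates(Y):
    """Return {(i, j): rate} for every label pair with i < j.

    Y is a list of binary label vectors, one per instance.
    ASSUMPTION (not from the paper): rate = |both present| / |either present|.
    """
    n_labels = len(Y[0])
    rates = {}
    for i, j in combinations(range(n_labels), 2):
        both = sum(1 for row in Y if row[i] and row[j])
        either = sum(1 for row in Y if row[i] or row[j])
        # A pair that never appears in the data gets rate 0.0.
        rates[(i, j)] = both / either if either else 0.0
    return rates


# Hypothetical toy data: 4 instances, 3 labels.
Y = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 1],
]
rates = cooccurrence_rates(Y)
for pair, rate in sorted(rates.items()):
    print(pair, round(rate, 3))
```

Under this assumed definition, labels 0 and 1 appear together often in the toy data and so receive a high rate, while labels 1 and 2 never co-occur. How CO-GCNN uses such rates to split the data into positive and negative classes is described in the paper itself and is not reproduced here.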

Pages: 2121-2130

Page count: 10

References (20 in total):

[1] Study of data transformation techniques for adapting single-label prototype selection algorithms to multi-label learning
Arnaiz-Gonzalez, Alvar
Diez-Pastor, Jose-Francisco
Rodriguez, Juan J.
Garcia-Osorio, Cesar
[J]. EXPERT SYSTEMS WITH APPLICATIONS, 2018, 109 : 114 - 130
[2] Prototypes Generation from Multi-label Datasets Based on Granular Computing
Bello, Marilyn
Napoles, Gonzalo
Vanhoof, Koen
Bello, Rafael
[J]. PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS (CIARP 2019), 2019, 11896 : 142 - 151
[3] Learning multi-label scene classification
Boutell, MR
Luo, JB
Shen, XP
Brown, CM
[J]. PATTERN RECOGNITION, 2004, 37 (09) : 1757 - 1771
[4] Improving kNN multi-label classification in Prototype Selection scenarios using class proposals
Calvo-Zaragoza, Jorge
Valero-Mas, Jose J.
Rico-Juan, Juan R.
[J]. PATTERN RECOGNITION, 2015, 48 (05) : 1608 - 1622
[5] Charte F, 2014, LECT NOTES COMPUT SC, V8669, P1, DOI 10.1007/978-3-319-10840-7_1
[6] Chou CH, 2006, INT C PATT RECOG, P556
[7] Elisseeff A, 2002, ADV NEUR IN, V14, P681
[8] Multilabel classification via calibrated label ranking
Fuernkranz, Johannes
Huellermeier, Eyke
Mencia, Eneldo Loza
Brinker, Klaus
[J]. MACHINE LEARNING, 2008, 73 (02) : 133 - 153
[9] Label ranking by learning pairwise preferences
Huellermeier, Eyke
Fuernkranz, Johannes
Cheng, Weiwei
Brinker, Klaus
[J]. ARTIFICIAL INTELLIGENCE, 2008, 172 (16-17) : 1897 - 1916
[10] Joint Label-Specific Features and Correlation Information for Multi-Label Learning
Jia, Xiu-Yi
Zhu, Sai-Sai
Li, Wei-Wei
[J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2020, 35 (02) : 247 - 258
