Prototype selection for multi-label data based on label correlation

Cited by: 3
Authors
Li, Haikun [1]
Fang, Min [1]
Li, Hang [1]
Wang, Peng [1]
Affiliation
[1] Xidian University, School of Computer Science and Technology, Xi'an, Shaanxi, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Multi-label learning; Multi-label data; Instance reduction; Prototype selection; Label correlation; Classification; kNN
DOI
10.1007/s00521-023-08617-7
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In multi-label learning, the training data is typically large-scale and contains numerous noisy and redundant instances. Inducing a classifier directly from the raw data can increase memory overhead and lower classification performance. One effective way to alleviate these problems is prototype selection, which reduces the number of instances. However, most existing multi-label prototype selection algorithms first convert the multi-label data set into a single-label one using problem transformation methods, which may ignore label correlation and lead to suboptimal prototype selection. To overcome this limitation, we propose a new method called CO-GCNN, i.e., multi-label prototype selection with Co-Occurrence and Generalized Condensed Nearest Neighbor. CO-GCNN represents label correlation by calculating the co-occurrence rate of pairwise labels and divides the original data into positive and negative classes. Prototype selection is then performed with the generalized condensed nearest neighbor rule to obtain a reduced set of instances. Experiments on six multi-label benchmark datasets show that the classifier derived from the reduced set outperforms the classifier derived from the original data, confirming the effectiveness of the proposed method.
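The abstract names two building blocks: a pairwise label co-occurrence rate and a condensed nearest neighbor rule. The sketch below illustrates both under stated assumptions and is not the authors' implementation. The abstract does not define the co-occurrence rate, so a Jaccard-style ratio is assumed here. The selection step shown is the classic condensed nearest neighbor rule, not the generalized variant the paper uses, and the way the paper splits data into positive and negative classes is not reproduced. The function names `cooccurrence_rate` and `condensed_nn` are hypothetical.

```python
import numpy as np


def cooccurrence_rate(Y):
    """Pairwise label co-occurrence rate (ASSUMED Jaccard-style definition).

    Y is an (n_instances, n_labels) binary indicator matrix.
    Entry [i, j] = (#instances with both labels) / (#instances with either).
    """
    Y = np.asarray(Y, dtype=int)
    both = Y.T @ Y                          # both[i, j]: instances carrying i and j
    counts = np.diag(both)                  # counts[i]: instances carrying label i
    either = counts[:, None] + counts[None, :] - both
    # Label pairs that never occur get a rate of 0 instead of a division by zero.
    return np.divide(both, either, out=np.zeros(both.shape), where=either > 0)


def condensed_nn(X, y):
    """Classic condensed nearest neighbor rule on single-label data.

    Stand-in for the generalized rule: an instance is added to the prototype
    set whenever its nearest current prototype has a different class.
    Returns the sorted indices of the retained instances.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    keep = [0]                              # seed the prototype set with instance 0
    changed = True
    while changed:                          # repeat until a full pass adds nothing
        changed = False
        for i in range(len(X)):
            if i in keep:
                continue
            dists = np.linalg.norm(X[keep] - X[i], axis=1)
            nearest = keep[int(np.argmin(dists))]
            if y[nearest] != y[i]:          # misclassified by current prototypes
                keep.append(i)
                changed = True
    return sorted(keep)


if __name__ == "__main__":
    Y = [[1, 1, 0], [1, 0, 0], [0, 1, 1], [1, 1, 0]]
    print(cooccurrence_rate(Y))
    print(condensed_nn([[0.0], [0.1], [1.0], [1.1]], [0, 0, 1, 1]))
```

In this toy setting, labels 0 and 1 co-occur in two of the four instances that carry either label, and the condensed set keeps one prototype per class. How CO-GCNN combines the co-occurrence rates with the positive and negative class assignment is described only in the full paper.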
Pages: 2121-2130
Page count: 10