Data reduction via multi-label prototype generation

Cited by: 4
Authors
Ougiaroglou, Stefanos [1]
Filippakis, Panagiotis [1]
Fotiadou, Georgia [1]
Evangelidis, Georgios [2]
Affiliations
[1] Int Hellen Univ, Dept Informat & Elect Engn, Sch Engn, Sindos 57400, Greece
[2] Univ Macedonia, Dept Appl Informat, Sch Informat Sci, Thessaloniki 54636, Greece
Keywords
Multi-label classification; Data reduction techniques; Prototype generation; k-NN classification; Binary relevance; RHC; RSP3; BRkNN; INSTANCE SELECTION; LOCAL SETS
DOI
10.1016/j.neucom.2023.01.004
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A very common practice to speed up instance-based classifiers is to reduce the size of their training set, that is, to replace it with a condensing set, in the hope that their accuracy will not worsen. This can be achieved by applying a Prototype Selection or Prototype Generation algorithm, also referred to as a Data Reduction Technique. Most of these techniques cannot be applied to multi-label problems, where an instance may belong to more than one class. Reduction through Homogeneous Clustering (RHC) and Reduction by Space Partitioning (RSP3) are parameter-free single-label Prototype Generation algorithms. Both are based on recursive data partitioning procedures that identify homogeneous clusters of training data, which they replace with their representatives. This paper proposes variations of these algorithms for multi-label training datasets. The proposed methods generate multi-label prototypes and inherit all the desirable properties of their single-label versions. They treat a cluster as homogeneous when its instances share at least one common label. An experimental study based on nine multi-label datasets shows that the proposed algorithms achieve good reduction rates without negatively affecting classification accuracy.
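The homogeneity criterion described in the abstract (a cluster is homogeneous when its instances share at least one common label) can be illustrated with a small sketch. This is not the authors' implementation. It assumes a simplified RHC-style recursion in which a non-homogeneous cluster is split by a single nearest-seed assignment pass, with one seed per label (the mean of the instances carrying that label), instead of a full k-means run. The names `common_labels`, `split_by_label_means`, and `reduce_multilabel` are hypothetical, and labelling each prototype with the labels common to its cluster is likewise an assumption, since the record does not say how prototype labels are assigned.

```python
from typing import List, Set, Tuple

Instance = Tuple[List[float], Set[str]]  # (feature vector, label set)


def common_labels(cluster: List[Instance]) -> Set[str]:
    """Return the labels shared by every instance in the cluster."""
    if not cluster:
        return set()
    return set.intersection(*(set(labels) for _, labels in cluster))


def mean(points: List[List[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def split_by_label_means(cluster: List[Instance]) -> List[List[Instance]]:
    """Assumption: one seed per label, then one nearest-seed assignment pass."""
    labels = sorted(set().union(*(ls for _, ls in cluster)))
    if not labels:
        return [cluster]  # nothing to seed with, so no split is possible
    seeds = [mean([x for x, ls in cluster if lab in ls]) for lab in labels]
    groups: List[List[Instance]] = [[] for _ in seeds]
    for x, ls in cluster:
        dists = [sum((a - b) ** 2 for a, b in zip(x, s)) for s in seeds]
        groups[dists.index(min(dists))].append((x, ls))
    return [g for g in groups if g]


def reduce_multilabel(cluster: List[Instance]) -> List[Instance]:
    """Recursively replace homogeneous clusters with one prototype each."""
    if not cluster:
        return []
    shared = common_labels(cluster)
    if shared:
        # Homogeneous: at least one label is common to all instances.
        # Assumption: the prototype carries exactly those common labels.
        return [(mean([x for x, _ in cluster]), shared)]
    parts = split_by_label_means(cluster)
    if len(parts) < 2:
        # Assumption: if the cluster cannot be split, keep its instances.
        return [(list(x), set(ls)) for x, ls in cluster]
    prototypes: List[Instance] = []
    for part in parts:
        prototypes.extend(reduce_multilabel(part))
    return prototypes


# Toy training set: two well-separated groups with disjoint label sets.
toy = [
    ([0.0, 0.0], {"a", "b"}),
    ([0.0, 1.0], {"a", "b"}),
    ([10.0, 10.0], {"c"}),
    ([10.0, 11.0], {"c"}),
]
prototypes = reduce_multilabel(toy)
```

In this toy run the four training instances collapse to two prototypes, one per group. Because every split yields at least two smaller non-empty groups, the recursion always terminates.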
Pages: 1 / 8
Page count: 8
Related papers
25 in total
  • [1] Study of data transformation techniques for adapting single-label prototype selection algorithms to multi-label learning
    Arnaiz-Gonzalez, Alvar
    Diez-Pastor, Jose-Francisco
    Rodriguez, Juan J.
    Garcia-Osorio, Cesar
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2018, 109 : 114 - 130
  • [2] Local sets for multi-label instance selection
    Arnaiz-Gonzalez, Alvar
    Diez-Pastor, Jose-Francisco
    Rodriguez, Juan J.
    Garcia-Osorio, Cesar
    [J]. APPLIED SOFT COMPUTING, 2018, 68 : 651 - 666
  • [3] Advances in instance selection for instance-based learning algorithms
    Brighton, H
    Mellish, C
    [J]. DATA MINING AND KNOWLEDGE DISCOVERY, 2002, 6 (02) : 153 - 172
  • [4] Charte F, 2014, LECT NOTES COMPUT SC, V8669, P1, DOI 10.1007/978-3-319-10840-7_1
  • [5] A sample set condensation algorithm for the class sensitive artificial neural network
    Chen, CH
    Jozwik, A
    [J]. PATTERN RECOGNITION LETTERS, 1996, 17 (08) : 819 - 823
  • [6] NEAREST NEIGHBOR PATTERN CLASSIFICATION
    COVER, TM
    HART, PE
    [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 1967, 13 (01) : 21 - +
  • [7] Prototype Selection for Nearest Neighbor Classification: Taxonomy and Empirical Study
    Garcia, Salvador
    Derrac, Joaquin
    Ramon Cano, Jose
    Herrera, Francisco
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2012, 34 (03) : 417 - 435
  • [8] Fast data reduction by space partitioning via convex hull and MBR computation
    Giorginis, Thomas
    Ougiaroglou, Stefanos
    Evangelidis, Georgios
    Dervos, Dimitris A.
    [J]. PATTERN RECOGNITION, 2022, 126
  • [9] Editing training data for multi-label classification with the k-nearest neighbor rule
    Kanj, Sawsan
    Abdallah, Fahed
    Denoeux, Thierry
    Tout, Kifah
    [J]. PATTERN ANALYSIS AND APPLICATIONS, 2016, 19 (01) : 145 - 161
  • [10] Three new instance selection methods based on local sets: A comparative study with several approaches from a bi-objective perspective
    Leyva, Enrique
    Gonzalez, Antonio
    Perez, Raul
    [J]. PATTERN RECOGNITION, 2015, 48 (04) : 1523 - 1537