Multi-label borderline oversampling technique

Cited by: 10
Authors
Teng, Zeyu [1 ]
Cao, Peng [2 ,3 ]
Huang, Min [1 ]
Gao, Zheming [1 ]
Wang, Xingwei [2 ]
Affiliations
[1] Northeastern Univ, Coll Informat Sci & Engn, Shenyang 110819, Liaoning, Peoples R China
[2] Northeastern Univ, Coll Comp Sci & Engn, Shenyang 110169, Liaoning, Peoples R China
[3] Northeastern Univ, Key Lab Intelligent Comp Med Image, Minist Educ, Shenyang 110169, Liaoning, Peoples R China
Keywords
Multi-label learning; Class imbalance; Borderline sample; Oversampling; CLASSIFICATION; IMBALANCE; RANKING; MACHINE; SMOTE;
DOI
10.1016/j.patcog.2023.109953
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The class imbalance problem commonly exists in multi-label classification (MLC) tasks. It has non-negligible impacts on classifier performance and has drawn extensive attention in recent years. Borderline oversampling has been widely used in single-label learning as a competitive technique for dealing with class imbalance. Nevertheless, the borderline samples in multi-label data sets (MLDs) have not been studied. Hence, this paper discusses the borderline samples in MLDs in depth and finds that they have different neighboring relationships with class borders, which gives them different roles in classifier training. Accordingly, they are divided into two types, named self-borderline samples and cross-borderline samples. Further, a novel MLDs resampling approach called the Multi-Label Borderline Oversampling Technique (MLBOTE) is proposed for multi-label imbalanced learning. MLBOTE identifies three types of seed samples, namely interior, self-borderline, and cross-borderline samples, and designs a different oversampling mechanism for each. Meanwhile, it regards not only the minority classes but also the classes suffering from one-vs-rest imbalance as those in need of oversampling. Experiments on eight data sets with nine MLC algorithms and three base classifiers are carried out to compare MLBOTE with several state-of-the-art MLDs resampling techniques. The results show that MLBOTE outperforms the other methods in various scenarios.
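The abstract only outlines how seed samples are identified and oversampled. As an illustration, the short Python sketch below shows one plausible reading: for a single label in need of oversampling, minority samples are split into interior and borderline seeds according to how many of their k nearest neighbors carry the same label, and synthetic samples are generated by SMOTE-style interpolation. The neighbor thresholds, the omission of the self-/cross-borderline distinction, and all function names are assumptions for illustration, not the authors' actual MLBOTE procedure.

# Illustrative sketch only -- not the authors' MLBOTE implementation.
# Assumes X is an (n_samples, n_features) array and y a binary vector
# marking membership in one label that needs oversampling.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def categorize_seeds(X, y, k=5):
    """Split the minority samples of one label into interior and
    borderline seeds by how many of their k nearest neighbors share
    the label (an assumed proxy for the paper's seed types)."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)            # idx[:, 0] is the sample itself
    minority = np.where(y == 1)[0]
    interior, borderline = [], []
    for i in minority:
        same = int(y[idx[i, 1:]].sum())  # neighbors carrying the label
        if same >= k / 2:
            interior.append(i)           # safely inside the class region
        elif same > 0:
            borderline.append(i)         # close to the class border
        # samples with no same-label neighbors are skipped as noise here
    return np.array(interior), np.array(borderline)

def smote_like(X, seeds, pool, n_new, rng):
    """Generate n_new synthetic samples by interpolating randomly chosen
    seeds towards random same-label samples from pool."""
    if len(seeds) == 0 or len(pool) == 0:
        return np.empty((0, X.shape[1]))
    out = []
    for _ in range(n_new):
        i, j = rng.choice(seeds), rng.choice(pool)
        lam = rng.random()
        out.append(X[i] + lam * (X[j] - X[i]))
    return np.array(out)

# Example usage for the first label column of a binary label matrix Y:
# rng = np.random.default_rng(0)
# interior, borderline = categorize_seeds(X, Y[:, 0])
# X_new = smote_like(X, borderline, np.where(Y[:, 0] == 1)[0], 20, rng)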
Pages: 17