Class imbalance is common in multi-label classification (MLC) tasks. It has a non-negligible impact on classifier performance and has drawn extensive attention in recent years. Borderline oversampling has been widely used in single-label learning as a competitive technique for dealing with class imbalance. Nevertheless, the borderline samples in multi-label data sets (MLDs) have not been studied. Hence, this paper examines the borderline samples in MLDs in depth and finds that they have different neighboring relationships with class borders, which gives them different roles in classifier training. Accordingly, they are divided into two types, named self-borderline samples and cross-borderline samples. Further, a novel MLD resampling approach called Multi-Label Borderline Oversampling Technique (MLBOTE) is proposed for multi-label imbalanced learning. MLBOTE identifies three types of seed samples (interior, self-borderline, and cross-borderline) and applies a different oversampling mechanism to each. It also regards not only the minority classes but also the classes suffering from one-vs-rest imbalance as in need of oversampling. Experiments on eight data sets with nine MLC algorithms and three base classifiers compare MLBOTE with several state-of-the-art MLD resampling techniques. The results show that MLBOTE outperforms the other methods in various scenarios.
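As a rough illustration of the one-vs-rest imbalance mentioned in the abstract, the sketch below computes, for each label, the ratio of negative to positive samples in a binary label matrix and flags the labels whose ratio exceeds a cutoff. The function names, the toy matrix `Y`, and the `threshold` value are hypothetical choices made for this example; the abstract does not state the criterion MLBOTE actually uses, so this should not be read as the authors' method.

```python
from typing import List


def one_vs_rest_imbalance(Y: List[List[int]]) -> List[float]:
    """For each label, return (#negative samples) / (#positive samples).

    Y is a binary label matrix: Y[i][j] == 1 if sample i carries label j.
    A label with no positive samples gets a ratio of infinity.
    """
    n_samples = len(Y)
    n_labels = len(Y[0])
    ratios = []
    for j in range(n_labels):
        pos = sum(row[j] for row in Y)
        neg = n_samples - pos
        ratios.append(float("inf") if pos == 0 else neg / pos)
    return ratios


def labels_needing_oversampling(Y: List[List[int]], threshold: float = 1.5) -> List[int]:
    """Return indices of labels whose one-vs-rest ratio exceeds the cutoff.

    The threshold is an assumed, illustrative value, not one from the paper.
    """
    return [j for j, r in enumerate(one_vs_rest_imbalance(Y)) if r > threshold]


# Toy label matrix: 6 samples, 3 labels (illustrative data only).
Y = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 1],
    [1, 0, 0],
]

ratios = one_vs_rest_imbalance(Y)
flagged = labels_needing_oversampling(Y)
```

In this toy matrix the second label is positive for only one of six samples, so it is the one flagged under the assumed cutoff, even though a label can be imbalanced against its own complement without being rare relative to other labels.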