Parameter-free classification in multi-class imbalanced data sets

Cited by: 20
Authors
Cerf, Loic [1]
Gay, Dominique [2]
Selmaoui-Folcher, Nazha [3]
Cremilleux, Bruno [4]
Boulicaut, Jean-Francois [5]
Affiliations
[1] Univ Fed Minas Gerais, Dept Comp Sci, Belo Horizonte, MG, Brazil
[2] Orange Labs, F-22307 Lannion, France
[3] Univ New Caledonia, PPME EA3325, Noumea, New Caledonia
[4] Univ Caen, GREYC CNRS UMR6072, F-14032 Caen, France
[5] Univ Lyon, CNRS, INRIA, INSA Lyon, LIRIS, UMR5205, F-69621 Villeurbanne, France
Keywords
Classification; Association rules; Multi-class context; Imbalanced data set; One-Versus-Each framework; DISCOVERY; PATTERNS; SMOTE
DOI
10.1016/j.datak.2013.06.001
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
Many applications deal with classification in multi-class imbalanced contexts. In such difficult situations, classical CBA-like approaches (Classification Based on Association rules) show their limits. Most CBA-like methods are in fact One-Vs-All (OVA) approaches, i.e., the selected classification rules are relevant for one class and irrelevant for the union of the other classes. In this paper, we point out recurrent problems encountered by OVA approaches applied to multi-class imbalanced data sets (e.g., improper bias towards majority classes, conflicting rules). That is why we propose a new One-Versus-Each (OVE) framework. In this framework, a rule has to be relevant for one class and irrelevant for every other class taken separately. Our approach, called fitcare, is empirically validated on various benchmark data sets, and our theoretical findings are confirmed. (C) 2013 Elsevier B.V. All rights reserved.
Pages: 109-129
Number of pages: 21