Adaptive multi-objective swarm fusion for imbalanced data classification

Cited by: 56
Authors
Li, Jinyan [1 ]
Fong, Simon [1 ]
Wong, Raymond K. [2 ]
Chu, Victor W. [2 ]
Affiliations
[1] Univ Macau, Dept Comp Informat Sci, Macau, Peoples R China
[2] Univ New South Wales, Sch Comp Sci & Engn, Sydney, NSW, Australia
Keywords
Swarm fusion; Swarm intelligence algorithm; Multi-objective; Crossover rebalancing; Imbalanced data classification; OPTIMIZATION; ALGORITHMS; PERFORMANCE; AGREEMENT; DESIGN; POWER;
DOI
10.1016/j.inffus.2017.03.007
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Learning a classifier from an imbalanced dataset is an important problem in data mining and machine learning. Since an imbalanced dataset carries more information about the majority classes than about the minority classes, the classifier tends to over-fit the former and under-fit the latter. Previous attempts to address the problem have focused on increasing the learning sensitivity to the minorities and/or rebalancing sample sizes among classes before learning. However, how to efficiently identify the optimal mix in rebalancing is still an unresolved problem. Due to non-linear relationships between attributes and class labels, merely rebalancing sample sizes rarely yields optimal results. Moreover, brute-force search for the perfect combination is known to be NP-hard, and hence a smarter heuristic is required. In this paper, we propose a notion of swarm fusion to address the problem, using stochastic swarm heuristics to cooperatively optimize the mixtures. Compared with conventional rebalancing methods, e.g., linear search, our novel fusion approach is able to find a close-to-optimal mix with improved accuracy and reliability. Most importantly, it has been found to run faster than other coupled swarm optimization techniques and iteration methods. In our experiments, we first compared our proposed solution with traditional methods on thirty publicly available imbalanced datasets. Using a neural network as the base learner, our proposed method is found to outperform other traditional methods by up to 69% in terms of the credibility of the learned classifiers. Secondly, we wrapped our proposed swarm fusion method with a decision tree. Notably, it defeated six state-of-the-art methods on ten imbalanced datasets in all evaluation metrics that we considered. (C) 2017 Elsevier B.V. All rights reserved.
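The idea of searching for a rebalancing mix with a swarm heuristic can be illustrated with a small sketch. It is not the paper's swarm fusion method: it substitutes a plain single-objective particle swarm, a toy 1-D dataset, and a simple Gaussian discriminant as base learner, all of which are assumptions made for illustration. Every name here (`rebalance`, `evaluate`, `swarm_search`, the `over` and `under` parameters) is hypothetical.

```python
import math
import random
import statistics

def make_data(rng, n_major=950, n_minor=50):
    """Toy 1-D imbalanced data: label 0 is the majority, label 1 the minority."""
    return ([(rng.gauss(0.0, 1.0), 0) for _ in range(n_major)]
            + [(rng.gauss(2.0, 1.0), 1) for _ in range(n_minor)])

def rebalance(data, over, under, rng):
    """Keep a fraction `under` of the majority; grow the minority to about `over` times its size."""
    major = [d for d in data if d[1] == 0]
    minor = [d for d in data if d[1] == 1]
    kept = rng.sample(major, max(1, round(len(major) * under)))
    extra = rng.sample(minor, int((over - int(over)) * len(minor)))
    return kept + minor * int(over) + extra

def fit(train):
    """Fit a 1-D Gaussian discriminant with shared variance and size-based priors."""
    groups = {c: [x for x, y in train if y == c] for c in (0, 1)}
    mu = {c: statistics.fmean(v) for c, v in groups.items()}
    var = max(sum((x - mu[y]) ** 2 for x, y in train) / len(train), 1e-12)
    return {c: (mu[c], math.log(len(groups[c])), var) for c in (0, 1)}

def predict(model, x):
    def score(c):
        m, log_n, var = model[c]
        return log_n - (x - m) ** 2 / (2 * var)
    return 1 if score(1) > score(0) else 0

def balanced_accuracy(model, data):
    """Mean of the per-class recalls, so the minority counts as much as the majority."""
    recalls = []
    for c in (0, 1):
        xs = [x for x, y in data if y == c]
        recalls.append(sum(predict(model, x) == c for x in xs) / len(xs))
    return sum(recalls) / 2

def evaluate(mix, train, valid, seed=0):
    """Score one (over, under) mix; the fixed seed makes the score deterministic."""
    over, under = mix
    return balanced_accuracy(fit(rebalance(train, over, under, random.Random(seed))), valid)

def swarm_search(train, valid, n_particles=8, n_iter=15, seed=1):
    """Plain particle swarm over (over, under); the first particle is the unrebalanced mix."""
    rng = random.Random(seed)
    lo, hi = (1.0, 0.05), (20.0, 1.0)
    pos = [[1.0, 1.0]] + [[rng.uniform(l, h) for l, h in zip(lo, hi)]
                          for _ in range(n_particles - 1)]
    vel = [[0.0, 0.0] for _ in pos]
    pbest = [p[:] for p in pos]
    pbest_score = [evaluate(p, train, valid) for p in pos]
    g = max(range(len(pos)), key=pbest_score.__getitem__)
    gbest, gbest_score = pbest[g][:], pbest_score[g]
    for _ in range(n_iter):
        for i, p in enumerate(pos):
            for d in range(2):
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (pbest[i][d] - p[d])
                             + 1.5 * rng.random() * (gbest[d] - p[d]))
                p[d] = min(hi[d], max(lo[d], p[d] + vel[i][d]))  # clamp to bounds
            s = evaluate(p, train, valid)
            if s > pbest_score[i]:
                pbest[i], pbest_score[i] = p[:], s
                if s > gbest_score:
                    gbest, gbest_score = p[:], s
    return gbest, gbest_score

data_rng = random.Random(42)
train, valid = make_data(data_rng), make_data(data_rng)
baseline = evaluate([1.0, 1.0], train, valid)
best_mix, best_score = swarm_search(train, valid)
print(f"baseline={baseline:.3f} best={best_score:.3f} mix={best_mix}")
```

Because the unrebalanced mix (1.0, 1.0) is seeded as the first particle and each score is a deterministic function of the mix, the returned score can never fall below the baseline on the validation set. Balanced accuracy is used here only as a convenient stand-in objective; the paper's own metrics are not reproduced.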
Pages: 1 - 24
Page count: 24
Related papers
50 in total
  • [1] An adaptive multi-objective particle swarm optimization for color image fusion
    Niu, Yifeng
    Shen, Lincheng
    SIMULATED EVOLUTION AND LEARNING, PROCEEDINGS, 2006, 4247 : 473 - 480
  • [2] Multi-objective Automatic Algorithm Configuration for the Classification Problem of Imbalanced Data
    Tari, Sara
    Szczepanski, Nicolas
    Mousin, Lucien
    Jacques, Julie
    Kessaci, Marie-Eleonore
    Jourdan, Laetitia
    2020 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2020,
  • [3] Multi-objective evolution of oblique decision trees for imbalanced data binary classification
    Chabbouh, Marwa
    Bechikh, Slim
    Hung, Chih-Cheng
    Ben Said, Lamjed
    SWARM AND EVOLUTIONARY COMPUTATION, 2019, 49 : 1 - 22
  • [4] A Multi-Objective Evolutionary Approach to Imbalanced Classification Problems
    Chira, Camelia
    Lemnaru, Camelia
    2015 IEEE 11TH INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTER COMMUNICATION AND PROCESSING (ICCP), 2015, : 149 - 154
  • [5] Adaptive Multi-objective Search in a Swarm vs Swarm Context
    Farid, Ali Moltajaei
    Egerton, Simon
    Barca, Jan Carlo
    Kamal, Md Abdus Samad
    2018 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2018, : 3641 - 3646
  • [6] Adaptive swarm cluster-based dynamic multi-objective synthetic minority oversampling technique algorithm for tackling binary imbalanced datasets in biomedical data classification
    Li, Jinyan
    Fong, Simon
    Sung, Yunsick
    Cho, Kyungeun
    Wong, Raymond
    Wong, Kelvin K. L.
    BIODATA MINING, 2016, 9 : 1 - 15
  • [7] Adaptive swarm cluster-based dynamic multi-objective synthetic minority oversampling technique algorithm for tackling binary imbalanced datasets in biomedical data classification
    Jinyan Li
    Simon Fong
    Yunsick Sung
    Kyungeun Cho
    Raymond Wong
    Kelvin K. L. Wong
    BioData Mining, 9
  • [8] SVM ensemble training for imbalanced data classification using multi-objective optimization techniques
    Joanna Grzyb
    Michał Woźniak
    Applied Intelligence, 2023, 53 : 15424 - 15441
  • [9] SVM ensemble training for imbalanced data classification using multi-objective optimization techniques
    Grzyb, Joanna
    Wozniak, Michal
    APPLIED INTELLIGENCE, 2023, 53 (12) : 15424 - 15441
  • [10] A Hybrid CP/MOLS Approach for Multi-Objective Imbalanced Classification
    Szczepanski, Nicolas
    Audemard, Gilles
    Jourdan, Laetitia
    Lecoutre, Christophe
    Mousin, Lucien
    Veerapen, Nadarajen
    PROCEEDINGS OF THE 2021 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE (GECCO'21), 2021, : 723 - 731