Hybrid Firefly Optimised Ensemble Classification for Drifting Data Streams with Imbalance

Cited by: 2
Authors
Pepsi, M. Blessa Binolin [1]
Kumar, N. Senthil [1]
Affiliations
[1] Mepco Schlenk Engn Coll, Sivakasi, Tamil Nadu, India
Keywords
Class Imbalance; Firefly Optimisation; Oversampling; Classification; Data Stream; Concept Drift; MINORITY CLASS; ALGORITHM; SMOTE;
DOI
10.1016/j.knosys.2024.111500
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Classification learning on non-stationary data must cope with changes in the data over time. The main difficulties are the imbalance among classes and the substantial cost of labeling instances, especially in the presence of drifts. Imbalance arises when the minority class has fewer samples than the majority class, and it leads to the misclassification of data points. This paper proposes rebalancing the data with an oversampling approach based on imputation methods, and using a Hybrid Firefly Optimisation algorithm as a novel classifier. The imputation methods increase the number of minority samples in a data chunk. The Firefly algorithm is adapted as a classification technique whose weights are tuned using boosting ensemble classifiers. The proposed system is tested on seven synthetic datasets and five data stream generators. Performance is evaluated with the F-measure, AUC, and G-mean. For weather data with an imbalance ratio of 5%, the G-mean increases by an average of 0.24% compared with existing methods. The Friedman-Nemenyi statistical test confirms the stability of the proposed algorithm.
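The abstract names G-mean and F-measure as evaluation metrics but does not define them. The following is a minimal sketch, assuming the standard binary definitions (G-mean as the geometric mean of sensitivity and specificity, F-measure as the harmonic mean of precision and recall). It is not the authors' code, and the function names are hypothetical, chosen only for illustration.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN, treating `positive` as the minority class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth == positive:
            fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def f_measure(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall (assumed definition, beta = 1)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative chunk with a 5% minority class (made-up labels, not paper data).
y_true = [1] * 5 + [0] * 95
always_majority = [0] * 100  # a classifier that ignores the minority class

accuracy = sum(t == p for t, p in zip(y_true, always_majority)) / len(y_true)
print(accuracy, g_mean(y_true, always_majority), f_measure(y_true, always_majority))
```

On this made-up chunk, a classifier that always predicts the majority class reaches high accuracy while both G-mean and F-measure are zero, which is why such metrics are preferred over accuracy for imbalanced streams.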
Pages: 13
Related papers
63 in total
  • [1] Alfhaid M.A., 2021, Artif Intell, V9, P36
  • [2] An Investigation of SMOTE Based Methods for Imbalanced Datasets With Data Complexity Analysis
    Azhar, Nur Athirah
    Pozi, Muhammad Syafiq Mohd
    Din, Aniza Mohamed
    Jatowt, Adam
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (07) : 6651 - 6672
  • [3] Basha Shaik Johny, 2022, INT C ADV COMP TECHN, P1
  • [4] I-SiamIDS: an improved Siam-IDS for handling class imbalance in network-based intrusion detection systems
    Bedi, Punam
    Gupta, Neha
    Jindal, Vinita
    [J]. APPLIED INTELLIGENCE, 2021, 51 (02) : 1133 - 1151
  • [5] Ferreira LEB, 2019, IEEE IJCNN
  • [6] Robust twin bounded support vector machines for outliers and imbalanced data
    Borah, Parashjyoti
    Gupta, Deepak
    [J]. APPLIED INTELLIGENCE, 2021, 51 (08) : 5314 - 5343
  • [7] On the Dynamics of Classification Measures for Imbalanced and Streaming Data
    Brzezinski, Dariusz
    Stefanowski, Jerzy
    Susmaga, Robert
    Szczech, Izabela
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (08) : 2868 - 2878
  • [8] Combining block-based and online methods in learning ensembles from concept drifting data streams
    Brzezinski, Dariusz
    Stefanowski, Jerzy
    [J]. INFORMATION SCIENCES, 2014, 265 : 50 - 67
  • [9] ROSE: robust online self-adjusting ensemble for continual learning on imbalanced drifting data streams
    Cano, Alberto
    Krawczyk, Bartosz
    [J]. MACHINE LEARNING, 2022, 111 (07) : 2561 - 2599
  • [10] Kappa Updated Ensemble for drifting data stream mining
    Cano, Alberto
    Krawczyk, Bartosz
    [J]. MACHINE LEARNING, 2020, 109 (01) : 175 - 218