A novel approach using incremental oversampling for data stream mining

Cited by: 5
Authors
Anupama, N. [1]
Jena, Sudarson [2]
Affiliations
[1] GITAM Univ, Hyderabad, India
[2] Sambalpur Univ, Inst Informat Technol, Sambalpur, India
Keywords
Knowledge discovery; Data streams; Imbalanced data; Oversampling; Increment over sampling for data streams (IOSDS); Classification
DOI
10.1007/s12530-018-9249-5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data stream mining has become very popular in recent years, as advanced electronic devices generate continuous data streams. The performance of standard learning algorithms is compromised by the imbalanced nature of real-world data streams. In this paper, we propose a novel algorithm, dubbed increment over sampling for data streams (IOSDS), which uses a unique oversampling technique to almost balance the data sets and thereby minimize the effect of imbalance on the stream mining process. The experimental analysis is conducted on 15 data chunks of data streams with varied sizes and different imbalance ratios. The results suggest that the proposed IOSDS algorithm improves knowledge discovery over benchmark algorithms such as C4.5 and Hoeffding tree in terms of standard performance measures, namely accuracy, AUC, precision, recall and F-measure.
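The abstract does not spell out how IOSDS generates its samples, so the following is only a minimal sketch of the general idea it builds on: oversampling the minority class of each incoming data chunk until the chunk is nearly balanced. It uses plain random duplication, which is a stand-in and not the authors' technique. The names `oversample_chunk`, `imbalance_ratio`, and `target_ratio` are hypothetical, and the chunk is assumed to be a binary-labelled list of `(features, label)` pairs.

```python
import random
from collections import Counter


def imbalance_ratio(chunk):
    """Return majority count / minority count for a labelled chunk."""
    counts = Counter(label for _, label in chunk)
    return max(counts.values()) / min(counts.values())


def oversample_chunk(chunk, rng, target_ratio=1.0):
    """Duplicate random minority rows until minority >= target_ratio * majority.

    Assumption: plain random oversampling on a binary chunk. This is NOT
    the IOSDS procedure, whose details are not given in the abstract.
    """
    counts = Counter(label for _, label in chunk)
    if len(counts) < 2:
        # Only one class present: nothing to balance against.
        return list(chunk)
    (_, n_major), (minority, n_minor) = counts.most_common(2)
    needed = int(target_ratio * n_major) - n_minor
    if needed <= 0:
        return list(chunk)
    minority_rows = [row for row in chunk if row[1] == minority]
    extra = [rng.choice(minority_rows) for _ in range(needed)]
    return list(chunk) + extra


# Hypothetical chunk: 90 majority rows (label 0) and 10 minority rows (label 1).
chunk = [((i,), 0) for i in range(90)] + [((i,), 1) for i in range(10)]
balanced = oversample_chunk(chunk, random.Random(0))
print(imbalance_ratio(chunk), imbalance_ratio(balanced))
```

In a streaming setting, a function like this would be applied to each chunk as it arrives, before the chunk is handed to the learner (for example a C4.5 or Hoeffding tree classifier). A `target_ratio` below 1.0 leaves the chunk only approximately balanced.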
Pages: 351 - 362
Page count: 12
Related papers
50 in total
  • [1] A novel approach using incremental oversampling for data stream mining
    N. Anupama
    Sudarson Jena
    Evolving Systems, 2019, 10 : 351 - 362
  • [2] A NOVEL RULE-BASED OVERSAMPLING APPROACH FOR IMBALANCED DATA CLASSIFICATION
    Zhang, Xiao
    Paz, Ivan
    Nebot, Angela
    37TH ANNUAL EUROPEAN SIMULATION AND MODELLING CONFERENCE 2023, ESM 2023, 2023, : 208 - 212
  • [3] Novel Oversampling Algorithm for Handling Imbalanced Data Classification Novel Oversampling Algorithm
    More, Anjali S.
    Rana, Dipti P.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (08) : 491 - 496
  • [4] Radial-Based Approach to Imbalanced Data Oversampling
    Koziarski, Michal
    Krawczyk, Bartosz
    Wozniak, Michal
    HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, HAIS 2017, 2017, 10334 : 318 - 327
  • [5] Imbalanced Data Mining Using Oversampling and Cellular GEP Ensemble
    Jedrzejowicz, Joanna
    Jedrzejowicz, Piotr
    COMPUTATIONAL COLLECTIVE INTELLIGENCE (ICCCI 2021), 2021, 12876 : 360 - 372
  • [6] An oversampling approach for mining program specifications
    Chen, Deng
    Zhang, Yan-duo
    Wei, Wei
    Wang, Rong-cun
    Li, Xiao-lin
    Liu, Wei
    Wang, Shi-xun
    Zhu, Rui
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2018, 19 (06) : 737 - 754
  • [7] An oversampling approach for mining program specifications
    Deng Chen
    Yan-duo Zhang
    Wei Wei
    Rong-cun Wang
    Xiao-lin Li
    Wei Liu
    Shi-xun Wang
    Rui Zhu
    Frontiers of Information Technology & Electronic Engineering, 2018, 19 : 737 - 754
  • [8] BIT STREAM ADDER FOR OVERSAMPLING CODED DATA
    OLEARY, P
    MALOBERTI, F
    ELECTRONICS LETTERS, 1990, 26 (20) : 1708 - 1709
  • [9] Batch-Incremental Classification of Stream Data Using Storage
    Ponkiya, Parita
    Srivastava, Rohit
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2015, 15 (04) : 95 - 99
  • [10] Batch-Incremental Classification of Stream Data Using Storage
    Ponkiya, Parita
    Srivastava, Rohit
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2015, 15 (05) : 91 - 95