A novel approach using incremental oversampling for data stream mining

Cited by: 5
Authors
Anupama, N. [1]
Jena, Sudarson [2]
Affiliations
[1] GITAM Univ, Hyderabad, India
[2] Sambalpur Univ, Inst Informat Technol, Sambalpur, India
Keywords
Knowledge discovery; Data streams; Imbalanced data; Oversampling; Increment over sampling for data streams (IOSDS); Classification
DOI
10.1007/s12530-018-9249-5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data stream mining has become very popular in recent years, as advanced electronic devices generate continuous data streams. The performance of standard learning algorithms is compromised by the class imbalance present in real-world data streams. In this paper we propose a novel algorithm, dubbed increment over sampling for data streams (IOSDS), which uses a unique oversampling technique to nearly balance the data sets and thereby minimize the effect of imbalance on the stream mining process. The experimental analysis is conducted on 15 data chunks of data streams with varied sizes and different imbalance ratios. The results suggest that the proposed IOSDS algorithm improves knowledge discovery over benchmark algorithms such as C4.5 and Hoeffding tree in terms of standard performance measures, namely accuracy, AUC, precision, recall and F-measure.
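The abstract does not spell out how IOSDS generates its extra minority instances, so the following is only a minimal sketch of the general idea of chunk-wise oversampling, not the authors' algorithm. It assumes plain random oversampling with replacement as a stand-in technique; the names `imbalance_ratio`, `oversample_chunk`, `oversample_stream`, and the `target_ratio` parameter are hypothetical and introduced here for illustration.

```python
import math
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def oversample_chunk(chunk, target_ratio=1.0, seed=0):
    """Randomly duplicate instances of under-represented classes.

    chunk: list of (features, label) pairs from one stream chunk.
    target_ratio: largest majority/minority ratio tolerated afterwards
                  (assumed parameter; 1.0 means fully balanced).
    NOTE: plain random oversampling, used as a stand-in for the
    unspecified IOSDS oversampling technique.
    """
    if not chunk:
        return []
    rng = random.Random(seed)
    counts = Counter(label for _, label in chunk)
    majority_size = max(counts.values())
    # Smallest class size that keeps the ratio at or below target_ratio.
    wanted = math.ceil(majority_size / target_ratio)
    balanced = list(chunk)  # original instances are kept, in order
    for label, count in counts.items():
        pool = [inst for inst in chunk if inst[1] == label]
        for _ in range(max(0, wanted - count)):
            balanced.append(rng.choice(pool))
    return balanced


def oversample_stream(chunks, target_ratio=1.0, seed=0):
    """Apply oversample_chunk to each chunk as it arrives."""
    for i, chunk in enumerate(chunks):
        yield oversample_chunk(chunk, target_ratio, seed + i)
```

For example, a chunk with 8 majority and 2 minority instances has an imbalance ratio of 4.0; `oversample_chunk` with the default `target_ratio` appends 6 duplicated minority instances, giving 16 instances and a ratio of 1.0. The rebalanced chunks could then be passed to an incremental learner such as a Hoeffding tree.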
Pages: 351-362
Page count: 12
Related papers
50 in total
  • [21] An Active Learning Approach for Ensemble-based Data Stream Mining
    Alabdulrahman, Rabaa
    Viktor, Herna
    Paquet, Eric
    KDIR: PROCEEDINGS OF THE 8TH INTERNATIONAL JOINT CONFERENCE ON KNOWLEDGE DISCOVERY, KNOWLEDGE ENGINEERING AND KNOWLEDGE MANAGEMENT - VOL. 1, 2016 : 275 - 282
  • [22] Mining data stream using improved Fisher Discriminate Analysis
    Zou, Lingjun
    Ling, Chen
    Li, Tu
    Qu, Hongyu
    2008 PROCEEDINGS OF INFORMATION TECHNOLOGY AND ENVIRONMENTAL SYSTEM SCIENCES: ITESS 2008, VOL 2, 2008 : 951 - 955
  • [23] Kappa Updated Ensemble for drifting data stream mining
    Cano, Alberto
    Krawczyk, Bartosz
    MACHINE LEARNING, 2020, 109 (01) : 175 - 218
  • [24] Kappa Updated Ensemble for drifting data stream mining
    Alberto Cano
    Bartosz Krawczyk
    Machine Learning, 2020, 109 : 175 - 218
  • [25] Efficient data perturbation for privacy preserving and accurate data stream mining
    Chamikara, M. A. P.
    Bertok, P.
    Liu, D.
    Camtepe, S.
    Khalil, I
    PERVASIVE AND MOBILE COMPUTING, 2018, 48 : 1 - 19
  • [26] Incremental Decision Rules Algorithm: A Probabilistic and Dynamic Approach to Decisional Data Stream Problems
    Molla, Nuria
    Rabasa, Alejandro
    Rodriguez-Sala, Jesus J.
    Sanchez-Soriano, Joaquin
    Ferrandiz, Antonio
    MATHEMATICS, 2022, 10 (01)
  • [27] A novel oversampling and feature selection hybrid algorithm for imbalanced data classification
    Feng, Fang
    Li, Kuan-Ching
    Yang, Erfu
    Zhou, Qingguo
    Han, Lihong
    Hussain, Amir
    Cai, Mingjiang
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (03) : 3231 - 3267
  • [28] A novel oversampling and feature selection hybrid algorithm for imbalanced data classification
    Fang Feng
    Kuan-Ching Li
    Erfu Yang
    Qingguo Zhou
    Lihong Han
    Amir Hussain
    Mingjiang Cai
    Multimedia Tools and Applications, 2023, 82 : 3231 - 3267
  • [29] Mining Data Stream from a Higher Level of Abstraction: A Class Window Approach
    Abdullah-Al-Mamun
    Abedin, Md. Anowarul
    Al Arman, Md.
    Mottalib, M. A.
    Huq, Mohammad Rezwanul
    INFORMATICS ENGINEERING AND INFORMATION SCIENCE, PT IV, 2011, 254 : 461 - 469
  • [30] A Novel Approach to Data Mining in Wireless Sensor Networks
    Rakocevic, Goran
    Tafa, Zilbert
    Milutinovic, Veljko
    AD HOC & SENSOR WIRELESS NETWORKS, 2014, 22 (1-2) : 21 - 40