A novel approach using incremental oversampling for data stream mining

Cited by: 5
Authors
Anupama, N. [1]
Jena, Sudarson [2]
Affiliations
[1] GITAM Univ, Hyderabad, India
[2] Sambalpur Univ, Inst Informat Technol, Sambalpur, India
Keywords
Knowledge discovery; Data streams; Imbalanced data; Oversampling; Increment over sampling for data streams (IOSDS); Classification
DOI
10.1007/s12530-018-9249-5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data stream mining has become very popular in recent years, as advanced electronic devices generate continuous data streams. The performance of standard learning algorithms is compromised by the imbalanced nature of real-world data streams. In this paper we propose a novel algorithm, dubbed increment over sampling for data streams (IOSDS), which uses a unique oversampling technique to nearly balance the data sets and so minimize the effect of imbalance on the stream mining process. The experimental analysis is conducted on 15 data chunks of data streams with varied sizes and different imbalance ratios. The results suggest that the proposed IOSDS algorithm improves knowledge discovery over benchmark algorithms such as C4.5 and Hoeffding tree in terms of standard performance measures, namely accuracy, AUC, precision, recall and F-measure.
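The abstract does not say how IOSDS generates its extra samples, so the following is only a minimal sketch of the general idea of oversampling a stream chunk by chunk. It uses plain random duplication of minority-class rows, which is a stand-in technique and not the authors' algorithm. The names `oversample_chunk`, `balanced_stream` and `target_ratio` are hypothetical, and a two-class problem is assumed.

```python
import random
from collections import Counter


def oversample_chunk(chunk, target_ratio=1.0, seed=0):
    """Randomly duplicate minority-class rows in one chunk.

    Assumption: `chunk` is a list of (features, label) pairs with two labels.
    `target_ratio` is the desired minority/majority count ratio afterwards.
    This is generic random oversampling, not the IOSDS procedure itself.
    """
    rng = random.Random(seed)
    counts = Counter(label for _, label in chunk)
    if len(counts) < 2:
        return list(chunk)  # only one class present, nothing to balance

    # The two most frequent labels: majority first, minority second.
    (_, maj_n), (min_label, min_n) = counts.most_common(2)
    minority = [row for row in chunk if row[1] == min_label]

    # Number of duplicated minority rows needed to reach the target ratio.
    needed = max(0, int(target_ratio * maj_n) - min_n)
    extra = [rng.choice(minority) for _ in range(needed)]
    return list(chunk) + extra


def balanced_stream(chunks, target_ratio=1.0):
    """Yield each incoming chunk after oversampling it independently."""
    for i, chunk in enumerate(chunks):
        yield oversample_chunk(chunk, target_ratio, seed=i)
```

Each rebalanced chunk could then be passed to an incremental learner such as a Hoeffding tree. Setting `target_ratio` below 1.0 would leave the chunk only "almost" balanced, in the spirit of the abstract's wording.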
Pages: 351-362
Number of pages: 12