Scalable concept drift adaptation for stream data mining

Citations: 0
Authors
Hu, Lisha [1]
Li, Wenxiu [1]
Lu, Yaru [1]
Hu, Chunyu [2, 3]
Affiliations
[1] Hebei Univ Econ & Business, Shijiazhuang 050061, Peoples R China
[2] Qilu Univ Technol, Shandong Acad Sci, Shandong Comp Sci Ctr,Minist Educ, Natl Supercomp Ctr Jinan,Key Lab Comp Power Networ, Jinan 250353, Peoples R China
[3] Shandong Fundamental Res Ctr Comp Sci, Shandong Prov Key Lab Comp Networks, Jinan 250353, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Online learning; Stream data mining; Minimum enclosing ball; Concept drift;
DOI
10.1007/s40747-024-01524-x
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Stream data mining handles continuously generated data flows (e.g., weather, stock, and traffic data), which often exhibit concept drift over time. Traditional offline algorithms struggle to learn from real-time data, so online algorithms are better suited to mining stream data with changing concepts. Among online learning algorithms, single-pass methods stand out for their efficiency: they process one sample point at a time and inspect each point at most once. Existing single-pass online algorithms for stream data convert the classification problem into a minimum enclosing ball (MEB) problem. However, these methods mainly focus on expanding the ball to enclose new data. An excessively large ball may already cover data from a new concept, which makes it difficult to trigger a model update. This paper proposes a new online single-pass framework for stream data mining, Scalable Concept Drift Adaptation (SCDA), and presents three distinct online methods (SCDA-I, SCDA-II, and SCDA-III) based on that framework. These methods dynamically adjust the ball by expanding or contracting it as new sample points arrive, thereby avoiding excessively large balls. To evaluate their performance, we conduct experiments on 7 synthetic and 5 real-world benchmark datasets and compare the methods with state-of-the-art approaches. The experiments demonstrate the applicability and flexibility of the SCDA methods in stream data mining with respect to three aspects: predictive performance, memory usage, and scalability of the ball. Among them, SCDA-III performs best in all three aspects.
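The expand-or-contract idea in the abstract can be illustrated with a minimal single-pass sketch. This is not the SCDA-I/II/III procedure, whose update rules are not given in this record. The expansion step is the standard streaming rule (the smallest ball containing the old ball and the new point). The contraction step, the `shrink` factor, and the `min_radius` floor are hypothetical placeholders chosen only to show why contraction keeps the ball from growing without bound.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Ball:
    center: List[float]
    radius: float


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def expand(ball: Ball, point: Sequence[float]) -> Ball:
    """Return the smallest ball enclosing both `ball` and `point`."""
    d = distance(ball.center, point)
    if d <= ball.radius:
        return ball  # point already enclosed, nothing to do
    new_radius = (ball.radius + d) / 2.0
    # Move the center toward the point by (d - radius) / 2.
    shift = (d - ball.radius) / (2.0 * d)
    center = [c + shift * (p - c) for c, p in zip(ball.center, point)]
    return Ball(center, new_radius)


def contract(ball: Ball, shrink: float = 0.9, min_radius: float = 1e-3) -> Ball:
    """Hypothetical contraction: scale the radius down, with a floor.

    Assumption for illustration only; the paper's rule may differ.
    """
    return Ball(list(ball.center), max(ball.radius * shrink, min_radius))


def update(ball: Ball, point: Sequence[float], shrink: float = 0.9) -> Tuple[Ball, bool]:
    """Process one sample once: expand if it lies outside, else contract.

    Returns the new ball and a flag telling whether an expansion happened.
    """
    if distance(ball.center, point) > ball.radius:
        return expand(ball, point), True
    return contract(ball, shrink), False


if __name__ == "__main__":
    ball = Ball([0.0, 0.0], 1.0)
    for sample in ([3.0, 0.0], [1.0, 0.0], [1.0, 0.5]):
        ball, expanded = update(ball, sample)
        print(sample, "expanded" if expanded else "contracted", ball)
```

Each sample is inspected once and only the center and radius are stored, which matches the single-pass, constant-memory setting the abstract describes. Because the radius shrinks whenever a point falls inside, points from a new concept are more likely to fall outside the ball and trigger an update.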
Pages: 6725-6743
Page count: 19