Online learning from capricious data streams via shared and new feature spaces

Citations: 0
Authors
Zhou, Peng [1 ,2 ,3 ]
Zhang, Shuai [4 ]
Mu, Lin [1 ,2 ,3 ]
Yan, Yuanting [1 ,2 ,3 ]
Affiliations
[1] Anhui Univ, Minist Educ, Key Lab Intelligent Comp & Signal Proc, Hefei 230601, Anhui, Peoples R China
[2] Anhui Univ, Sch Comp Sci & Technol, Hefei 230601, Anhui, Peoples R China
[3] Informat Mat & Intelligent Sensing Lab Anhui Prov, Hefei 230601, Anhui, Peoples R China
[4] Xuzhou Vocat Coll Ind Technol, Sch Informat Engn, Xuzhou 221140, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Online learning; Dynamic feature spaces; Capricious data streams;
DOI
10.1007/s10489-024-05681-x
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data streams are sequences of data generated at a high rate over a continuous period; they arise in applications such as social media analysis, financial transaction monitoring, and sensor data processing. Most existing data stream mining methods assume that the feature space is either fixed or changes in a regular pattern, as in trapezoidal or evolving data streams. However, these assumptions do not hold in real-world applications, where data streams may exhibit arbitrarily missing features. To address arbitrarily missing features, we propose the Online Learning from Capricious Data Streams (OLCDS) algorithm and its variant, OLCDS-I. Specifically, OLCDS first identifies the features with higher uncertainty, which provide more information for the optimization model. Then, based on the shared and new feature spaces, we formulate a constrained optimization problem using the soft-margin technique. We derive the update rules and use model sparsity to retain the features essential for classifier learning. Compared to existing online learning approaches, our method requires no assumptions about the feature space and avoids generating missing features. Extensive experiments against five state-of-the-art methods on ten real-world datasets demonstrate the effectiveness and efficiency of our algorithms.
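The abstract does not give the OLCDS update rules, so the following is only a rough illustration of the general idea it describes: a generic passive-aggressive (PA-I) soft-margin update applied to instances whose feature sets vary arbitrarily, followed by magnitude-based truncation for sparsity. It is not the authors' algorithm and omits the uncertainty-based feature selection entirely. The function name `pa_update_dynamic` and the parameters `C` and `keep` are hypothetical names chosen for this sketch.

```python
def pa_update_dynamic(w, x, y, C=1.0, keep=None):
    """One online step on an instance with an arbitrary feature set.

    w    : dict mapping feature id -> weight (the current model)
    x    : dict mapping feature id -> value (only observed features)
    y    : label in {-1, +1}
    C    : soft-margin aggressiveness cap (assumed PA-I style)
    keep : optional maximum number of weights to retain (sparsity)
    Returns the updated weight dict and the pre-update score.
    """
    w = dict(w)  # work on a copy so the caller's model is not mutated
    shared = [f for f in x if f in w]      # features the model already knows
    new = [f for f in x if f not in w]     # features seen for the first time

    # Predict using the shared feature space only; new features have no weight yet.
    score = sum(w[f] * x[f] for f in shared)
    loss = max(0.0, 1.0 - y * score)       # hinge loss
    sq_norm = sum(v * v for v in x.values())

    if loss > 0.0 and sq_norm > 0.0:
        tau = min(C, loss / sq_norm)       # PA-I step size, capped by C
        for f in shared:
            w[f] += tau * y * x[f]
        for f in new:
            w[f] = tau * y * x[f]          # new weights start from zero

    # Sparsity: keep only the largest-magnitude weights.
    if keep is not None and len(w) > keep:
        top = sorted(w, key=lambda f: abs(w[f]), reverse=True)[:keep]
        w = {f: w[f] for f in top}
    return w, score


# Two instances with different, partially overlapping feature sets.
model = {}
model, _ = pa_update_dynamic(model, {"a": 1.0, "b": 2.0}, +1)
model, _ = pa_update_dynamic(model, {"b": 1.0, "c": 3.0}, -1, keep=2)
```

Storing weights in a dictionary keyed by feature id means no fixed dimensionality is assumed and missing features are simply absent rather than filled in, which mirrors the setting the abstract describes.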
Pages: 9429-9445
Number of pages: 17