An adaptive XGBoost-based optimized sliding window for concept drift handling in non-stationary spatiotemporal data streams classifications
Cited by: 3
Authors:
Angbera, Ature [1,2]
Chan, Huah Yong [1]
Affiliations:
[1] Univ Sains Malaysia, Sch Comp Sci, Minden 11800, Pulau Pinang, Malaysia
[2] Joseph Sarwuan Tarka Univ, Dept Comp Sci, Makurdi, Nigeria
Source:
Keywords:
Concept drift;
Machine learning;
Sliding windows;
Spatiotemporal data streams;
Bayesian optimization;
DOI:
10.1007/s11227-023-05729-8
Chinese Library Classification:
TP3 [Computing technology, computer technology];
Discipline classification code:
0812;
Abstract:
In recent years, the popularity of using data science for decision-making has grown significantly. This rise in popularity has led to a significant learning challenge known as concept drifting, primarily due to the increasing use of spatial and temporal data streaming applications. Concept drift can have highly negative consequences, leading to the degradation of models used in these applications. A new model called BOASWIN-XGBoost (Bayesian Optimized Adaptive Sliding Window and XGBoost) has been introduced in this work to handle concept drift. This model is designed explicitly for classifying streaming data and comprises three main procedures: pre-processing, concept drift detection, and classification. The BOASWIN-XGBoost model utilizes a method called Bayesian-Optimized Adaptive Sliding Window (BOASWIN) to identify the presence of concept drift in the streaming data. Additionally, it employs an optimized XGBoost (eXtreme Gradient Boosting) model for classification purposes. The hyperparameter tuning approach known as BO-TPE (Bayesian Optimization with Tree-structured Parzen Estimator) is employed to fine-tune the XGBoost model's parameters, thus enhancing the classifier's performance. Seven streaming datasets were used to evaluate the proposed approach's performance, including Agrawal_a, Agrawal_g, SEA_a, SEA_g, Hyperplane, Phishing, and Weather. The simulation results demonstrate that the suggested model achieves impressive accuracy values of 70.83%, 71.02%, 76.76%, 76.96%, 84.26%, 95.53%, and 78.35% on the corresponding datasets, affirming its superior performance in handling concept drift and classifying streaming data.
Pages: 7781-7811
Page count: 31
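
The abstract names a sliding-window drift detector but gives no algorithmic detail, so the following is only a minimal sketch of the general idea, not the BOASWIN procedure itself. It assumes drift is flagged when accuracy over the most recent window falls below a fixed fraction of the best window accuracy seen since the last drift. The names `WindowDriftDetector`, `first_drift_index`, `window_size`, and `drop_ratio` are hypothetical. In the paper these window settings are tuned by Bayesian optimization and the classifier is XGBoost; here the settings are fixed by hand and the classifier is replaced by a stream of correct/incorrect outcomes.

```python
from collections import deque


class WindowDriftDetector:
    """Simplified sliding-window drift check (illustrative, not BOASWIN).

    Signals drift when accuracy over the most recent window drops below
    drop_ratio times the best window accuracy seen since the last drift.
    """

    def __init__(self, window_size=100, drop_ratio=0.9):
        # Hypothetical parameters; the paper tunes its window settings
        # with Bayesian optimization rather than fixing them by hand.
        self.window = deque(maxlen=window_size)
        self.drop_ratio = drop_ratio
        self.best_accuracy = 0.0

    def update(self, correct):
        """Record one prediction outcome; return True if drift is flagged."""
        self.window.append(1 if correct else 0)
        if len(self.window) < self.window.maxlen:
            return False  # not enough evidence yet
        accuracy = sum(self.window) / len(self.window)
        if accuracy < self.drop_ratio * self.best_accuracy:
            # Drift: forget the old concept and start a fresh window.
            self.window.clear()
            self.best_accuracy = 0.0
            return True
        self.best_accuracy = max(self.best_accuracy, accuracy)
        return False


def first_drift_index(outcomes, **kwargs):
    """Return the stream index at which drift is first flagged, or None."""
    detector = WindowDriftDetector(**kwargs)
    for i, correct in enumerate(outcomes):
        if detector.update(correct):
            return i
    return None
```

In a full pipeline, a drift signal like this would typically trigger retraining of the classifier on recent samples. With 200 correct predictions followed by 200 incorrect ones, `first_drift_index(stream, window_size=50, drop_ratio=0.9)` flags drift a few samples after the change point, once enough errors have entered the window.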