Fusion k-means clustering and multi-head self-attention mechanism for a multivariate time prediction model with feature selection

Cited by: 2
Authors
Cai, Mingwei [1]
Zhan, Jianming [1]
Zhang, Chao [2]
Liu, Qi [1]
Affiliations
[1] Hubei Minzu Univ, Sch Math & Stat, Enshi 445000, Hubei, Peoples R China
[2] Shanxi Univ, Sch Comp & Informat Technol, Key Lab Computat Intelligence & Chinese Informat P, Taiyuan 030006, Shanxi, Peoples R China
Keywords
k-means clustering; Multi-head self-attention mechanism; Feature selection; LSTM; Transformer
DOI
10.1007/s13042-024-02490-z
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Advances in sensor technology and computer hardware have raised the demand for precise predictions across industries, and multi-feature time series prediction shows significant promise in fields such as information fusion, finance, energy, and meteorology. However, traditional machine learning methods often struggle to forecast future events as the data grow more complex. To address this challenge, the paper introduces an approach that combines an improved k-means clustering algorithm with a multi-head self-attention mechanism and uses long short-term memory (LSTM) neural networks to filter and identify the most effective feature subset for prediction. For the improved k-means algorithm, the paper proposes a novel similarity formula, Feature Vector Similarity (FVS), and a method for automatically determining the number of cluster centers. These changes make the selection of cluster centers more rational and improve overall clustering performance. The multi-head self-attention mechanism calculates the clustering centers and attention weights of objects within the cluster partitions, optimizing feature selection and enhancing computational efficiency. The fusion of k-means clustering, the multi-head self-attention mechanism, and LSTM networks yields a new feature selection method, referred to as KMAL. To further refine the prediction process, the paper integrates KMAL with LSTM, known for its strong performance in predicting long-term time series, to develop a novel prediction model: KMAL-LSTM. In comparative experiments, prediction performance is assessed using mean absolute error (MAE), mean bias error (MBE), and root mean square error (RMSE). The proposed KMAL-LSTM model consistently exhibits superior validity, stability, and performance compared with seven other prediction models across six distinct datasets.
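The three evaluation metrics named in the abstract have standard textbook definitions, which can be sketched as follows. This is a minimal illustration and not the authors' evaluation code: the function names and the sample values are hypothetical, and the sign convention for MBE (predicted minus actual) is an assumption, since the abstract does not state which convention the paper uses.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average magnitude of the errors."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def mbe(actual, predicted):
    """Mean bias error: average signed error.

    Assumed convention: predicted minus actual, so a positive value
    means the model over-predicts on average.
    """
    return sum(p - a for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error: penalizes large errors more than MAE."""
    squared = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical sample values, for illustration only.
actual = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 3.0, 8.0, 8.0]

print(mae(actual, predicted))   # 1.0
print(mbe(actual, predicted))   # 0.5
print(rmse(actual, predicted))  # sqrt(1.5), about 1.2247
```

MAE and RMSE measure error magnitude, while MBE can be close to zero even for a poor model when over- and under-predictions cancel, which is one reason the three are typically reported together.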
Pages: 19