A graph-based big data optimization approach using hidden Markov model and constraint satisfaction problem

Cited by: 0
Authors
Imad Sassi
Samir Anter
Abdelkrim Bekkhoucha
Affiliations
[1] Computer Science Laboratory (LIM)
[2] FSTM
[3] Hassan II University
Source
Journal of Big Data, Volume 8
Keywords
Machine learning; Big data analytics; Optimization; Metaheuristics; Time series forecasting; Graphical modeling
DOI
Not available
Abstract
To address the challenges of big data analytics, several works have focused on big data optimization using metaheuristics. The constraint satisfaction problem (CSP) is a fundamental concept of metaheuristics that has shown great efficiency in several fields. Hidden Markov models (HMMs) are powerful machine learning algorithms that are applied especially frequently in time series analysis. However, one issue in forecasting time series using HMMs is how to reduce the search space (state and observation space). To address this issue, we propose a graph-based big data optimization approach using a CSP to enhance the results of learning and prediction tasks of HMMs. This approach takes full advantage of both HMMs, with the richness of their algorithms, and CSPs, with their many powerful and efficient solver algorithms. To verify the validity of the model, the proposed approach is evaluated on real-world data using the mean absolute percentage error (MAPE) and other metrics as measures of the prediction accuracy. The conducted experiments show that the proposed model outperforms the conventional model. It reduces the MAPE by 0.71% and offers a particularly good trade-off between computational costs and the quality of results for large datasets. It is also competitive with benchmark models in terms of the running time and prediction accuracy. Further comparisons substantiate these experimental findings.
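The abstract reports accuracy in terms of MAPE but does not state the formula. As a minimal sketch, assuming the standard definition (the mean of |actual - predicted| / |actual|, expressed as a percentage), the metric could be computed as follows. The function name `mape` and the sample values are illustrative assumptions and are not taken from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent.

    Assumes the standard definition; the paper's exact variant is not
    given in the abstract.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")

    total = 0.0
    for a, p in zip(actual, predicted):
        if a == 0:
            # MAPE is undefined when a true value is zero.
            raise ValueError("actual values must be non-zero")
        total += abs((a - p) / a)
    return 100.0 * total / len(actual)


# Illustrative values only (not data from the paper).
observed = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
score = mape(observed, forecast)  # roughly 5 percent for these values
```

Because each error is divided by the true value, the metric is scale-independent but breaks down when a true value is zero, which is why the sketch rejects such inputs instead of silently skipping them.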