Comparative Analysis of Machine Learning Models for Performance Prediction of the SPEC Benchmarks

Cited by: 5
Authors
Tousi, Ashkan [1]
Lujan, Mikel [1]
Affiliations
[1] Univ Manchester, Dept Comp Sci, Manchester M13 9PL, Lancs, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Benchmark testing; Predictive models; Data models; Feature extraction; Software; Hardware; Analytical models; Machine learning; performance analysis; predictive models; SPEC CPU2017; supervised learning; REGRESSION; SELECTION;
DOI
10.1109/ACCESS.2022.3142240
Chinese Library Classification
TP [Automation and computer technology];
Discipline classification code
0812;
Abstract
Simulation-based performance prediction is cumbersome and time-consuming. An alternative approach is to use supervised learning to predict the performance scores of Standard Performance Evaluation Corporation (SPEC) benchmarks. SPEC CPU2017 contains a public dataset of results obtained by executing 43 standardised performance benchmarks, organised into 4 suites, on various system configurations. This paper analyses the dataset and aims to answer the following questions: I) can we accurately predict the SPEC results based on the configurations provided in the dataset, without having to actually run the benchmarks? II) what are the most important hardware and software features? III) what are the best predictive models and hyperparameters, in terms of prediction error and time? and IV) can we predict the performance of future systems using past data? We present how to prepare data, select features, tune hyperparameters and evaluate regression models based on Multi-Task Elastic-Net, Decision Tree, Random Forest, and Multi-Layer Perceptron neural network estimators. Feature selection is performed in three steps: removing zero-variance features, removing highly correlated features, and Recursive Feature Elimination based on three feature importance metrics (elastic-net coefficients, tree-based importance measures and Permutation Importance). We select the best models using grid search on the hyperparameter space and, finally, compare and evaluate the performance of the models. We show that tree-based models with the original 29 features provide accurate predictions with an average error of less than 4%. The average error of the faster Decision Tree and Random Forest models with 10 features is still below 6% and 5% respectively.
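The first two feature-selection steps named in the abstract (dropping zero-variance features, then dropping highly correlated ones) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names, the 0.95 correlation cut-off, and the toy feature names and values are all assumptions made for illustration. Recursive Feature Elimination and grid search are not shown.

```python
from math import sqrt
from statistics import mean, pvariance


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    norm = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return cov / norm if norm else 0.0


def prefilter_features(columns, corr_threshold=0.95):
    """Return the names of features that survive both filtering steps.

    columns: dict mapping feature name -> list of numeric values.
    corr_threshold: hypothetical cut-off, not taken from the paper.
    """
    # Step 1: drop features whose values never change.
    varying = [name for name, vals in columns.items() if pvariance(vals) > 0]

    # Step 2: keep a feature only if it is not highly correlated
    # with any feature that has already been kept.
    kept = []
    for name in varying:
        if all(abs(pearson(columns[name], columns[k])) < corr_threshold for k in kept):
            kept.append(name)
    return kept


# Toy data with made-up feature names and values (illustration only).
toy = {
    "cores": [4, 8, 16, 32, 64],
    "threads": [8, 16, 32, 64, 128],     # exactly 2 * cores
    "memory_gb": [16, 64, 32, 256, 128],
    "sockets": [1, 1, 1, 1, 1],          # zero variance
}
selected = prefilter_features(toy)
print(selected)
```

In this toy run, `sockets` is removed for having zero variance and `threads` is removed for being perfectly correlated with `cores`. Which feature of a correlated pair survives depends on iteration order, a design choice of this sketch rather than something the abstract specifies.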
Pages: 11994-12011
Page count: 18