Effects of Different Training Datasets on Machine Learning Models for Pavement Performance Prediction

Cited by: 1
Authors
Aranha, Ana Luisa [1]
Bernucci, Liedi Legi Bariani [1]
Vasconcelos, Kamilla L. [1]
Affiliations
[1] Univ Sao Paulo, Dept Transportat Engn, Sao Paulo, Brazil
Keywords
data and data science; machine learning (artificial intelligence); infrastructure; infrastructure management and system preservation; pavement management systems; pavement performance
DOI
10.1177/03611981231155902
Chinese Library Classification (CLC) number
TU [Building Science]
Discipline classification code
0813
Abstract
With improvements in data collection, storage, and processing, machine learning (ML) is gaining momentum as a behavior prediction method in the field of engineering. Several studies have evaluated these algorithms' potential to predict pavement serviceability; however, some challenges limit their use. Training data preprocessing has a great impact on a model's predictive performance, is highly dependent on the modeler's experience, and is not typically reported in engineering-related literature. The objective of this study was to assess the effects of data preprocessing, hyperparameter selection, and time series size on the model's evaluation metrics. To that end, this paper analyzes the performance of three ML algorithms on maximum deflection (D₀) and international roughness index (IRI) prediction: support vector machine, random forest (RF), and artificial neural network (ANN). An R² and mean square error (MSE) analysis was conducted on 12 training datasets, with two sizes of historical data and five stages of data preprocessing. The results indicated that ANN was the most accurate technique, with an R² of 0.99 and MSE of 20 × 10⁻³ mm on the D₀ prediction and an R² of 0.91 and MSE of 0.03 m/km on the IRI prediction. RF was also identified as an effective technique, generating similar results with less data preprocessing. The addition of structural and traffic categorical features to the training dataset resulted in the most significant improvement in the support vector regression and ANN performance metrics; hyperparameter selection was effective only on IRI prediction, especially with the ANN algorithm.
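The two evaluation metrics named in the abstract, R² and MSE, can be illustrated with a minimal sketch. It uses the standard textbook definitions and is not taken from the paper; the function names and the IRI values are hypothetical and invented purely for illustration.

```python
def mean_squared_error(observed, predicted):
    """Mean of the squared residuals between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical IRI measurements and predictions (m/km), not data from the study.
observed_iri = [1.0, 2.0, 3.0, 4.0]
predicted_iri = [1.1, 1.9, 3.2, 3.8]

print("MSE:", mean_squared_error(observed_iri, predicted_iri))
print("R^2:", r_squared(observed_iri, predicted_iri))
```

An R² close to 1 together with a small MSE indicates that predictions track the observations closely, which is how the abstract ranks the three algorithms.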
Pages: 196-206
Number of pages: 11