Effects of Different Training Datasets on Machine Learning Models for Pavement Performance Prediction

Cited by: 1
Authors
Aranha, Ana Luisa [1]
Bernucci, Liedi Legi Bariani [1]
Vasconcelos, Kamilla L. [1]
Affiliations
[1] Univ Sao Paulo, Dept Transportat Engn, Sao Paulo, Brazil
Keywords
data and data science; machine learning (artificial intelligence); infrastructure; infrastructure management and system preservation; pavement management systems; pavement performance
DOI
10.1177/03611981231155902
Chinese Library Classification (CLC) number
TU [Building Science]
Discipline classification code
0813
Abstract
With improvements in data collection, storage, and processing, machine learning (ML) is gaining momentum as a behavior prediction method in engineering. Several studies have evaluated the potential of ML algorithms to predict pavement serviceability; however, some challenges limit their use. Training data preprocessing has a great impact on a model's predictive performance, is highly dependent on the modeler's experience, and is not typically reported in engineering-related literature. The objective of this study was to assess the effects of data preprocessing, hyperparameter selection, and time series size on the model's evaluation metrics. To that end, this paper analyzes the performance of three ML algorithms in predicting maximum deflection (D₀) and international roughness index (IRI): support vector machine, random forest (RF), and artificial neural network (ANN). An R² and mean square error (MSE) analysis was conducted on 12 training datasets, with two sizes of historical data and five stages of data preprocessing. The results indicated that ANN was the most accurate technique, with an R² of 0.99 and MSE of 20 × 10⁻³ mm for D₀ prediction and an R² of 0.91 and MSE of 0.03 m/km for IRI prediction. RF was also identified as an effective technique, generating similar results with less data preprocessing. The addition of structural and traffic categorical features to the training dataset resulted in the most significant improvement in the support vector regression and ANN performance metrics; hyperparameter selection was effective only for IRI prediction, especially with the ANN algorithm.
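The two evaluation metrics named in the abstract, R² and MSE, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the sample IRI values are hypothetical and chosen only to show how the metrics are computed from observed and predicted values.

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of the squared residuals between observed and predicted values."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical IRI values (m/km), invented for illustration only.
observed_iri = [1.0, 2.0, 3.0, 4.0]
predicted_iri = [1.0, 2.0, 3.0, 5.0]

mse = mean_squared_error(observed_iri, predicted_iri)
r2 = r_squared(observed_iri, predicted_iri)
print(f"MSE = {mse:.3f}, R^2 = {r2:.3f}")
```

A lower MSE and an R² closer to 1 indicate a closer fit, which is the sense in which the abstract compares the three algorithms across training datasets.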
Pages: 196-206
Number of pages: 11
Related papers
50 items in total
  • [21] Creating Rutting Prediction Models through Machine Learning Techniques Utilizing the Long-Term Pavement Performance Database
    Alnaqbi, Ali Juma
    Zeiada, Waleed
    Al-Khateeb, Ghazi G.
    Hamad, Khaled
    Barakat, Samer
    SUSTAINABILITY, 2023, 15 (18)
  • [22] Comparative Analysis of Machine Learning Models for Prediction of Remaining Service Life of Flexible Pavement
    Nabipour, Narjes
    Karballaeezadeh, Nader
    Dineva, Adrienn
    Mosavi, Amir
    Mohammadzadeh, Danial S.
    Shamshirband, Shahaboddin
    MATHEMATICS, 2019, 7 (12)
  • [23] Performance Comparison of Machine Learning Models for Annual Precipitation Prediction Using Different Decomposition Methods
    Song, Chao
    Chen, Xiaohong
    REMOTE SENSING, 2021, 13 (05) : 1 - 27
  • [24] Prediction of Therapeutic Peptides Using Machine Learning: Computational Models, Datasets, and Feature Encodings
    Attique, Muhammad
    Farooq, Muhammad Shoaib
    Khelifi, Adel
    Abid, Adnan
    IEEE ACCESS, 2020, 8 (08) : 148570 - 148594
  • [25] Elastic Modulus Prediction of Ultra-High-Performance Concrete with Different Machine Learning Models
    Zhang, Chaohui
    Liu, Peng
    Song, Tiantian
    He, Bin
    Li, Wei
    Peng, Yuansheng
    BUILDINGS, 2024, 14 (10)
  • [26] The Impact of Multi-Institution Datasets on the Generalizability of Machine Learning Prediction Models in the ICU
    Rockenschaub, Patrick
    Hilbert, Adam
    Kossen, Tabea
    Elbers, Paul
    von Dincklage, Falk
    Madai, Vince Istvan
    Frey, Dietmar
    CRITICAL CARE MEDICINE, 2024, 52 (11) : 1710 - 1721
  • [27] Different learning predictors and their effects for Moodle Machine Learning models
    Bognar, Laszlo
    Fauszt, Tibor
    2020 11TH IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFOCOMMUNICATIONS (COGINFOCOM 2020), 2020 : 405 - 409
  • [28] Machine Learning-Based Prediction Models for Different Clinical Risks in Different Hospitals: Evaluation of Live Performance
    Sun, Hong
    Depraetere, Kristof
    Meesseman, Laurent
    Silva, Patricia Cabanillas
    Szymanowsky, Ralph
    Fliegenschmidt, Janis
    Hulde, Nikolai
    von Dossow, Vera
    Vanbiervliet, Martijn
    De Baerdemaeker, Jos
    Roccaro-Waldmeyer, Diana M.
    Stieg, Jorg
    Hidalgo, Manuel Dominguez
    Dahlweid, Fried-Michael
    JOURNAL OF MEDICAL INTERNET RESEARCH, 2022, 24 (06)
  • [29] Comparison of Machine Learning Algorithms on Different Datasets
    Uysal, Elif
    Ozturk, Ali
    2018 26TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2018
  • [30] Enhancing Machine Learning Training Performance in Smart Agriculture Datasets Using a Mobile App
    Zarymkanov, Temirlan
    Kargar, Amin
    Pinotti, Cristina M.
    O'Flynn, Brendan
    Zorbas, Dimitrios
    PROCEEDINGS OF 2023 IEEE INTERNATIONAL WORKSHOP ON METROLOGY FOR AGRICULTURE AND FORESTRY, METROAGRIFOR, 2023 : 455 - 460