Predicting and Analyzing Road Traffic Injury Severity Using Boosting-Based Ensemble Learning Models with SHapley Additive exPlanations

Cited by: 49
Authors
Dong, Sheng [1]
Khattak, Afaq [2]
Ullah, Irfan [3]
Zhou, Jibiao [4]
Hussain, Arshad [5]
Affiliations
[1] Ningbo Univ Technol, Sch Civil & Transportat Engn, Fenghua Rd 201, Ningbo 315211, Peoples R China
[2] Tongji Univ, Minist Educ, Key Lab Rd & Traff Engn, 4800 Caoan Rd, Shanghai 201804, Peoples R China
[3] Int Islamic Univ, Dept Civil Engn, Sect H-10, Islamabad 1243, Pakistan
[4] Tongji Univ, Coll Transportat Engn, 4800 Caoan Rd, Shanghai 201804, Peoples R China
[5] Natl Univ Sci & Technol, NUST Inst Civil Engn, Sect H-12, Islamabad 44000, Pakistan
Funding
National Natural Science Foundation of China;
Keywords
traffic safety; road traffic injuries; boosting-based ensemble models; SHapley Additive exPlanations; DECISION TREE; CRASH COUNTS; DRIVERS; CLASSIFICATION; ACCIDENTS; TIME;
DOI
10.3390/ijerph19052925
Chinese Library Classification number
X [Environmental Science, Safety Science];
Discipline classification code
08; 0830;
Abstract
Road traffic accidents are one of the world's most serious problems, resulting in numerous fatalities, injuries, and economic losses each year. Assessing the factors that contribute to the severity of road traffic injuries can improve understanding of, and help mitigate, the risk of serious injuries associated with crashes. While ensemble learning approaches can capture complex, non-linear relationships between input risk variables and outcomes for injury severity prediction and classification, most of them share a critical limitation: their "black-box" nature. To develop interpretable predictive models for road traffic injury severity, this paper proposes four boosting-based ensemble learning models, namely a novel Natural Gradient Boosting, Adaptive Gradient Boosting, Categorical Gradient Boosting, and Light Gradient Boosting Machine (LightGBM), and uses the recently developed SHapley Additive exPlanations (SHAP) analysis to rank the risk variables and explain the optimal model. Among the four models, LightGBM achieved the highest classification accuracy (73.63%), precision (72.61%), recall (70.09%), F1-score (70.81%), and AUC (0.71) when tested on 2015-2019 accident data from Pakistan's National Highway N-5 (Peshawar to Rahim Yar Khan section). Incorporating the SHAP approach made it possible to interpret the model's estimation results from both global and local perspectives. The interpretation showed that Month_of_Year, Cause_of_Accident, Driver_Age, and Collision_Type all played a significant role in the estimation process. According to the analysis, young drivers and pedestrians struck by a trailer have a higher risk of suffering fatal injuries. The combination of trailers and passenger vehicles, as well as driver at-fault, hitting pedestrians and rear-end collisions, significantly increases the risk of fatal injuries. This study suggests that combining LightGBM and SHAP has the potential to develop an interpretable model for predicting road traffic injury severity.
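The "local" and "global" perspectives mentioned in the abstract can be illustrated with a minimal sketch. It computes exact Shapley values by brute force for a made-up three-feature scoring function; it is not the paper's LightGBM model, data, or the SHAP library's tree algorithm. The function `toy_risk`, its coefficients, the feature names, and the all-zero baseline are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline.

    Features outside a coalition are replaced by their baseline value
    (an assumption of this sketch; SHAP averages over background data).
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(z):
    """Hypothetical risk score with one interaction term (not from the paper)."""
    young_driver, trailer, pedestrian = z
    return 0.1 + 0.2 * young_driver + 0.1 * trailer + 0.3 * trailer * pedestrian


baseline = [0, 0, 0]
instances = [[1, 1, 1], [0, 1, 0], [1, 0, 1]]

# Local view: one attribution vector per instance
local = [shapley_values(toy_risk, x, baseline) for x in instances]

# Global view: mean absolute attribution per feature across instances
global_importance = [
    sum(abs(phi[i]) for phi in local) / len(local) for i in range(3)
]
```

For each instance the attributions sum to the difference between the instance's score and the baseline score, which is the additivity property that lets per-instance explanations be aggregated into a feature ranking. Brute-force enumeration grows exponentially with the number of features, so it is practical only for toy cases like this one.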
Pages: 23
Related papers
50 in total
  • [1] Predicting and analyzing flood susceptibility using boosting-based ensemble machine learning algorithms with SHapley Additive exPlanations
    Halit Enes Aydin
    Muzaffer Can Iban
    Natural Hazards, 2023, 116 : 2957 - 2991
  • [2] Predicting and analyzing flood susceptibility using boosting-based ensemble machine learning algorithms with SHapley Additive exPlanations
    Aydin, Halit Enes
    Iban, Muzaffer Can
    NATURAL HAZARDS, 2023, 116 (03) : 2957 - 2991
  • [3] Predicting and Interpreting Student Performance Using Ensemble Models and Shapley Additive Explanations
    Sahlaoui, Hayat
    Alaoui, El Arbi Abdellaoui
    Nayyar, Anand
    Agoujil, Said
    Jaber, Mustafa Musa
    IEEE ACCESS, 2021, 9 : 152688 - 152703
  • [4] Interpretable predictive modelling of outlet temperatures in Central Alberta's hydrothermal system using boosting-based ensemble learning incorporating Shapley Additive exPlanations approach
    Yu, Ruyang
    Zhang, Kai
    Li, Tao
    Jiang, Shu
    ENERGY, 2025, 318
  • [5] Analyzing Pile-Up Crash Severity: Insights from Real-Time Traffic and Environmental Factors Using Ensemble Machine Learning and Shapley Additive Explanations Method
    Samerei, Seyed Alireza
    Aghabayk, Kayvan
    Montella, Alfonso
    SAFETY, 2024, 10 (01)
  • [6] Electricity Consumption Forecasting: An Approach Using Cooperative Ensemble Learning with SHapley Additive exPlanations
    Alba, Eduardo Luiz
    Oliveira, Gilson Adamczuk
    Ribeiro, Matheus Henrique Dal Molin
    Rodrigues, Erick Oliveira
    FORECASTING, 2024, 6 (03) : 839 - 863
  • [7] Boosting-based ensemble machine learning models for predicting unconfined compressive strength of geopolymer stabilized clayey soil
    Abdullah, Gamil M. S.
    Ahmad, Mahmood
    Babur, Muhammad
    Badshah, Muhammad Usman
    Al-Mansob, Ramez A.
    Gamil, Yaser
    Fawad, Muhammad
    SCIENTIFIC REPORTS, 2024, 14 (01)
  • [8] Shapley-Additive-Explanations-Based Factor Analysis for Dengue Severity Prediction using Machine Learning
    Chowdhury, Shihab Uddin
    Sayeed, Sanjana
    Rashid, Iktisad
    Alam, Md Golam Rabiul
    Masum, Abdul Kadar Muhammad
    Dewan, M. Ali Akber
    JOURNAL OF IMAGING, 2022, 8 (09)
  • [9] Boosting-based ensemble machine learning models for predicting unconfined compressive strength of geopolymer stabilized clayey soil
    Gamil M. S. Abdullah
    Mahmood Ahmad
    Muhammad Babur
    Muhammad Usman Badshah
    Ramez A. Al-Mansob
    Yaser Gamil
    Muhammad Fawad
    Scientific Reports, 14
  • [10] Explaining deep learning-based activity schedule models using SHapley Additive exPlanations
    Koushik, Anil
    Manoj, M.
    Nezamuddin, N.
    TRANSPORTATION LETTERS-THE INTERNATIONAL JOURNAL OF TRANSPORTATION RESEARCH, 2024,