Predicting and Analyzing Road Traffic Injury Severity Using Boosting-Based Ensemble Learning Models with SHapley Additive exPlanations

Cited by: 53
Authors
Dong, Sheng [1]
Khattak, Afaq [2]
Ullah, Irfan [3]
Zhou, Jibiao [4]
Hussain, Arshad [5]
Affiliations
[1] Ningbo Univ Technol, Sch Civil & Transportat Engn, Fenghua Rd 201, Ningbo 315211, Peoples R China
[2] Tongji Univ, Minist Educ, Key Lab Rd & Traff Engn, 4800 Caoan Rd, Shanghai 201804, Peoples R China
[3] Int Islamic Univ, Dept Civil Engn, Sect H-10, Islamabad 1243, Pakistan
[4] Tongji Univ, Coll Transportat Engn, 4800 Caoan Rd, Shanghai 201804, Peoples R China
[5] Natl Univ Sci & Technol, NUST Inst Civil Engn, Sect H-12, Islamabad 44000, Pakistan
Funding
National Natural Science Foundation of China
Keywords
traffic safety; road traffic injuries; boosting-based ensemble models; SHapley Additive exPlanations; DECISION TREE; CRASH COUNTS; DRIVERS; CLASSIFICATION; ACCIDENTS; TIME;
DOI
10.3390/ijerph19052925
Chinese Library Classification
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Road traffic accidents are one of the world's most serious problems, causing numerous fatalities and injuries as well as economic losses each year. Assessing the factors that contribute to the severity of road traffic injuries has proven insightful: the findings may improve understanding of, and help mitigate, the risk of serious injuries associated with crashes. While ensemble learning approaches can establish complex, non-linear relationships between input risk variables and outcomes for injury severity prediction and classification, most of them share a critical limitation: their "black-box" nature. To develop interpretable predictive models for road traffic injury severity, this paper proposes four boosting-based ensemble learning models, namely a novel Natural Gradient Boosting, Adaptive Gradient Boosting, Categorical Gradient Boosting, and Light Gradient Boosting Machine (LightGBM), and uses the recently developed SHapley Additive exPlanations (SHAP) analysis to rank the risk variables and explain the optimal model. Among the four models, LightGBM achieved the highest classification accuracy (73.63%), precision (72.61%), recall (70.09%), F1-score (70.81%), and AUC (0.71) when tested on 2015-2019 accident data from Pakistan's National Highway N-5 (Peshawar to Rahim Yar Khan section). Incorporating the SHAP approach made it possible to interpret the model's estimation results from both global and local perspectives. The interpretation showed that Month_of_Year, Cause_of_Accident, Driver_Age, and Collision_Type all played a significant role in the estimation process. According to the analysis, young drivers and pedestrians struck by a trailer have a higher risk of suffering fatal injuries. The combination of trailers and passenger vehicles, as well as driver at-fault, hitting pedestrians, and rear-end collisions, significantly increases the risk of fatal injuries. This study suggests that combining LightGBM and SHAP has the potential to develop an interpretable model for predicting road traffic injury severity.
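The local interpretation mentioned in the abstract rests on Shapley values, which split one prediction into per-feature contributions. The following is a minimal sketch of that idea in plain Python, not the authors' implementation: the scoring function `toy_risk`, the feature names, and the instance and baseline values are all hypothetical stand-ins for a trained model and real crash records, and the exhaustive enumeration used here is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline instance.

    Features outside a coalition are set to their baseline value.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                with_f = {f: x[f] if (f in coalition or f == name) else baseline[f]
                          for f in names}
                without_f = {f: x[f] if f in coalition else baseline[f]
                             for f in names}
                total += weight * (predict(with_f) - predict(without_f))
        phi[name] = total
    return phi


def toy_risk(row):
    """Hypothetical risk score; NOT the paper's LightGBM model."""
    score = 0.1
    young = row["driver_age"] < 25
    pedestrian = row["collision_type"] == "pedestrian"
    if young:
        score += 0.2
    if pedestrian:
        score += 0.3
    if young and pedestrian:
        score += 0.1  # interaction term, shared between the two features
    return score


# Hypothetical crash record and reference record
instance = {"driver_age": 22, "collision_type": "pedestrian", "month": 7}
baseline = {"driver_age": 40, "collision_type": "rear_end", "month": 1}

phi = shapley_values(toy_risk, instance, baseline)
for feature, value in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{feature}: {value:+.3f}")
```

The contributions sum to the difference between the instance's score and the baseline's score, and a feature the toy model never reads (`month`) receives a contribution of zero. A global ranking of the kind described in the abstract is commonly obtained by averaging the absolute contributions over many instances.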
Pages: 23