Predicting and Analyzing Road Traffic Injury Severity Using Boosting-Based Ensemble Learning Models with SHAPley Additive exPlanations

Cited by: 49
Authors
Dong, Sheng [1]
Khattak, Afaq [2]
Ullah, Irfan [3]
Zhou, Jibiao [4]
Hussain, Arshad [5]
Affiliations
[1] Ningbo Univ Technol, Sch Civil & Transportat Engn, Fenghua Rd 201, Ningbo 315211, Peoples R China
[2] Tongji Univ, Minist Educ, Key Lab Rd & Traff Engn, 4800 Caoan Rd, Shanghai 201804, Peoples R China
[3] Int Islamic Univ, Dept Civil Engn, Sect H-10, Islamabad 1243, Pakistan
[4] Tongji Univ, Coll Transportat Engn, 4800 Caoan Rd, Shanghai 201804, Peoples R China
[5] Natl Univ Sci & Technol, NUST Inst Civil Engn, Sect H-12, Islamabad 44000, Pakistan
Funding
National Natural Science Foundation of China
Keywords
traffic safety; road traffic injuries; boosting-based ensemble models; SHapley Additive exPlanations; DECISION TREE; CRASH COUNTS; DRIVERS; CLASSIFICATION; ACCIDENTS; TIME;
DOI
10.3390/ijerph19052925
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Road traffic accidents are among the world's most serious problems, causing numerous fatalities and injuries as well as economic losses each year. Assessing the factors that contribute to the severity of road traffic injuries can improve understanding of, and help mitigate, the risk of serious injuries associated with crashes. Ensemble learning approaches can capture complex, non-linear relationships between input risk variables and outcomes for injury severity prediction and classification, but most share a critical limitation: their "black-box" nature. To develop interpretable predictive models for road traffic injury severity, this paper proposes four boosting-based ensemble learning models, namely a novel Natural Gradient Boosting, Adaptive Gradient Boosting, Categorical Gradient Boosting, and Light Gradient Boosting Machine (LightGBM), and uses the recently developed SHapley Additive exPlanations (SHAP) analysis to rank the risk variables and explain the optimal model. Among the four models, LightGBM achieved the highest classification accuracy (73.63%), precision (72.61%), recall (70.09%), F1-score (70.81%), and AUC (0.71) when tested on 2015-2019 accident data from Pakistan's National Highway N-5 (Peshawar to Rahim Yar Khan section). Incorporating SHAP made it possible to interpret the model's estimation results from both global and local perspectives. The interpretation showed that Month_of_Year, Cause_of_Accident, Driver_Age, and Collision_Type all played a significant role in the estimation process. According to the analysis, young drivers and pedestrians struck by a trailer have a higher risk of suffering fatal injuries. The combination of trailers and passenger vehicles, as well as driver at-fault, hitting pedestrians, and rear-end collisions, significantly increases the risk of fatal injuries. This study suggests that combining LightGBM and SHAP has the potential to yield an interpretable model for predicting road traffic injury severity.
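The global and local interpretation mentioned in the abstract rests on Shapley values. The following is a minimal sketch of that idea in plain Python, not the authors' code and not the SHAP library. The scoring function `toy_model`, its coefficients, the 0/1 feature encoding, and the all-zero baseline are hypothetical and chosen only for illustration; only the four feature names come from the abstract. It computes exact Shapley values for one record (a local explanation) and ranks features by mean absolute Shapley value over several records (a global explanation).

```python
from itertools import combinations
from math import factorial

# Feature names are taken from the abstract; the 0/1 encoding is an assumption.
FEATURES = ["Month_of_Year", "Cause_of_Accident", "Driver_Age", "Collision_Type"]


def toy_model(x):
    """Hypothetical severity score. This is NOT the paper's LightGBM model."""
    score = 0.1
    score += 0.30 * x["Driver_Age"]          # assumed: 1 = young driver
    score += 0.20 * x["Collision_Type"]      # assumed: 1 = pedestrian hit
    score += 0.15 * x["Cause_of_Accident"]   # assumed: 1 = driver at fault
    score += 0.05 * x["Month_of_Year"]       # assumed: 1 = some high-risk month
    score += 0.25 * x["Driver_Age"] * x["Collision_Type"]  # interaction term
    return score


def shapley_values(model, x, baseline):
    """Exact Shapley values of record x relative to a baseline record.

    Features outside a coalition are set to their baseline value.
    The cost grows as 2^n, so this only suits a handful of features.
    """
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = {g: x[g] if (g in subset or g == f) else baseline[g]
                          for g in FEATURES}
                without_f = {g: x[g] if g in subset else baseline[g]
                             for g in FEATURES}
                total += weight * (model(with_f) - model(without_f))
        phi[f] = total
    return phi


def global_ranking(model, records, baseline):
    """Rank features by mean absolute Shapley value over all records."""
    totals = {f: 0.0 for f in FEATURES}
    for x in records:
        for f, v in shapley_values(model, x, baseline).items():
            totals[f] += abs(v)
    means = [(f, t / len(records)) for f, t in totals.items()]
    return sorted(means, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    baseline = {f: 0 for f in FEATURES}
    record = {f: 1 for f in FEATURES}
    print(shapley_values(toy_model, record, baseline))   # local explanation
    records = [record,
               {"Month_of_Year": 1, "Cause_of_Accident": 0,
                "Driver_Age": 1, "Collision_Type": 0}]
    print(global_ranking(toy_model, records, baseline))  # global ranking
```

The local values sum to the difference between the model output for the record and for the baseline, which is the additivity property that gives the method its name. Libraries such as SHAP use faster tree-specific algorithms rather than this brute-force enumeration.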
Pages: 23
Related papers
50 in total
  • [31] Traffic Speed Prediction of Urban Road Network Based on High Importance Links Using XGB and Shapley Additive Explanation
    Lee, Eun Hak
    IEEE ACCESS, 2023, 11 : 113217 - 113226
  • [32] An Automated Approach for Predicting Road Traffic Accident Severity Using Transformer Learning and Explainable AI Technique
    Aboulola, Omar Ibrahim
    Alabdulqader, Ebtisam Abdullah
    Alarfaj, Aisha Ahmed
    Alsubai, Shtwai
    Kim, Tai-Hoon
    IEEE ACCESS, 2024, 12 : 61062 - 61072
  • [33] Predicting pedestrian crash occurrence and injury severity in Texas using tree-based machine learning models
    Zhao, Bo
    Zuniga-Garcia, Natalia
    Xing, Lu
    Kockelman, Kara M.
    TRANSPORTATION PLANNING AND TECHNOLOGY, 2024, 47 (08) : 1205 - 1226
  • [34] Predicting and analyzing injury severity: A machine learning-based approach using class-imbalanced proactive and reactive data
    Sarkar, Sobhan
    Pramanik, Anima
    Maiti, J.
    Reniers, Genserik
    SAFETY SCIENCE, 2020, 125
  • [36] Interpretable prediction of acute respiratory infection disease among under-five children in Ethiopia using ensemble machine learning and Shapley additive explanations (SHAP)
    Tadese, Zinabu Bekele
    Hailu, Debela Tsegaye
    Abebe, Aschale Wubete
    Kebede, Shimels Derso
    Walle, Agmasie Damtew
    Seifu, Beminate Lemma
    Nimani, Teshome Demis
    DIGITAL HEALTH, 2024, 10
  • [37] Machine learning-based heat deflection temperature prediction and effect analysis in polypropylene composites using catboost and shapley additive explanations
    Joo, Chonghyo
    Park, Hyundo
    Lim, Jongkoo
    Cho, Hyungtae
    Kim, Junghwan
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 126
  • [38] Assessing the Suitability of Boosting Machine-Learning Algorithms for Classifying Arsenic-Contaminated Waters: A Novel Model-Explainable Approach Using SHapley Additive exPlanations
    Ibrahim, Bemah
    Ewusi, Anthony
    Ahenkorah, Isaac
    WATER, 2022, 14 (21)
  • [39] Evaluating the relevance of eggshell and glass powder for cement-based materials using machine learning and SHapley Additive exPlanations (SHAP) analysis
    Amin, Muhammad Nasir
    Ahmad, Waqas
    Khan, Kaffayatullah
    Nazar, Sohaib
    Abu Arab, Abdullah Mohammad
    Deifalla, Ahmed Farouk
    CASE STUDIES IN CONSTRUCTION MATERIALS, 2023, 19
  • [40] Developing machine-learning-based models to diminish the severity of injuries sustained by pedestrians in road traffic incidents
    Elalouf, Amir
    Birfir, Slava
    Rosenbloom, Tova
    HELIYON, 2023, 9 (11)