Predicting and Analyzing Road Traffic Injury Severity Using Boosting-Based Ensemble Learning Models with SHapley Additive exPlanations

Cited by: 49
Authors
Dong, Sheng [1]
Khattak, Afaq [2]
Ullah, Irfan [3]
Zhou, Jibiao [4]
Hussain, Arshad [5]
Affiliations
[1] Ningbo Univ Technol, Sch Civil & Transportat Engn, Fenghua Rd 201, Ningbo 315211, Peoples R China
[2] Tongji Univ, Minist Educ, Key Lab Rd & Traff Engn, 4800 Caoan Rd, Shanghai 201804, Peoples R China
[3] Int Islamic Univ, Dept Civil Engn, Sect H-10, Islamabad 1243, Pakistan
[4] Tongji Univ, Coll Transportat Engn, 4800 Caoan Rd, Shanghai 201804, Peoples R China
[5] Natl Univ Sci & Technol, NUST Inst Civil Engn, Sect H-12, Islamabad 44000, Pakistan
Funding
National Natural Science Foundation of China
Keywords
traffic safety; road traffic injuries; boosting-based ensemble models; SHapley Additive exPlanations; DECISION TREE; CRASH COUNTS; DRIVERS; CLASSIFICATION; ACCIDENTS; TIME;
DOI
10.3390/ijerph19052925
Chinese Library Classification
X [Environmental Science, Safety Science]
Discipline classification codes
08; 0830
Abstract
Road traffic accidents are one of the world's most serious problems, causing numerous fatalities and injuries as well as economic losses each year. Assessing the factors that contribute to the severity of road traffic injuries has proven insightful: the findings may contribute to a better understanding, and potential mitigation, of the risk of serious injuries associated with crashes. While ensemble learning approaches can capture complex, non-linear relationships between input risk variables and outcomes for injury severity prediction and classification, most of them share a critical limitation: their "black-box" nature. To develop interpretable predictive models for road traffic injury severity, this paper proposes four boosting-based ensemble learning models, namely a novel Natural Gradient Boosting, Adaptive Gradient Boosting, Categorical Gradient Boosting, and Light Gradient Boosting Machine (LightGBM), and uses a recently developed SHapley Additive exPlanations (SHAP) analysis to rank the risk variables and explain the optimal model. Among the four models, LightGBM achieved the highest classification accuracy (73.63%), precision (72.61%), recall (70.09%), F1-score (70.81%), and AUC (0.71) when tested on 2015-2019 accident data from Pakistan's National Highway N-5 (Peshawar to Rahim Yar Khan section). By incorporating the SHAP approach, we were able to interpret the model's estimation results from both global and local perspectives. The interpretation showed that Month_of_Year, Cause_of_Accident, Driver_Age, and Collision_Type all played a significant role in the estimation process. According to the analysis, young drivers and pedestrians struck by a trailer have a higher risk of suffering fatal injuries. The combination of trailers and passenger vehicles, as well as at-fault drivers, pedestrian strikes, and rear-end collisions, significantly increases the risk of fatal injuries.
This study suggests that combining LightGBM and SHAP has the potential to develop an interpretable model for predicting road traffic injury severity.
Pages: 23