Predicting and Analyzing Road Traffic Injury Severity Using Boosting-Based Ensemble Learning Models with SHAPley Additive exPlanations

Cited by: 49

Authors:

Dong, Sheng [1]

Khattak, Afaq [2]

Ullah, Irfan [3]

Zhou, Jibiao [4]

Hussain, Arshad [5]

Affiliations:

[1] Ningbo Univ Technol, Sch Civil & Transportat Engn, Fenghua Rd 201, Ningbo 315211, Peoples R China

[2] Tongji Univ, Minist Educ, Key Lab Rd & Traff Engn, 4800 Caoan Rd, Shanghai 201804, Peoples R China

[3] Int Islamic Univ, Dept Civil Engn, Sect H-10, Islamabad 1243, Pakistan

[4] Tongji Univ, Coll Transportat Engn, 4800 Caoan Rd, Shanghai 201804, Peoples R China

[5] Natl Univ Sci & Technol, NUST Inst Civil Engn, Sect H-12, Islamabad 44000, Pakistan

Source:

INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH | 2022 / Vol. 19 / Issue 05

Funding:

National Natural Science Foundation of China;

Keywords:

traffic safety; road traffic injuries; boosting-based ensemble models; SHapley Additive exPlanations; DECISION TREE; CRASH COUNTS; DRIVERS; CLASSIFICATION; ACCIDENTS; TIME;

DOI:

10.3390/ijerph19052925

Chinese Library Classification (CLC) number:

X [Environmental Science, Safety Science];

Discipline classification code:

08 ; 0830 ;

Abstract:

Road traffic accidents are one of the world's most serious problems, as they result in numerous fatalities and injuries, as well as economic losses, each year. Assessing the factors that contribute to the severity of road traffic injuries has proven to be insightful. The findings may contribute to a better understanding of, and potential mitigation of, the risk of serious injuries associated with crashes. While ensemble learning approaches are capable of establishing complex and non-linear relationships between input risk variables and outcomes for the purpose of injury severity prediction and classification, most of them share a critical limitation: their "black-box" nature. To develop interpretable predictive models for road traffic injury severity, this paper proposes four boosting-based ensemble learning models, namely a novel Natural Gradient Boosting, Adaptive Gradient Boosting, Categorical Gradient Boosting, and Light Gradient Boosting Machine, and uses a recently developed SHapley Additive exPlanations analysis to rank the risk variables and explain the optimal model. Among the four models, LightGBM achieved the highest classification accuracy (73.63%), precision (72.61%), recall (70.09%), F1-score (70.81%), and AUC (0.71) when tested on 2015-2019 accident data from Pakistan's National Highway N-5 (Peshawar to Rahim Yar Khan section). By incorporating the SHapley Additive exPlanations approach, we were able to interpret the model's estimation results from both global and local perspectives. Following interpretation, it was determined that Month_of_Year, Cause_of_Accident, Driver_Age, and Collision_Type all played a significant role in the estimation process. According to the analysis, young drivers and pedestrians struck by a trailer have a higher risk of suffering fatal injuries. The combination of trailers and passenger vehicles, as well as at-fault drivers, pedestrian strikes, and rear-end collisions, significantly increases the risk of fatal injuries. This study suggests that combining LightGBM and SHAP has the potential to develop an interpretable model for predicting road traffic injury severity.
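
The local attribution idea behind SHapley Additive exPlanations can be illustrated with a minimal sketch. The code below is not the paper's pipeline: it computes exact Shapley values by brute-force enumeration for a hypothetical toy risk function, whereas the abstract describes applying SHAP to a fitted LightGBM model. The function `toy_risk`, its coefficients, and the binary feature names `young_driver`, `pedestrian_hit`, and `trailer_involved` are assumptions made up for illustration only.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance relative to a baseline.

    Features outside a coalition are set to their baseline value.
    Cost grows exponentially with the number of features, so this
    is only practical for very small toy examples.
    """
    names = list(instance)
    n = len(names)
    phi = {}
    for target in names:
        others = [f for f in names if f != target]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without = {
                    f: (instance[f] if f in coalition else baseline[f])
                    for f in names
                }
                with_target = dict(without)
                with_target[target] = instance[target]
                total += weight * (predict(with_target) - predict(without))
        phi[target] = total
    return phi


def toy_risk(x):
    """Hypothetical risk score (NOT the paper's fitted model)."""
    score = 0.1
    score += 0.3 * x["young_driver"]
    score += 0.2 * x["pedestrian_hit"]
    # Assumed interaction term, included only to show how credit is shared
    score += 0.25 * x["young_driver"] * x["trailer_involved"]
    return score


baseline = {"young_driver": 0, "pedestrian_hit": 0, "trailer_involved": 0}
instance = {"young_driver": 1, "pedestrian_hit": 1, "trailer_involved": 1}

phi = shapley_values(toy_risk, instance, baseline)
# Rank features by absolute contribution for this single instance
ranking = sorted(phi, key=lambda f: abs(phi[f]), reverse=True)
print(phi)
print(ranking)
```

In this toy case the contributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is split equally between the two features involved. Averaging absolute contributions over many instances would give a global ranking in the same spirit as the one the abstract reports.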

Pages: 23

