Self-Paced Ensemble-SHAP Approach for the Classification and Interpretation of Crash Severity in Work Zone Areas

Cited by: 8
Authors
Asadi, Roksana [1]
Khattak, Afaq [2]
Vashani, Hossein [3]
Almujibah, Hamad R. [4]
Rabie, Helia [5]
Asadi, Seyedamirhossein [6]
Dimitrijevic, Branislav [1]
Affiliations
[1] New Jersey Inst Technol, Dept Civil & Environm Engn, Newark, NJ 07102 USA
[2] Tongji Univ, Key Lab Infrastruct Durabil & Operat Safety Airfie, 4800 Caoan Rd, Shanghai 201804, Peoples R China
[3] Rutgers State Univ, Rutgers Business Sch, Newark, NJ 07102 USA
[4] Taif Univ, Coll Engn, Dept Civil Engn, Taif City 21974, Saudi Arabia
[5] CUNY, Grad Ctr, Dept Econ, New York, NY 10016 USA
[6] KN Toosi Univ Technol, Dept Civil Engn, Tehran 1543319967, Iran
Keywords
work zones crashes; machine learning; self-paced ensemble; Shapley additive explanations; INJURY SEVERITY; REGRESSION; DRIVERS; RISK;
DOI
10.3390/su15119076
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Identifying the causative factors of work zone crashes and implementing measures to mitigate them can significantly improve overall road safety. This study introduces a Self-Paced Ensemble (SPE) framework, used in conjunction with the Shapley additive explanations (SHAP) interpretation system, to predict and interpret the severity of work-zone-related crashes. SPE is an ensemble learning approach designed to mitigate class imbalance in large datasets. It offers an intuitive way to handle imbalanced classes, with high computational efficiency, good accuracy, and broad adaptability to various machine learning models. The study used work zone crash data from the state of New Jersey covering two years (2017 and 2018) to train and evaluate the model. The prediction outcomes of the SPE model were compared with those of tree-based machine learning models (Light Gradient Boosting Machine, adaptive boosting, and classification and regression tree) and with binary logistic regression. The SPE model outperformed both the tree-based models and binary logistic regression. According to the SHAP interpretation, the most influential variables were crash type, road system, and road median type. The model indicates that, on highways with barrier-type medians, same-direction and right-angle crashes are expected to be the most severe. The study also found that severe injuries were more likely to result from work zone crashes that happened at night on state highways with localized street lighting.
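The abstract does not spell out how SPE handles class imbalance, so the following is only a minimal sketch of the self-paced under-sampling step as it is commonly described for SPE in general, not the authors' implementation. It assumes majority-class samples are grouped into equal-width hardness bins, that hardness is a value in [0, 1] (for example, the absolute error of the current ensemble's predicted probability), and that each bin is weighted by 1 / (mean hardness + alpha), with alpha growing as tan(pi/2 * iteration / n_iterations). The function names, the default bin count, and the small `eps` guard are hypothetical choices made for illustration.

```python
import math
import random


def bin_quotas(hardness, n_minority, iteration, n_iterations, k_bins=5):
    """Group majority samples by hardness and decide how many to draw per bin.

    Assumes `hardness` is a non-empty list of values in [0, 1].
    """
    # Group majority-class sample indices into k equal-width hardness bins.
    bins = [[] for _ in range(k_bins)]
    for idx, h in enumerate(hardness):
        bins[min(int(h * k_bins), k_bins - 1)].append(idx)

    # Self-paced factor: 0 in the first round, very large in the last one.
    alpha = math.tan(math.pi / 2 * iteration / n_iterations)

    # Unnormalised weight of each non-empty bin: 1 / (mean hardness + alpha).
    eps = 1e-12  # hypothetical guard against division by zero
    weights = []
    for members in bins:
        if members:
            mean_h = sum(hardness[i] for i in members) / len(members)
            weights.append(1.0 / (mean_h + alpha + eps))
        else:
            weights.append(0.0)

    # Share out n_minority draws by weight, capped by each bin's size.
    total = sum(weights)
    quotas = [min(len(members), round(n_minority * w / total))
              for members, w in zip(bins, weights)]
    return bins, quotas


def self_paced_undersample(hardness, n_minority, iteration, n_iterations,
                           k_bins=5, seed=0):
    """Return indices of majority samples kept for one training round."""
    rng = random.Random(seed)
    bins, quotas = bin_quotas(hardness, n_minority, iteration,
                              n_iterations, k_bins)
    selected = []
    for members, quota in zip(bins, quotas):
        selected.extend(rng.sample(members, quota))
    return selected
```

With 90 easy and 10 hard majority samples, early rounds draw mostly from the easy bin, while later rounds spread the draws evenly across bins, so the rarer hard samples make up a larger share of the training subset.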
Pages: 23