Predicting intersection crash frequency using connected vehicle data: A framework for geographical random forest

Cited by: 40
Authors
Gu, Yangsong [1]
Liu, Diyi [1]
Arvin, Ramin [1]
Khattak, Asad J. [1]
Han, Lee D. [1]
Affiliations
[1] Univ Tennessee, Dept Civil & Environm Engn, Knoxville, TN 37996 USA
Keywords
Connected vehicle; Crash frequency prediction; Geographical random forest; Variable importance; NEGATIVE BINOMIAL REGRESSION; WEIGHTED REGRESSION; SAFETY; MODEL; VOLATILITY
DOI
10.1016/j.aap.2022.106880
Chinese Library Classification (CLC) number
TB18 [Ergonomics]
Discipline classification code
1201
Abstract
Accurate crash frequency prediction is critical for proactive safety management. The emerging connected vehicles technology provides us with a wealth of vehicular motion data, which enables a better connection between crash frequency and driving behaviors. However, appropriately dealing with the spatial dependence of crash frequency and multitudinous driving features has been a difficult but critical challenge in the prediction process. To this end, this study aims to investigate a new Artificial Intelligence technique called Geographical Random Forest (GRF) that can address spatial heterogeneity and retain all potential predictors. By harnessing more than 2.2 billion high-resolution connected vehicle Basic Safety Message (BSM) observations from the Safety Pilot Model Deployment in Ann Arbor, MI, 30 indicators of driving volatility are extracted, including speed, longitudinal and lateral acceleration, and yaw rate. The developed GRF was implemented to predict rear-end crash frequency at intersections. The results show that: 1) rear-end crashes are more likely to happen at intersections connecting minor roads compared to major roads; 2) a higher number of hard acceleration and deceleration events beyond two standard deviations in the longitudinal direction is a leading indicator of rear-end crashes; 3) the optimal GRF significantly outperforms Global Random Forest, with a 9% lower test error and a substantially better fit; and 4) geographical visualization of variable importance highlights the presence of spatial non-stationarity. The proposed framework can proactively identify at-risk intersections and alert drivers when leading indicators of driving volatility tend to worsen.
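The second finding refers to counts of longitudinal acceleration events beyond two standard deviations. The abstract does not give the exact definition used in the study, so the following is only a minimal sketch under stated assumptions: the threshold is taken as two population standard deviations around the sample mean at one intersection, and the function name `count_beyond_k_sd` and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def count_beyond_k_sd(values, k=2.0):
    """Count observations lying more than k standard deviations from the mean.

    Assumption: the mean and (population) standard deviation are computed
    from the same sample that is being screened.
    """
    if len(values) < 2:
        return 0
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # No variation, so nothing can exceed the threshold.
        return 0
    return sum(1 for v in values if abs(v - mu) > k * sd)


# Hypothetical longitudinal acceleration samples (m/s^2) at one intersection;
# the last value stands in for a hard deceleration.
accel = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.1, 0.0, -4.0]
n_events = count_beyond_k_sd(accel)
print(n_events)
```

A count like this, computed per intersection, is the kind of feature that could then be passed to a crash frequency model; how the study aggregates BSM records before this step is not stated in the abstract.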
Pages: 12