Investigating macro-level hotzone identification and variable importance using big data: A random forest models approach

Cited by: 44
Authors
Jiang, Ximiao [1]
Abdel-Aty, Mohamed [2]
Hu, Jia [1]
Lee, Jaeyoung [2]
Affiliations
[1] Fed Highway Adm, Off Operat R&D, Mclean, VA 22101 USA
[2] Univ Cent Florida, Dept Civil Environm & Construct Engn, Orlando, FL 32816 USA
Keywords
Hotzone identification; Big data; Connected Vehicle; Variable importance; Random forest; Wilcoxon test; TRAFFIC ACCIDENTS; SPATIAL-ANALYSIS; INJURY SEVERITY; SAFETY ANALYSIS; ROAD CRASHES; LAND-USE; CLASSIFICATION; LEVEL; HETEROGENEITY; COLLISIONS
DOI
10.1016/j.neucom.2015.08.097
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
As Connected Vehicle technologies begin to be deployed along roadway networks, they will provide massive amounts of data. This big data can be useful in identifying hazardous zones, a task that can be complicated and unreliable today. Without sufficient data, past studies had to focus mostly on micro-level networks. Research on macro-level hotzone identification is limited, and the contribution of various macroscopic features to macro-level crash risks is still in dispute. This paper, with the help of a massive amount of data, investigates the feasibility of using random forests for hotzone identification at the macro level, specifically the Traffic Analysis Zone (TAZ) level. At the same time, the most influential macro-level crash risk determinants were identified by applying a series of random forest models in combination with cross-validation. The differences in all features between hotzones and normal TAZs were also tested using Wilcoxon tests. Crash data from three counties in Florida during 2008 and 2009 were employed. Crash risks by different injury levels and collision types were investigated separately. Finally, the significance of various macroscopic variables was determined for each type of crash risk using variable importance analysis. The results suggest that the distribution of the road network and socioeconomics are the two most important factors when proactively alleviating traffic safety issues. For developed urban areas, it is desirable to formulate specific traffic safety management strategies that account for zone-level socioeconomics and the development of road infrastructure. For zones with a higher percentage of school enrollment, pedestrian- and bicycle-friendly roadway system designs are most beneficial. It is also desirable to take efficient countermeasures, such as law enforcement and driving school training, to regulate young drivers' behavior in school zones. For areas with a high share of minority residents, there might be a need for awareness campaigns in multiple languages to relieve pedestrian safety issues. Finally, additional attention should be paid to improving intersection design and management during the planning and operation processes. Published by Elsevier B.V.
Pages: 53-63
Page count: 11
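
The abstract's comparison of features between hotzones and normal TAZs can be illustrated with a small sketch. The abstract does not say which Wilcoxon variant was used; the code below assumes the rank-sum form for two independent groups, uses a normal approximation with tie correction and no continuity correction, and runs on made-up feature values. The function name `rank_sum_test` and the sample numbers are hypothetical and are not taken from the paper.

```python
import math
from statistics import NormalDist


def rank_sum_test(group_a, group_b):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (rank sum of group_a, z statistic, p-value).
    Illustrative sketch only; not the paper's implementation.
    """
    # Pool both samples, remembering which group each value came from.
    pooled = sorted(
        [(v, 0) for v in group_a] + [(v, 1) for v in group_b],
        key=lambda pair: pair[0],
    )
    n_a, n_b = len(group_a), len(group_b)
    n = n_a + n_b

    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of equal values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # The run occupies ranks i+1 .. j; tied values share the average rank.
        avg_rank = (i + 1 + j) / 2.0
        run_length = j - i
        tie_term += run_length**3 - run_length
        members_in_a = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_a += avg_rank * members_in_a
        i = j

    mean_w = n_a * (n + 1) / 2.0
    var_w = n_a * n_b / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_w <= 0:
        raise ValueError("all observations are identical; test is undefined")

    z = (rank_sum_a - mean_w) / math.sqrt(var_w)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return rank_sum_a, z, p_value


# Hypothetical values of one zone-level feature (e.g. intersections per mile).
hotzone_values = [4.1, 5.3, 6.0, 5.8, 4.9]
normal_values = [2.2, 3.1, 2.8, 3.9, 3.5]

w, z, p = rank_sum_test(hotzone_values, normal_values)
print(f"rank sum = {w}, z = {z:.3f}, p = {p:.4f}")
```

With samples this small an exact test would normally be preferred over the normal approximation; the approximation is used here only to keep the sketch short and dependency-free.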