Vehicle involvements in hydroplaning crashes: Applying interpretable machine learning

Cited by: 11
Authors
Das, Subasish [1]
Dutta, Anandi [2]
Dey, Kakan [3]
Jalayer, Mohammad [4]
Mudgal, Abhisek [5]
Affiliations
[1] Texas A&M Transportat Inst, 1111 Rellis Pkwy, Bryan, TX 77807 USA
[2] Univ Texas San Antonio, Dept Comp Sci, One UTSA Circle, San Antonio, TX 78249 USA
[3] West Virginia Univ, Civil & Environm Engn, 1374 Evansdale Dr, Morgantown, WV 26506 USA
[4] Rowan Univ, Dept Civil & Environm Engn, Glassboro, NJ 08028 USA
[5] Indian Inst Technol BHU, Dept Civil Engn, Varanasi 221005, Uttar Pradesh, India
Keywords
Hydroplaning crash; Crash narrative; Machine learning; Precision; Recall; Narrative text analysis; Injury
DOI
10.1016/j.trip.2020.100176
Chinese Library Classification (CLC) number
U [Transportation]
Discipline classification code
08; 0823
Abstract
Although hydroplaning is a major contributor to roadway crashes, it is not typically reported in conventional crash databases. Hence, a framework to classify various crash attributes from police reports and to identify hydroplaning crashes is strongly needed. This study applied natural language processing (NLP) tools to seven years (2010-2016) of crash data from the Louisiana traffic crash database to identify hydroplaning-related crashes. This research focused on the development of a framework to apply interpretable machine learning models to unstructured textual content in order to classify the number of vehicle involvements in a crash. This approach evaluated the effectiveness of keywords in determining the classification. This study used three machine learning algorithms. Of these algorithms, the eXtreme Gradient Boosting (XGBoost) model was found to be the most effective classifier. This research provided a platform to understand the application of interpretability in machine learning models. The outcomes of this study prove that underlying trends or precursors can be revealed and analyzed through these models. Furthermore, this indicates that quantitative modeling techniques can be used to address safety concerns.
Pages: 10