Data-theoretic approach for socio-technical risk analysis: Text mining licensee event reports of US nuclear power plants

Cited by: 20
Authors
Pence, Justin [1, 2, 3, 5]
Farshadmanesh, Pegah [1, 2]
Kim, Jinmo [1, 6]
Blake, Cathy [1, 5, 6]
Mohaghegh, Zahra [1, 2, 3, 4, 5, 6]
Affiliations
[1] Univ Illinois, Urbana, IL 61801 USA
[2] Univ Illinois, IAP, Sociotech Risk Anal SoTeRiA, Urbana, IL 61801 USA
[3] Univ Illinois, Beckman Inst Adv Sci & Technol, Urbana, IL 61801 USA
[4] Univ Illinois, Dept Nucl Plasma & Radiol Engn, Urbana, IL 61801 USA
[5] Univ Illinois, Illinois Informat Inst, Urbana, IL 61801 USA
[6] Univ Illinois, Sch Informat Sci, Urbana, IL USA
Funding
U.S. National Science Foundation;
Keywords
Machine learning; Text mining; Probabilistic risk assessment; Organizational factors; INCORPORATING ORGANIZATIONAL-FACTORS; RELIABILITY; MODEL; PRA; AGREEMENT; PERFORMANCE; TECHNOLOGY; ACCIDENTS; KNOWLEDGE; MACHINE;
DOI
10.1016/j.ssci.2019.104574
Chinese Library Classification (CLC) number
T [Industrial Technology];
Discipline classification code
08;
Abstract
This paper is a product of a line of research that uses the Socio-Technical Risk Analysis (SoTeRiA) theoretical framework and Integrated PRA (I-PRA) methodological framework to theorize and quantify underlying organizational mechanisms contributing to socio-technical system risk scenarios. I-PRA has an input module that executes the Data-Theoretic (DT) approach, where "data analytics" can be guided by "theory." The DT input module of I-PRA has two sub-modules: (1) DT-BASE, for developing detailed grounded theory-based causal relationships in SoTeRiA, equipped with a software-supported BASEline quantification utilizing information extracted from academic articles, industry procedures, and regulatory standards, and (2) DT-SITE, using data analytics to refine and measure the causal factors of SoTeRiA based on industry event databases and using Bayesian analysis to update the baseline quantification. This paper focuses on the advancement of DT-SITE, contributing to the integration of text mining with the measurement of organizational factors for PRA, and demonstrating the following methodological elements and steps in DT-SITE: (Element 2.1) Text mining: (Step i) collect and pre-process unstructured text data, (Step ii) identify theory-based seed terms based on DT-BASE causal model, (Step iii) generate features, and (Step iv) build and evaluate classifiers (e.g., by using Support Vector Machine [SVM]); and (Element 2.2) Estimating probabilities and their associated uncertainties. The DT-SITE methodology is applied in a case study targeting the "training system" in Nuclear Power Plants (NPPs) and using Licensee Event Reports (LERs) from the U.S. nuclear power industry, where LER-specific data extraction and pre-processing tools are developed.
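The abstract says that DT-SITE uses Bayesian analysis to update the baseline quantification and to estimate probabilities with their uncertainties, but it does not state which statistical model is used. As a minimal sketch only, assuming a conjugate Beta-binomial model (an assumption, not the authors' stated method), the update could look like the following. The function names and every number are hypothetical and chosen purely for illustration.

```python
from math import sqrt


def beta_update(alpha, beta, hits, trials):
    """Conjugate Beta-binomial update (assumed model, not from the paper).

    alpha, beta: parameters of the Beta prior (the "baseline" estimate).
    hits: number of reports a classifier flags as involving the factor.
    trials: total number of reports examined.
    Returns the posterior (alpha, beta).
    """
    if not 0 <= hits <= trials:
        raise ValueError("hits must lie between 0 and trials")
    return alpha + hits, beta + (trials - hits)


def beta_mean_sd(alpha, beta):
    """Mean and standard deviation of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    mean = alpha / total
    variance = alpha * beta / (total * total * (total + 1))
    return mean, sqrt(variance)


# Hypothetical baseline prior with mean 2 / (2 + 38) = 0.05.
prior = (2.0, 38.0)

# Hypothetical evidence: 12 of 160 event reports flagged by a text classifier.
posterior = beta_update(*prior, hits=12, trials=160)

prior_mean, prior_sd = beta_mean_sd(*prior)
post_mean, post_sd = beta_mean_sd(*posterior)
print(f"prior:     mean={prior_mean:.3f}, sd={prior_sd:.3f}")
print(f"posterior: mean={post_mean:.3f}, sd={post_sd:.3f}")
```

In this sketch the posterior standard deviation shrinks as more reports are examined, which is one simple way to express the uncertainty attached to an updated probability estimate.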
Pages: 21