LLTFI: Framework Agnostic Fault Injection for Machine Learning Applications (Tools and Artifact Track)

Cited by: 11
Authors
Agarwal, Udit Kumar [1]
Chan, Abraham [1]
Pattabiraman, Karthik [1]
Affiliations
[1] Univ British Columbia, Vancouver, BC, Canada
Source
2022 IEEE 33RD INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING (ISSRE 2022) | 2022
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Error resilience; Machine learning; Testing; ERROR-DETECTION; DUPLICATION; COMPILER;
DOI
10.1109/ISSRE55969.2022.00036
Chinese Library Classification number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
As machine learning (ML) has become more prevalent across many critical domains, so has the need to understand ML applications' resilience. While prior work like TensorFI [1], MindFI [2], and PyTorchFI [3] has focused on building ML fault injectors for specific ML frameworks, there has been little work on performing fault injection (FI) for ML applications written in multiple frameworks. We present LLTFI, a framework-agnostic fault injection tool for ML applications, allowing users to run FI experiments on ML applications at the LLVM IR level. LLTFI provides users with finer FI granularity at the level of instructions, and a better understanding of how faults manifest and propagate between different ML components. We evaluate LLTFI on six ML programs and compare it with TensorFI. We found significant differences in the Silent Data Corruption (SDC) rates for similar faults between the two tools. Finally, we use LLTFI to evaluate the efficacy of selective instruction duplication - an error mitigation technique - for ML programs.
Pages: 286-296
Page count: 11
Related papers
47 in total
[1]  
Abadi M, 2016, PROCEEDINGS OF OSDI'16: 12TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION, P265
[2]  
[Anonymous], 1957, An interpersonal diagnosis of personality
[3]  
[Anonymous], COMMA AI OPENPILOT O
[4]  
Bai J., 2019, ONNX: Open Neural Network Exchange
[5]  
Baidu Apollo team, 2017, Apollo: Open Source Autonomous Driving
[6]  
Beyer M, 2020, PROC ESREL 20
[7]  
Bojarski M, 2016, Arxiv, DOI [arXiv:1604.07316, DOI 10.48550/ARXIV.1604.07316]
[8]  
Chan, Abraham; Agarwal, Udit Kumar; Pattabiraman, Karthik. (WiP) LLTFI: Low-Level Tensor Fault Injector [J]. 2021 IEEE INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING WORKSHOPS (ISSREW 2021), 2021: 64-68
[9]  
Chen S, 2021, LABELED CAR DRIVING
[10]  
Chen TQ, 2018, PROCEEDINGS OF THE 13TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION, P579