Testing Deep Learning Models: A First Comparative Study of Multiple Testing Techniques

Cited by: 3
Authors
Ahuja, Mohit Kumar [1]
Gotlieb, Arnaud [1]
Spieker, Helge [1]
Affiliations
[1] Simula Research Laboratory, Oslo, Norway
Source
2022 IEEE 15TH INTERNATIONAL CONFERENCE ON SOFTWARE TESTING, VERIFICATION AND VALIDATION WORKSHOPS (ICSTW 2022) | 2022
Keywords
DOI
10.1109/ICSTW55395.2022.00035
CLC Number
TP31 [Computer Software];
Subject Classification Codes
081202; 0835
Abstract
Deep Learning (DL) has revolutionized the capabilities of vision-based systems (VBS) in critical applications such as autonomous driving, robotic surgery, critical infrastructure surveillance, and air and maritime traffic control. By analyzing images, voice, videos, or other complex signals, DL has considerably increased the situational awareness of these systems. At the same time, as VBS rely more and more on trained DL models, their reliability and robustness have been challenged, and it has become crucial to test these models thoroughly to assess their capabilities and potential errors. To discover faults in DL models, existing software testing methods have been adapted and refined accordingly. In this article, we provide an overview of these software testing methods, namely differential, metamorphic, mutation, and combinatorial testing, as well as adversarial perturbation testing, and review some challenges in their deployment for boosting perception systems used in VBS. We also provide a first experimental comparative study on a classical benchmark used in VBS and discuss its results.
Pages: 130 - 137
Page count: 8
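
As a concrete illustration of one of the techniques surveyed in the abstract, the sketch below outlines a minimal metamorphic test for an image classifier: horizontally flipping an input should not change the predicted class, so any disagreement between the original and follow-up predictions flags a potential fault without requiring ground-truth labels. The model_predict stub and the random test images are hypothetical placeholders for illustration only; they are not the models, datasets, or benchmark used in the paper.

# Minimal metamorphic-testing sketch for an image classifier (illustrative only).
import numpy as np

def model_predict(image: np.ndarray) -> int:
    # Hypothetical placeholder classifier; a real test would call the DL model under test.
    return int(image.mean() > 0.5)

def metamorphic_flip_test(images: list) -> list:
    """Metamorphic relation: a horizontal flip must not change the predicted class.
    Returns the indices of inputs that violate the relation."""
    violations = []
    for i, img in enumerate(images):
        original = model_predict(img)
        follow_up = model_predict(np.flip(img, axis=1))  # horizontally flipped follow-up input
        if original != follow_up:
            violations.append(i)
    return violations

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    test_images = [rng.random((32, 32, 3)) for _ in range(10)]
    print("Inputs violating the flip relation:", metamorphic_flip_test(test_images))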