Test Case Selection for Neural Network via Data Mutation

Cited by: 0
Authors
Cao, Xue-Jie [1 ]
Chen, Jun-Jie [1 ]
Yan, Ming [1 ]
You, Han-Mo [1 ]
Wu, Zhuo [2 ]
Wang, Zan [1 ,2 ]
Affiliations
[1] College of Intelligence and Computing, Tianjin University, Tianjin
[2] School of New Media and Communication, Tianjin University, Tianjin
Source
Ruan Jian Xue Bao/Journal of Software | 2024 / Vol. 35 / No. 11
Keywords
data mutation; deep learning; software testing; test case selection
DOI
10.13328/j.cnki.jos.007005
Abstract
Deep neural networks (DNNs) are now widely used in autonomous driving, medical diagnosis, speech recognition, face recognition, and other safety-critical fields, so testing is critical to ensuring DNN quality. However, labeling test cases to judge whether a DNN model's predictions are correct is costly. Selecting the test cases that reveal incorrect model behavior and labeling them first therefore helps developers debug DNN models sooner, improving the efficiency of DNN testing and safeguarding model quality. This study proposes DMS, a test case selection method based on data mutation. DMS designs and implements data mutation operators that generate mutated models to simulate model defects, and it captures the dynamic bug-revealing pattern of each test case in order to evaluate its bug-revealing ability. Experiments on 25 combinations of deep learning test sets and models show that DMS significantly outperforms existing test case selection methods in both the proportion of bug-revealing cases among the selected samples and the diversity of the bug-revealing directions they cover. Specifically, with the original test set as the candidate set, DMS filters out 53.85%–99.22% of all bug-revealing test cases when selecting 10% of the test cases, and when 5% of the test cases are selected, the chosen cases cover almost all bug-revealing directions. Compared with the eight baseline methods, DMS finds 12.38%–71.81% more bug-revealing cases on average, demonstrating its effectiveness for the test case selection task. © 2024 Chinese Academy of Sciences. All rights reserved.
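The core idea the abstract describes, ranking unlabeled test cases by how often mutated (deliberately defective) model variants disagree with the original model, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the toy linear "model", the Gaussian weight-perturbation operator, and the names `mutate_model`, `kill_score`, and `select_tests` are hypothetical stand-ins for DMS's data mutation operators and scoring.

```python
import random

def predict(weights, x):
    # Toy stand-in for a DNN classifier: the sign of a dot product
    # acts as the predicted label (hypothetical, illustration only).
    score = sum(w * xi for w, xi in zip(weights, x))
    return 1 if score >= 0 else 0

def mutate_model(weights, rng, scale=0.5):
    # Stand-in mutation operator: perturb each weight with Gaussian
    # noise to simulate a defective variant of the original model.
    return [w + rng.gauss(0.0, scale) for w in weights]

def kill_score(weights, mutants, x):
    # A test case whose prediction flips on many mutated models is
    # more likely to expose faulty behavior of the original model.
    base = predict(weights, x)
    return sum(1 for m in mutants if predict(m, x) != base)

def select_tests(weights, candidates, budget, n_mutants=20, seed=0):
    # Rank candidate test cases by kill score; keep the top `budget`
    # for labeling.
    rng = random.Random(seed)
    mutants = [mutate_model(weights, rng) for _ in range(n_mutants)]
    return sorted(candidates,
                  key=lambda x: kill_score(weights, mutants, x),
                  reverse=True)[:budget]

if __name__ == "__main__":
    weights = [1.0, -1.0]                    # toy "trained" model
    candidates = [[2.0, 1.9], [5.0, -5.0],   # near / far from boundary
                  [1.0, 1.1], [3.0, -3.1]]
    print(select_tests(weights, candidates, budget=2))
```

With a labeling budget of 2 out of 4 candidates, the sketch tends to pick the cases near the decision boundary, since small weight perturbations flip their predictions most often; the real method applies the same ranking idea with mutation operators designed for DNNs.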
Pages: 4973–4992
Page count: 19