Manifesting Bugs in Machine Learning Code: An Explorative Study with Mutation Testing

Cited by: 24
Authors
Cheng, Dawei [1 ]
Cao, Chun [1 ]
Xu, Chang [1 ]
Ma, Xiaoxing [1 ]
Affiliations
[1] Nanjing Univ, Inst Comp Software, State Key Lab Novel Software Technol, Nanjing, Peoples R China
Source
2018 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY (QRS 2018) | 2018
Funding
National Key R&D Program of China;
Keywords
machine learning programs; mutation testing; explorative study;
DOI
10.1109/QRS.2018.00044
Chinese Library Classification
TP31 [Computer Software];
Subject Classification Code
081202; 0835;
Abstract
Nowadays, statistical machine learning is widely adopted in various domains such as data mining, image recognition, and automated driving. However, software quality assurance for machine learning is still in its infancy. While recent efforts have focused on improving the quality of training data and trained models, this paper focuses on code-level bugs in the implementations of machine learning algorithms. In this explorative study we simulated program bugs by mutating Weka implementations of several classification algorithms. We observed that 8%-40% of the logically non-equivalent executable mutants were statistically indistinguishable from their golden versions. Moreover, another 15%-36% of the mutants were stubborn, in that they did not perform significantly worse than a reference classifier on at least one natural data set. We also experimented with several approaches to killing those stubborn mutants. Preliminary results indicate that bugs in machine learning code may negatively impact statistical properties such as robustness and learning curves, yet they can be very difficult to detect due to the lack of effective oracles.
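The statistical-indistinguishability criterion described in the abstract can be sketched in code. The sketch below is an illustrative assumption, not the paper's actual procedure: it compares hypothetical cross-validation accuracies of a golden classifier and a mutant with a Welch-style t statistic against a rough critical value, and declares the mutant "killed" only when the two are distinguishable.

```python
import math

def is_killed(golden_scores, mutant_scores, t_crit=2.0):
    """Sketch: a mutant is 'killed' when its cross-validation accuracies
    are statistically distinguishable from the golden version's.
    Uses a Welch-style t statistic with an illustrative critical value;
    the paper's actual statistical test may differ."""
    n1, n2 = len(golden_scores), len(mutant_scores)
    m1 = sum(golden_scores) / n1
    m2 = sum(mutant_scores) / n2
    # Sample variances (Bessel-corrected).
    v1 = sum((x - m1) ** 2 for x in golden_scores) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in mutant_scores) / (n2 - 1)
    t = abs(m1 - m2) / math.sqrt(v1 / n1 + v2 / n2)
    return t > t_crit

# Hypothetical cross-validation accuracies (not data from the paper):
golden = [0.91, 0.92, 0.90, 0.93, 0.91]  # unmutated "golden" classifier
broken = [0.72, 0.70, 0.74, 0.71, 0.73]  # mutant with a visible fault
subtle = [0.90, 0.92, 0.91, 0.92, 0.90]  # statistically indistinguishable mutant

print(is_killed(golden, broken))  # clearly degraded, so it is killed
print(is_killed(golden, subtle))  # indistinguishable, so it survives
```

A "stubborn" mutant in the paper's sense would be one whose scores are distinguishable from the golden version's yet still not significantly worse than a weak reference classifier's, which the same test could check against the reference's scores.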
Pages: 313-324
Number of pages: 12