Manifesting Bugs in Machine Learning Code: An Explorative Study with Mutation Testing

Cited by: 24
Authors
Cheng, Dawei [1]
Cao, Chun [1]
Xu, Chang [1]
Ma, Xiaoxing [1]
Affiliations
[1] Nanjing Univ, Inst Comp Software, State Key Lab Novel Software Technol, Nanjing, Peoples R China
Source
2018 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY (QRS 2018) | 2018
Funding
National Key Research and Development Program of China;
Keywords
machine learning programs; mutation testing; explorative study;
DOI
10.1109/QRS.2018.00044
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Statistical machine learning is now widely adopted in domains such as data mining, image recognition, and automated driving. However, software quality assurance for machine learning is still in its infancy. While recent efforts have gone into improving the quality of training data and trained models, this paper focuses on code-level bugs in the implementations of machine learning algorithms. In this explorative study, we simulated program bugs by mutating Weka implementations of several classification algorithms. We observed that 8%-40% of the logically non-equivalent executable mutants were statistically indistinguishable from their golden versions. Moreover, another 15%-36% of the mutants were stubborn: they did not perform significantly worse than a reference classifier on at least one natural data set. We also experimented with several approaches to killing those stubborn mutants. Preliminary results indicate that bugs in machine learning code may negatively affect statistical properties such as robustness and learning curves, but that they can be very difficult to detect because effective oracles are lacking.
Pages: 313-324
Number of pages: 12