Machine Learning Testing: Survey, Landscapes and Horizons

Citations: 418
Authors
Zhang, Jie M. [1]
Harman, Mark [2]
Ma, Lei [3]
Liu, Yang [4]
Affiliations
[1] UCL, CREST, London WC1E 6BT, England
[2] Facebook, London W1T 1FB, England
[3] Kyushu Univ, Fukuoka 8190395, Japan
[4] Nanyang Technol Univ, Singapore 639798, Singapore
Funding
National Research Foundation, Singapore
Keywords
Machine learning; software testing; deep neural network; COMPUTER-AIDED DIAGNOSIS; SYMBOLIC EXECUTION; SAMPLE-SIZE; CLASSIFIER; PERFORMANCE;
DOI
10.1109/TSE.2019.2962027
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
This paper provides a comprehensive survey of techniques for testing machine learning systems; Machine Learning Testing (ML testing) research. It covers 144 papers on testing properties (e.g., correctness, robustness, and fairness), testing components (e.g., the data, learning program, and framework), testing workflow (e.g., test generation and test evaluation), and application scenarios (e.g., autonomous driving, machine translation). The paper also analyses trends concerning datasets, research trends, and research focus, concluding with research challenges and promising research directions in ML testing.
Pages: 1-36
Page count: 36