Machine Learning Testing: Survey, Landscapes and Horizons

Cited by: 458

Authors:

Zhang, Jie M. [1]

Harman, Mark [2]

Ma, Lei [3]

Liu, Yang [4]

Affiliations:

[1] UCL, CREST, London WC1E 6BT, England

[2] Facebook, London W1T 1FB, England

[3] Kyushu Univ, Fukuoka 8190395, Japan

[4] Nanyang Technol Univ, Singapore 639798, Singapore

Source:

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING | 2022 / Vol. 48 / No. 01

Funding:

National Research Foundation, Singapore;

Keywords:

Machine learning; software testing; deep neural network; COMPUTER-AIDED DIAGNOSIS; SYMBOLIC EXECUTION; SAMPLE-SIZE; CLASSIFIER; PERFORMANCE;

DOI:

10.1109/TSE.2019.2962027

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Subject classification code:

081202 ; 0835 ;

Abstract:

This paper provides a comprehensive survey of techniques for testing machine learning systems, i.e., Machine Learning Testing (ML testing) research. It covers 144 papers on testing properties (e.g., correctness, robustness, and fairness), testing components (e.g., the data, learning program, and framework), testing workflow (e.g., test generation and test evaluation), and application scenarios (e.g., autonomous driving, machine translation). The paper also analyses trends concerning datasets, research trends, and research focus, concluding with research challenges and promising research directions in ML testing.


Pages: 1-36

Page count: 36

