Testing and validating machine learning classifiers by metamorphic testing

Cited by: 251
Authors
Xie, Xiaoyuan [1, 4, 5]
Ho, Joshua W. K. [2]
Murphy, Christian [3]
Kaiser, Gail [3]
Xu, Baowen [5]
Chen, Tsong Yueh [1]
Affiliations
[1] Swinburne Univ Technol, Ctr Software Anal & Testing, Hawthorn, Vic 3122, Australia
[2] Harvard Univ, Sch Med, Brigham & Womens Hosp, Dept Med, Boston, MA 02115 USA
[3] Columbia Univ, Dept Comp Sci, New York, NY 10027 USA
[4] Southeast Univ, Sch Comp Sci & Engn, Nanjing 210096, Peoples R China
[5] Nanjing Univ, Dept Comp Sci & Technol, State Key Lab Novel Software Technol, Nanjing 210093, Peoples R China
Funding
National Natural Science Foundation of China; US National Science Foundation; Australian Research Council
Keywords
Metamorphic testing; Machine learning; Test oracle; Oracle problem; Validation; Verification; Mutation
DOI
10.1016/j.jss.2010.11.920
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Machine learning algorithms provide core functionality in many application domains, such as bioinformatics and computational linguistics. However, it is difficult to detect faults in such applications because there is often no "test oracle" to verify the correctness of the computed outputs. To help address this software quality problem, we present a technique for testing the implementations of the machine learning classification algorithms that support such applications. Our approach is based on "metamorphic testing", which has been shown to be effective in alleviating the oracle problem. We also present a case study on a real-world machine learning application framework, and discuss how programmers implementing machine learning algorithms can avoid the common pitfalls discovered in our study. We further conduct mutation analysis and cross-validation, which reveal that our method is highly effective in killing mutants, and that observing the expected cross-validation result alone is not sufficient to detect faults in a supervised classification program. The effectiveness of metamorphic testing is further confirmed by the detection of real faults in a popular open-source classification program. (C) 2010 Elsevier Inc. All rights reserved.
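The idea of metamorphic testing summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy k-nearest-neighbour classifier, the two metamorphic relations (reordering the training samples, and adding the same constant to every feature), and all function names are assumptions chosen for the example. The abstract does not list the relations used in the paper. A seeded off-by-one fault plays the role of a mutant.

```python
from collections import Counter


def knn_predict(train, query, k=1):
    """Toy k-NN classifier (hypothetical, for illustration only).

    `train` is a list of (features, label) pairs with integer features,
    so squared Euclidean distances are computed exactly.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Sort by (distance, label) so ties do not depend on training order.
    neighbours = sorted((sq_dist(f, query), label) for f, label in train)[:k]
    votes = Counter(label for _, label in neighbours)
    best = max(votes.values())
    return min(label for label, n in votes.items() if n == best)


def buggy_knn_predict(train, query, k=1):
    # Seeded fault (assumed): an off-by-one slice drops the last training sample.
    return knn_predict(train[:-1], query, k)


def check_metamorphic_relations(classifier, train, queries, shift=7):
    """Return, per relation, whether follow-up outputs match the source outputs.

    No oracle for the "correct" label is needed: only the relationship
    between outputs of related executions is checked.
    """
    source = [classifier(train, q) for q in queries]

    # Relation 1 (assumed): reordering training samples must not change predictions.
    permuted = [classifier(train[::-1], q) for q in queries]

    # Relation 2 (assumed): adding one constant to every feature of both the
    # training data and the queries preserves distances, hence predictions.
    shifted_train = [(tuple(x + shift for x in f), label) for f, label in train]
    shifted = [
        classifier(shifted_train, tuple(x + shift for x in q)) for q in queries
    ]

    return {"permutation": source == permuted, "shift": source == shifted}


train = [((0, 0), "a"), ((5, 5), "b"), ((10, 10), "a")]
queries = [(1, 1), (6, 6)]
print(check_metamorphic_relations(knn_predict, train, queries))
print(check_metamorphic_relations(buggy_knn_predict, train, queries))
```

In this sketch the correct classifier satisfies both relations, while the faulty variant violates the permutation relation yet still satisfies the shift relation, which suggests why several complementary relations are usually checked together.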
Pages: 544-558
Number of pages: 15