How to "DODGE" Complex Software Analytics

Cited by: 17
Authors
Agrawal, Amritanshu [1]
Fu, Wei [2]
Chen, Di [3]
Shen, Xipeng [4]
Menzies, Tim [4]
Affiliations
[1] Wayfair, Boston, MA 02116 USA
[2] Landing AI, Palo Alto, CA 94306 USA
[3] Facebook, Menlo Pk, CA 94025 USA
[4] North Carolina State Univ, Raleigh, NC 27695 USA
Funding
National Science Foundation (USA)
Keywords
Tuning; Text mining; Software; Task analysis; Optimization; Software engineering; Tools; Software analytics; hyperparameter optimization; defect prediction; text mining; STATIC CODE ATTRIBUTES; ALGORITHM; SEARCH; OPTIMIZATION; SELECTION;
DOI
10.1109/TSE.2019.2945020
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Machine learning techniques applied to software engineering tasks can be improved by hyperparameter optimization, i.e., automatic tools that find good settings for a learner's control parameters. We show that such hyperparameter optimization can be unnecessarily slow, particularly when the optimizers waste time exploring "redundant tunings", i.e., pairs of tunings which lead to indistinguishable results. By ignoring redundant tunings, DODGE(epsilon), a tuning tool, runs orders of magnitude faster, while also generating learners with more accurate predictions than seen in prior state-of-the-art approaches.
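The notion of a "redundant tuning" can be illustrated with a small sketch. This is not the DODGE(epsilon) algorithm as published; it is a simplified illustration that assumes a tuning is redundant when its score falls within epsilon of any score already seen, and that redundant options should be sampled less often. The names `epsilon_search`, `evaluate`, `options`, and `budget` are hypothetical.

```python
import random


def epsilon_search(evaluate, options, epsilon=0.2, budget=30, seed=0):
    """Search over `options`, down-weighting those whose score is
    within `epsilon` of a score already observed (assumed meaning of
    a "redundant" tuning in this sketch)."""
    rng = random.Random(seed)
    weights = {opt: 0 for opt in options}  # every option starts neutral
    seen_scores = []
    best_opt, best_score = None, float("-inf")

    for _ in range(budget):
        # Sample only among the currently highest-weighted options.
        top = max(weights.values())
        candidates = [o for o, w in weights.items() if w == top]
        opt = rng.choice(candidates)

        score = evaluate(opt)
        # Indistinguishable from an earlier result: treat as redundant.
        redundant = any(abs(score - s) < epsilon for s in seen_scores)
        weights[opt] += -1 if redundant else 1
        seen_scores.append(score)

        if score > best_score:
            best_opt, best_score = opt, score

    return best_opt, best_score, weights


# Toy usage: five candidate settings whose scores differ by more than epsilon.
best, score, weights = epsilon_search(lambda o: o * 0.5, [0, 1, 2, 3, 4])
print(best, score)
```

In this toy setting an option that keeps producing results already seen sinks in weight, so the remaining budget goes to options that might still yield something distinguishable.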
Pages: 2182-2194
Number of pages: 13