Easy over Hard: A Case Study on Deep Learning

Cited by: 136
Authors
Fu, Wei [1]
Menzies, Tim [1]
Affiliations
[1] NC State, Com Sci, Raleigh, NC 27606 USA
Source
ESEC/FSE 2017: PROCEEDINGS OF THE 2017 11TH JOINT MEETING ON FOUNDATIONS OF SOFTWARE ENGINEERING | 2017
Funding
National Science Foundation (USA);
Keywords
Search based software engineering; software analytics; parameter tuning; data analytics for software engineering; deep learning; SVM; differential evolution; SEARCH;
DOI
10.1145/3106237.3106256
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification codes
081202 ; 0835 ;
Abstract
While deep learning is an exciting new technique, the benefits of this method need to be assessed with respect to its computational cost. This is particularly important for deep learning since these learners need hours (or even weeks) to train a model. Such long training times limit the ability of (a) a researcher to test the stability of their conclusions via repeated runs with different random seeds; and (b) other researchers to repeat, improve, or even refute that original work. For example, deep learning was recently used to find which questions in the Stack Overflow programmer discussion forum can be linked together. That deep learning system took 14 hours to execute. We show here that applying a very simple optimizer called DE (differential evolution) to fine-tune SVM achieves similar (and sometimes better) results. The DE approach terminated in 10 minutes, i.e., 84 times faster than the deep learning method. We offer these results as a cautionary tale to the software analytics community and suggest that not every new innovation should be applied without critical analysis. If researchers deploy some new and expensive process, that work should be baselined against simpler and faster alternatives.
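The abstract does not spell out how DE tunes a learner, so the following is only a minimal sketch of a generic differential evolution loop (the common rand/1/bin variant), not the authors' implementation. The function names, the default settings (`pop_size`, `f`, `cr`, `generations`), the search bounds, and `toy_validation_error` are all assumptions made for illustration. In particular, `toy_validation_error` is a hypothetical stand-in for "train an SVM with these parameters and return its validation error"; a real study would replace it with an actual train-and-score step.

```python
import random


def differential_evolution(objective, bounds, pop_size=10, f=0.75, cr=0.3,
                           generations=30, seed=0):
    """Minimize `objective` over the box `bounds` with a rand/1/bin DE loop.

    All default settings are illustrative assumptions.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Start from a random population of candidate parameter vectors.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct candidates other than the current one.
            others = [p for j, p in enumerate(pop) if j != i]
            a, b, c = rng.sample(others, 3)
            forced = rng.randrange(dim)  # at least one dimension is mutated
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == forced or rng.random() < cr:
                    value = a[d] + f * (b[d] - c[d])
                    value = min(max(value, lo), hi)  # clip to the search box
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = objective(trial)
            # Keep the trial only if it is no worse than the current candidate.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_validation_error(params):
    """Hypothetical stand-in for an SVM's validation error.

    `params` is (log10 C, log10 gamma); the bowl-shaped surface is made up.
    """
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


# Assumed search ranges for log10(C) and log10(gamma).
BOUNDS = [(-3.0, 3.0), (-5.0, 1.0)]
best_params, best_error = differential_evolution(toy_validation_error, BOUNDS)
print(best_params, best_error)
```

With these assumed settings the loop evaluates the objective 10 + 10 × 30 = 310 times. Each evaluation of a real SVM is a short training run, which is the kind of budget that makes a minutes-scale baseline plausible next to a learner that trains for hours.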
Pages: 49-60
Number of pages: 12