Mapping Bug Reports to Relevant Files: A Ranking Model, a Fine-Grained Benchmark, and Feature Evaluation

Cited by: 58
Authors
Ye, Xin [1]
Bunescu, Razvan [1]
Liu, Chang [1]
Affiliations
[1] Ohio Univ, Sch Elect Engn & Comp Sci, Athens, OH 45701 USA
Keywords
Bug reports; software maintenance; learning to rank; feature location; probabilistic ranking; information retrieval; predicting faults; source code; localization; execution; IR
DOI
10.1109/TSE.2015.2479232
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
When a new bug report is received, developers usually need to reproduce the bug and perform code reviews to find the cause, a process that can be tedious and time-consuming. A tool for ranking all the source files with respect to how likely they are to contain the cause of the bug would enable developers to narrow down their search and improve productivity. This paper introduces an adaptive ranking approach that leverages project knowledge through functional decomposition of source code, API descriptions of library components, the bug-fixing history, the code change history, and the file dependency graph. Given a bug report, the ranking score of each source file is computed as a weighted combination of an array of features, where the weights are trained automatically on previously solved bug reports using a learning-to-rank technique. We evaluate the ranking system on six large-scale open-source Java projects, using the before-fix version of the project for every bug report. The experimental results show that the learning-to-rank approach outperforms three recent state-of-the-art methods. In particular, our method makes correct recommendations within the top 10 ranked source files for over 70 percent of the bug reports in the Eclipse Platform and Tomcat projects.
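The scoring step described in the abstract, a weighted combination of per-file features followed by a sort, can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the three-element feature vectors, the file names, and the weight values are all hypothetical, and the weights are set by hand here, whereas the paper learns them from previously solved bug reports with a learning-to-rank technique.

```python
from typing import Dict, List, Tuple


def score(features: List[float], weights: List[float]) -> float:
    """Weighted combination of feature values for one (bug report, file) pair."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * f for w, f in zip(weights, features))


def rank_files(
    file_features: Dict[str, List[float]], weights: List[float]
) -> List[Tuple[str, float]]:
    """Return (file, score) pairs sorted from most to least likely relevant."""
    scored = [(path, score(feats, weights)) for path, feats in file_features.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical weights; in the paper these are learned, not hand-picked.
weights = [0.5, 0.3, 0.2]

# Hypothetical feature vectors for one bug report, one entry per source file.
file_features = {
    "Parser.java": [0.8, 0.1, 0.0],
    "Lexer.java": [0.2, 0.9, 0.5],
    "Util.java": [0.1, 0.0, 0.1],
}

ranking = rank_files(file_features, weights)
for path, value in ranking:
    print(f"{path}: {value:.2f}")
```

A developer would then inspect the files from the top of the ranking downward; the abstract's "top 10" result refers to how often a truly buggy file appears among the first ten entries of such a list.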
Pages: 379-402
Number of pages: 24