A Similarity Integration Method based Information Retrieval and Word Embedding in Bug Localization

Cited by: 12
Authors
Cheng, Shasha [1 ]
Yan, Xuefeng [1 ]
Khan, Arif Ali [1 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci Technol, Collaborat Innovat Ctr Novel Software Technol & I, Nanjing, Peoples R China
Source
2020 IEEE 20TH INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY, AND SECURITY (QRS 2020) | 2020
Funding
National Key Research and Development Program of China;
Keywords
software bug localization; information retrieval; word embedding; similarity integration; bug report;
DOI
10.1109/QRS51102.2020.00034
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Improving bug localization performance requires resolving the lexical mismatch between the natural language of bug reports and the programming language of source files. This paper proposes a similarity integration method for bug localization in which the similarity between a bug report and a source file is calculated using both information retrieval (IR) and word embedding. Specifically, the IR technique captures the exact term matches between the bug report and the source file. The word embedding technique complements IR by connecting terms in the bug report to the differing code tokens in potentially relevant source files. Finally, a deep neural network (DNN) integrates the extracted features to obtain the correlation between bug reports and source files. Experimental results show that the proposed approach outperforms several existing bug localization approaches in terms of Top-N Rank, mean average precision (MAP), and mean reciprocal rank (MRR).
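The idea of combining an exact-match IR score with an embedding-based score can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the function names, the toy two-dimensional embeddings in `TOY_EMB`, the TF-IDF weighting, the best-match averaging used for the embedding score, and above all the fixed weighted sum, which merely stands in for the DNN that the paper uses to integrate features.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplification for this sketch)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF dict per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


def cosine_sparse(a, b):
    """Cosine similarity of two sparse vectors; nonzero only on shared terms."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cosine_dense(u, v):
    """Cosine similarity of two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def embedding_similarity(report, source, emb):
    """Average, over report terms, of the best cosine match among source tokens.

    Tokens missing from the embedding table are skipped (an assumption).
    """
    source_vecs = [emb[s] for s in source if s in emb]
    best = [
        max((cosine_dense(emb[r], v) for v in source_vecs), default=0.0)
        for r in report
        if r in emb
    ]
    return sum(best) / len(best) if best else 0.0


def rank_files(report_text, files, emb, w_ir=0.5, w_emb=0.5):
    """Rank files by a weighted sum of IR and embedding similarity.

    The fixed weights are hypothetical; the paper learns the integration
    with a DNN instead.
    """
    report = tokenize(report_text)
    names = list(files)
    docs = [report] + [tokenize(files[name]) for name in names]
    vecs = tfidf_vectors(docs)
    scored = []
    for name, doc, vec in zip(names, docs[1:], vecs[1:]):
        ir_score = cosine_sparse(vecs[0], vec)
        emb_score = embedding_similarity(report, doc, emb)
        scored.append((name, w_ir * ir_score + w_emb * emb_score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings, invented for this example only.
TOY_EMB = {
    "saving": (1.0, 0.1),
    "write": (0.9, 0.2),
    "file": (0.8, 0.3),
    "crash": (0.1, 1.0),
    "login": (-0.5, 0.5),
}

files = {
    "LoginDialog.java": "user password login dialog",
    "FileWriter.java": "write file buffer flush",
}
ranking = rank_files("crash when saving file", files, TOY_EMB)
print(ranking[0][0])
```

In this toy input, "saving" never appears in either file, so the IR score alone sees only the shared term "file"; the embedding score additionally links "saving" to "write", which is the kind of lexical gap the abstract describes.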
Pages: 180-187
Number of pages: 8