An Imbalanced Deep Learning Model for Bug Localization

Cited by: 2
Authors
Bui Thi Mai Anh [1]
Nguyen Viet Luyen [1]
Affiliations
[1] Hanoi Univ Sci & Technol, Sch Informat & Commun Technol, Lab Intelligent Software Engn, Hanoi, Vietnam
Source
2021 28TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE WORKSHOPS (APSECW 2021) | 2021
Keywords
bug localization; deep neural network; imbalanced data-set; bootstrapping
DOI
10.1109/APSECW53869.2021.00017
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
Debugging and locating faulty source files are tedious and time-consuming tasks. To improve productivity and help developers focus on crucial files, automated bug localization models have been proposed for years. These models recommend buggy source files by ranking them according to their relevance to a given bug report. There are two significant challenges in this research field: (i) narrowing the lexical gap between bug reports, which are typically written in natural language, and source files, which are written in programming languages; (ii) reducing the impact of imbalanced data distribution in model training, since only a few source files relate to a given bug report while the majority are not relevant. In this paper, we propose a deep neural network model that extracts essential information hidden within bug reports and source files by capturing not only lexical relations but also semantic details and domain knowledge features such as historical bug fixes and code change history. To address the skewed class distribution, we apply a focal loss function combined with a bootstrapping method that rectifies samples of the minority class within iterative training batches. We assessed the performance of our approach on six large-scale open-source Java projects. The empirical results show that the proposed method outperforms other state-of-the-art models, improving the Mean Average Precision (MAP) score by 3% to 11% and the Mean Reciprocal Rank (MRR) score by 2% to 14%.
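The abstract names a focal loss and two ranking metrics without defining them. The sketch below is illustrative only and is not taken from the paper: it assumes the standard binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), and the usual definitions of average precision and reciprocal rank over a ranked list of source files. The function names, the defaults gamma=2.0 and alpha=0.25, and the file names are assumptions made for the example; the authors' exact loss variant and bootstrapping procedure are not specified in this record and are not reproduced here.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for one (bug report, source file) pair.

    p: predicted probability that the file is relevant (positive class).
    y: 1 if the file is actually buggy for this report, else 0.
    gamma and alpha are assumed defaults, not values from the paper.
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    p_t = min(max(p_t, eps), 1.0)               # avoid log(0)
    # (1 - p_t)^gamma shrinks the loss of easy, well-classified samples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def average_precision(ranked_files, buggy_files):
    """Average precision of one ranked list against the set of buggy files."""
    hits, total = 0, 0.0
    for rank, name in enumerate(ranked_files, start=1):
        if name in buggy_files:
            hits += 1
            total += hits / rank                # precision at this hit
    return total / len(buggy_files) if buggy_files else 0.0


def reciprocal_rank(ranked_files, buggy_files):
    """1 / rank of the first buggy file, or 0.0 if none is ranked."""
    for rank, name in enumerate(ranked_files, start=1):
        if name in buggy_files:
            return 1.0 / rank
    return 0.0


def mean_over_queries(metric, queries):
    """Mean of a per-query metric; queries is a list of (ranking, buggy set)."""
    return sum(metric(r, b) for r, b in queries) / len(queries)


# Hypothetical rankings for two bug reports
queries = [
    (["A.java", "B.java", "C.java", "D.java"], {"B.java", "D.java"}),
    (["X.java", "Y.java"], {"X.java"}),
]
map_score = mean_over_queries(average_precision, queries)  # 0.75
mrr_score = mean_over_queries(reciprocal_rank, queries)    # 0.75
```

With these definitions, a confidently correct prediction contributes far less loss than a confidently wrong one, which is what lets the rare relevant files dominate training. MAP rewards placing all buggy files high in the ranking, whereas MRR only considers the first one.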
Pages: 32-40
Number of pages: 9