Towards Accurate Duplicate Bug Retrieval using Deep Learning Techniques

Cited by: 69
Authors
Deshmukh, Jayati [1]
Annervaz, K. M. [1]
Podder, Sanjay [1]
Sengupta, Shubhashis [1]
Dubash, Neville [1]
Affiliation
[1] Accenture Technol Labs, San Jose, CA 95113 USA
Source
2017 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION (ICSME) | 2017
Keywords
Information Retrieval; Duplicate Bug Detection; Deep Learning; Natural Language Processing; Word Embeddings; Siamese Networks; Convolutional Neural Networks; Long Short Term Memory
DOI
10.1109/ICSME.2017.69
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Duplicate Bug Detection is the problem of identifying whether a newly reported bug is a duplicate of an existing bug in the system and retrieving the original or similar bugs from the past. This is required to avoid costly rediscovery and redundant work. In typical software projects, the number of duplicate bugs reported may run into the thousands, making manual intervention expensive in both cost and time. This makes the problem of duplicate or similar bug detection an important one in the Software Engineering domain. However, automated solutions are not yet accurate enough in practice, in spite of many reported approaches using various machine learning techniques. In this work, we propose a retrieval and classification model using Siamese Convolutional Neural Networks (CNN) and Long Short Term Memory (LSTM) for accurate detection and retrieval of duplicate and similar bugs. We report an accuracy close to 90% and a recall rate close to 80%, which makes practical use of such a system possible. We describe our model in detail along with related discussions from the Deep Learning domain. By presenting detailed experimental results, we illustrate the effectiveness of the model in practical systems, including repositories for which supervised training data is not available.
Pages: 115-124
Number of pages: 10
Related papers
50 in total
  • [1] A Survey on Near Duplicate Video Retrieval Using Deep Learning Techniques and Framework
    Phalke, Dhanashree Ajay
    Jahirabadkar, Sunita
    2020 IEEE PUNE SECTION INTERNATIONAL CONFERENCE (PUNECON), 2020: 124-128
  • [2] A Contextual Approach towards More Accurate Duplicate Bug Report Detection
    Alipour, Anahita
    Hindle, Abram
    Stroulia, Eleni
    2013 10TH IEEE WORKING CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR), 2013: 183-192
  • [3] Towards Word Embeddings for Improved Duplicate Bug Report Retrieval in Software Repositories
    Budhiraja, Amar
    Dutta, Kartik
    Shrivastava, Manish
    Reddy, Raghu
    PROCEEDINGS OF THE 2018 ACM SIGIR INTERNATIONAL CONFERENCE ON THEORY OF INFORMATION RETRIEVAL (ICTIR'18), 2018: 167-170
  • [4] A contextual approach towards more accurate duplicate bug report detection and ranking
    Abram Hindle
    Anahita Alipour
    Eleni Stroulia
    Empirical Software Engineering, 2016, 21: 368-410
  • [5] A contextual approach towards more accurate duplicate bug report detection and ranking
    Hindle, Abram
    Alipour, Anahita
    Stroulia, Eleni
    EMPIRICAL SOFTWARE ENGINEERING, 2016, 21(02): 368-410
  • [6] Semantic Video Retrieval using Deep Learning Techniques
    Yasin, Danish
    Sohail, Ashbal
    Siddiqi, Imran
    PROCEEDINGS OF 2020 17TH INTERNATIONAL BHURBAN CONFERENCE ON APPLIED SCIENCES AND TECHNOLOGY (IBCAST), 2020: 338-343
  • [7] Bug Localization with Combination of Deep Learning and Information Retrieval
    An Ngoc Lam
    Anh Tuan Nguyen
    Hoan Anh Nguyen
    Nguyen, Tien N.
    2017 IEEE/ACM 25TH INTERNATIONAL CONFERENCE ON PROGRAM COMPREHENSION (ICPC), 2017: 218-229
  • [8] Near-Duplicate Video Retrieval with Deep Metric Learning
    Kordopatis-Zilos, Giorgos
    Papadopoulos, Symeon
    Patras, Ioannis
    Kompatsiaris, Yiannis
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017), 2017: 347-356
  • [9] Duplicate Bug Report Detection and Classification System Based on Deep Learning Technique
    Kukkar, Ashima
    Mohana, Rajni
    Kumar, Yugal
    Nayyar, Anand
    Bilal, Muhammad
    Kwak, Kyung-Sup
    IEEE ACCESS, 2020, 8: 200749-200763
  • [10] Does Deep Learning improve the performance of duplicate bug report detection? An empirical study
    Jiang, Yuan
    Su, Xiaohong
    Treude, Christoph
    Shang, Chao
    Wang, Tiantian
    JOURNAL OF SYSTEMS AND SOFTWARE, 2023, 198