Sentence embedding and fine-tuning to automatically identify duplicate bugs

Cited by: 0
Authors
Isotani, Haruna [1 ]
Washizaki, Hironori [1 ]
Fukazawa, Yoshiaki [1 ]
Nomoto, Tsutomu [2 ]
Ouji, Saori [3 ]
Saito, Shinobu [3 ]
Affiliations
[1] Waseda Univ, Dept Comp Sci & Engn, Tokyo, Japan
[2] NTT CORP, Software Innovat Ctr, Tokyo, Japan
[3] NTT CORP, Comp & Data Sci Labs, Tokyo, Japan
Source
FRONTIERS IN COMPUTER SCIENCE | 2023, Vol. 4
Keywords
bug reports; duplicate detection; BERT; sentence embedding; natural language processing; information retrieval;
DOI
10.3389/fcomp.2022.1032452
CLC Number (Chinese Library Classification)
TP39 [Computer applications];
Subject Classification Codes
081203; 0835;
Abstract
Industrial software maintenance is critical but burdensome, and activities such as detecting duplicate bug reports are often performed manually. Herein, an automated duplicate bug report detection system improves maintenance efficiency by vectorizing the contents of a report and using deep learning-based sentence embedding to calculate the similarity of the whole report from the vectors of its individual elements. Specifically, sentence embedding is realized by fine-tuning Sentence-BERT. Additionally, the performance of the proposed system is experimentally compared to that of baseline methods. The proposed system detects duplicate bug reports more effectively than existing methods.
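The record gives no implementation details, so the following Python sketch only illustrates the general approach described in the abstract: embed individual bug-report fields with a Sentence-BERT model via the sentence-transformers library and combine field-wise cosine similarities into a whole-report score. The field names, the pretrained checkpoint, the equal weighting, and the threshold are assumptions for illustration, not the paper's actual configuration.

# Minimal sketch of duplicate bug report scoring with Sentence-BERT.
# Assumptions (not from the paper): field names, the "all-MiniLM-L6-v2"
# checkpoint, equal weighting of fields, and the 0.8 threshold.
# The paper fine-tunes Sentence-BERT on labeled report pairs; here a
# pretrained checkpoint stands in for the fine-tuned model.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

def report_similarity(report_a, report_b, fields=("summary", "description")):
    # Average the cosine similarity of the embeddings of each shared field.
    scores = []
    for field in fields:
        emb = model.encode([report_a[field], report_b[field]], convert_to_tensor=True)
        scores.append(util.cos_sim(emb[0], emb[1]).item())
    return sum(scores) / len(scores)

# Example: flag a pair as potential duplicates above a chosen threshold.
r1 = {"summary": "App crashes on login",
      "description": "Crash when pressing the login button."}
r2 = {"summary": "Login button causes crash",
      "description": "The app terminates after tapping login."}
if report_similarity(r1, r2) > 0.8:
    print("Possible duplicate pair")

In practice, the similarity score would be computed against every existing report in the tracker, with the top-ranked candidates presented to the maintainer for confirmation.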
Pages: 13