Sentence embedding and fine-tuning to automatically identify duplicate bugs

Cited: 0
|
Authors
Isotani, Haruna [1 ]
Washizaki, Hironori [1 ]
Fukazawa, Yoshiaki [1 ]
Nomoto, Tsutomu [2 ]
Ouji, Saori [3 ]
Saito, Shinobu [3 ]
Affiliations
[1] Waseda Univ, Dept Comp Sci & Engn, Tokyo, Japan
[2] NTT CORP, Software Innovat Ctr, Tokyo, Japan
[3] NTT CORP, Comp & Data Sci Labs, Tokyo, Japan
Source
FRONTIERS IN COMPUTER SCIENCE | 2023 / Vol. 4
Keywords
bug reports; duplicate detection; BERT; sentence embedding; natural language processing; information retrieval;
D O I
10.3389/fcomp.2022.1032452
Chinese Library Classification (CLC)
TP39 [Computer applications];
Discipline codes
081203 ; 0835 ;
Abstract
Industrial software maintenance is critical but burdensome, and activities such as detecting duplicate bug reports are often performed manually. Herein, an automated duplicate bug report detection system improves maintenance efficiency by vectorizing the contents of reports with deep learning-based sentence embedding and calculating the similarity of whole reports from the vectors of their individual elements. Specifically, sentence embedding is realized by fine-tuning Sentence-BERT. Additionally, the system's performance is experimentally compared to baseline methods to validate the proposed approach. The proposed system detects duplicate bug reports more effectively than existing methods.
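The abstract describes computing a whole-report similarity from the vectors of a report's individual elements (e.g., summary and description). A minimal illustrative sketch of that idea, not the authors' implementation: in practice the element vectors would come from a fine-tuned Sentence-BERT model (for instance via the `sentence-transformers` library), but here they are assumed to be given, and the element vectors are simply averaged before a cosine comparison.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def report_similarity(vecs_a, vecs_b):
    """Similarity of two bug reports, each given as a list of
    element vectors (summary, description, ...). The element
    vectors are averaged into one report-level vector, then the
    reports are compared by cosine similarity."""
    dim = len(vecs_a[0])
    mean_a = [sum(v[i] for v in vecs_a) / len(vecs_a) for i in range(dim)]
    mean_b = [sum(v[i] for v in vecs_b) / len(vecs_b) for i in range(dim)]
    return cosine(mean_a, mean_b)

# Toy usage with hypothetical 2-D element embeddings:
a = [[1.0, 0.0], [0.0, 1.0]]  # element vectors of report A
b = [[1.0, 0.0], [0.0, 1.0]]  # element vectors of report B
score = report_similarity(a, b)  # near 1.0 for identical reports
```

A pair of reports whose score exceeds a chosen threshold would be flagged as candidate duplicates; the paper's contribution lies in producing the element vectors with a fine-tuned Sentence-BERT rather than in the similarity arithmetic itself.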
Pages: 13
Related papers
50 records total
  • [31] Predicting Protein-DNA Binding Sites by Fine-Tuning BERT
    Zhang, Yue
    Chen, Yuehui
    Chen, Baitong
    Cao, Yi
    Chen, Jiazi
    Cong, Hanhan
    INTELLIGENT COMPUTING THEORIES AND APPLICATION, ICIC 2022, PT II, 2022, 13394 : 663 - 669
  • [32] Fine-tuning your answers: a bag of tricks for improving VQA models
    Arroyo, Roberto
    Alvarez, Sergio
    Aller, Aitor
    Bergasa, Luis M.
    Ortiz, Miguel E.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (19) : 26889 - 26913
  • [33] MediBioDeBERTa: Biomedical Language Model With Continuous Learning and Intermediate Fine-Tuning
    Kim, Eunhui
    Jeong, Yuna
    Choi, Myung-Seok
    IEEE ACCESS, 2023, 11 : 141036 - 141044
  • [34] Two-stage fine-tuning with ChatGPT data augmentation for learning class-imbalanced data
    Valizadehaslani, Taha
    Shi, Yiwen
    Wang, Jing
    Ren, Ping
    Zhang, Yi
    Hu, Meng
    Zhao, Liang
    Liang, Hualou
    NEUROCOMPUTING, 2024, 592
  • [35] EEBERT: An Emoji-Enhanced BERT Fine-Tuning on Amazon Product Reviews for Text Sentiment Classification
    Narejo, Komal Rani
    Zan, Hongying
    Dharmani, Kheem Parkash
    Zhou, Lijuan
    Alahmadi, Tahani Jaser
    Assam, Muhammad
    Sehito, Nabila
    Ghadi, Yazeed Yasin
    IEEE ACCESS, 2024, 12 : 131954 - 131967
  • [36] Research Paper Classification and Recommendation System based-on Fine-Tuning BERT
    Biswas, Dipto
    Gil, Joon-Min
    2023 IEEE 24TH INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION FOR DATA SCIENCE, IRI, 2023, : 295 - 296
  • [37] Fine-Tuning Neural Patient Question Retrieval Model with Generative Adversarial Networks
    Tang, Guoyu
    Ni, Yuan
    Wang, Keqiang
    Yong, Qin
    BUILDING CONTINENTS OF KNOWLEDGE IN OCEANS OF DATA: THE FUTURE OF CO-CREATED EHEALTH, 2018, 247 : 720 - 724
  • [38] Enhanced Discriminative Fine-Tuning of Large Language Models for Chinese Text Classification
    Song, Jinwang
    Zan, Hongying
    Zhang, Kunli
    2024 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING, IALP 2024, 2024, : 168 - 174
  • [39] Short Answer Questions Generation by Fine-Tuning BERT and GPT-2
    Tsai, Danny C. L.
    Chang, Willy J. W.
    Yang, Stephen J. H.
    29TH INTERNATIONAL CONFERENCE ON COMPUTERS IN EDUCATION (ICCE 2021), VOL II, 2021, : 508 - 514
  • [40] Exploiting Syntactic Information to Boost the Fine-tuning of Pre-trained Models
    Liu, Chaoming
    Zhu, Wenhao
    Zhang, Xiaoyu
    Zhai, Qiuhong
    2022 IEEE 46TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2022), 2022, : 575 - 582