Sentence embedding and fine-tuning to automatically identify duplicate bugs

Cited: 0
|
Authors
Isotani, Haruna [1 ]
Washizaki, Hironori [1 ]
Fukazawa, Yoshiaki [1 ]
Nomoto, Tsutomu [2 ]
Ouji, Saori [3 ]
Saito, Shinobu [3 ]
Affiliations
[1] Waseda Univ, Dept Comp Sci & Engn, Tokyo, Japan
[2] NTT CORP, Software Innovat Ctr, Tokyo, Japan
[3] NTT CORP, Comp & Data Sci Labs, Tokyo, Japan
Source
FRONTIERS IN COMPUTER SCIENCE | 2023 / Vol. 4
Keywords
bug reports; duplicate detection; BERT; sentence embedding; natural language processing; information retrieval;
D O I
10.3389/fcomp.2022.1032452
Chinese Library Classification (CLC)
TP39 [Computer applications];
Discipline codes
081203 ; 0835 ;
Abstract
Industrial software maintenance is critical but burdensome, and activities such as detecting duplicate bug reports are often performed manually. Herein, an automated duplicate bug report detection system improves maintenance efficiency by vectorizing the contents of reports with deep learning-based sentence embedding and calculating the similarity of whole reports from the vectors of their individual elements. Specifically, sentence embedding is realized by fine-tuning Sentence-BERT. Additionally, the system's performance is experimentally compared to baseline methods to validate the proposed approach. The proposed system detects duplicate bug reports more effectively than existing methods.
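The abstract describes computing a whole-report similarity from the vectors of a report's individual elements (e.g., summary and description). A minimal illustrative sketch of that idea, not the authors' implementation: in practice the element vectors would come from a fine-tuned Sentence-BERT model (for instance via the `sentence-transformers` library), but here they are assumed to be given, and the element vectors are simply averaged before a cosine comparison.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def report_similarity(vecs_a, vecs_b):
    """Similarity of two bug reports, each given as a list of
    element vectors (summary, description, ...). The element
    vectors are averaged into one report-level vector, then the
    reports are compared by cosine similarity."""
    dim = len(vecs_a[0])
    mean_a = [sum(v[i] for v in vecs_a) / len(vecs_a) for i in range(dim)]
    mean_b = [sum(v[i] for v in vecs_b) / len(vecs_b) for i in range(dim)]
    return cosine(mean_a, mean_b)

# Toy usage with hypothetical 2-D element embeddings:
a = [[1.0, 0.0], [0.0, 1.0]]  # element vectors of report A
b = [[1.0, 0.0], [0.0, 1.0]]  # element vectors of report B
score = report_similarity(a, b)  # near 1.0 for identical reports
```

A pair of reports whose score exceeds a chosen threshold would be flagged as candidate duplicates; the paper's contribution lies in producing the element vectors with a fine-tuned Sentence-BERT rather than in the similarity arithmetic itself.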
Pages: 13
Related papers
50 records total
  • [31] Predicting Protein-DNA Binding Sites by Fine-Tuning BERT
    Zhang, Yue
    Chen, Yuehui
    Chen, Baitong
    Cao, Yi
    Chen, Jiazi
    Cong, Hanhan
    INTELLIGENT COMPUTING THEORIES AND APPLICATION, ICIC 2022, PT II, 2022, 13394 : 663 - 669
  • [32] Fine-tuning your answers: a bag of tricks for improving VQA models
    Arroyo, Roberto
    Alvarez, Sergio
    Aller, Aitor
    Bergasa, Luis M.
    Ortiz, Miguel E.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (19) : 26889 - 26913
  • [33] MediBioDeBERTa: Biomedical Language Model With Continuous Learning and Intermediate Fine-Tuning
    Kim, Eunhui
    Jeong, Yuna
    Choi, Myung-Seok
    IEEE ACCESS, 2023, 11 : 141036 - 141044
  • [34] Two-stage fine-tuning with ChatGPT data augmentation for learning class-imbalanced data
    Valizadehaslani, Taha
    Shi, Yiwen
    Wang, Jing
    Ren, Ping
    Zhang, Yi
    Hu, Meng
    Zhao, Liang
    Liang, Hualou
    NEUROCOMPUTING, 2024, 592
  • [35] EEBERT: An Emoji-Enhanced BERT Fine-Tuning on Amazon Product Reviews for Text Sentiment Classification
    Narejo, Komal Rani
    Zan, Hongying
    Dharmani, Kheem Parkash
    Zhou, Lijuan
    Alahmadi, Tahani Jaser
    Assam, Muhammad
    Sehito, Nabila
    Ghadi, Yazeed Yasin
    IEEE ACCESS, 2024, 12 : 131954 - 131967
  • [36] Research Paper Classification and Recommendation System based-on Fine-Tuning BERT
    Biswas, Dipto
    Gil, Joon-Min
    2023 IEEE 24TH INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION FOR DATA SCIENCE, IRI, 2023, : 295 - 296
  • [37] Fine-Tuning Neural Patient Question Retrieval Model with Generative Adversarial Networks
    Tang, Guoyu
    Ni, Yuan
    Wang, Keqiang
    Yong, Qin
    BUILDING CONTINENTS OF KNOWLEDGE IN OCEANS OF DATA: THE FUTURE OF CO-CREATED EHEALTH, 2018, 247 : 720 - 724
  • [38] Enhanced Discriminative Fine-Tuning of Large Language Models for Chinese Text Classification
    Song, Jinwang
    Zan, Hongying
    Zhang, Kunli
    2024 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING, IALP 2024, 2024, : 168 - 174
  • [39] Short Answer Questions Generation by Fine-Tuning BERT and GPT-2
    Tsai, Danny C. L.
    Chang, Willy J. W.
    Yang, Stephen J. H.
    29TH INTERNATIONAL CONFERENCE ON COMPUTERS IN EDUCATION (ICCE 2021), VOL II, 2021, : 508 - 514
  • [40] Exploiting Syntactic Information to Boost the Fine-tuning of Pre-trained Models
    Liu, Chaoming
    Zhu, Wenhao
    Zhang, Xiaoyu
    Zhai, Qiuhong
    2022 IEEE 46TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2022), 2022, : 575 - 582