Towards Predicting Source Code Changes Based on Natural Language Processing Models: An Empirical Evaluation

Cited by: 0
Authors
Kaibe, Yuto [1]
Okamura, Hiroyuki [1]
Dohi, Tadashi [1]
Affiliations
[1] Hiroshima University, Graduate School of Advanced Science and Engineering, Higashihiroshima, Japan
Source
2023 IEEE 34th International Symposium on Software Reliability Engineering Workshops (ISSREW), 2023
Keywords
software code changes; prediction; BERT; code understanding BERT; oriented design metrics; software; quality
DOI
10.1109/ISSREW60843.2023.00056
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we investigate the prediction of software code changes using a natural language processing (NLP) model. NLP has been one of the most rapidly developing fields in recent years, and large-scale models now perform a wide range of natural language tasks. In particular, BERT (Bidirectional Encoder Representations from Transformers) is a well-known model that encodes natural language input sentences into an appropriate vector space and is used for various classification tasks. We use CuBERT (Code Understanding BERT), which was pre-trained on programming-language source code, to perform tasks related to program code. Specifically, we formulate a regression problem whose output is the number of code changes.
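The regression setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `embed` function is a hypothetical hashed bag-of-tokens placeholder standing in for a CuBERT embedding, the linear least-squares head is an assumed choice of regressor, and the snippets and change counts are made-up toy values used only to exercise the code.

```python
import re
import zlib

DIM = 16  # stand-in embedding size (assumption; real CuBERT vectors are much larger)


def embed(code):
    """Placeholder encoder: L1-normalised hashed bag-of-tokens.

    This is NOT CuBERT; a real pipeline would return the model's
    embedding of the code snippet here instead.
    """
    vec = [0.0] * DIM
    for tok in re.findall(r"[A-Za-z_]\w*|\S", code):
        vec[zlib.crc32(tok.encode("utf-8")) % DIM] += 1.0
    total = sum(vec) or 1.0
    return [v / total for v in vec]


def predict(x, w, b):
    """Linear regression head: predicted number of code changes."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def mse(X, y, w, b):
    """Mean squared error of the head on (X, y)."""
    return sum((predict(xi, w, b) - yi) ** 2 for xi, yi in zip(X, y)) / len(y)


def fit_linear_head(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the MSE."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(y)
    for _ in range(epochs):
        errs = [predict(xi, w, b) - yi for xi, yi in zip(X, y)]
        grad_w = [2.0 * sum(e * xi[j] for e, xi in zip(errs, X)) / n
                  for j in range(len(w))]
        grad_b = 2.0 * sum(errs) / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Toy data, invented for illustration only (not taken from the paper).
snippets = [
    "def add(a, b):\n    return a + b",
    "x = 1",
    "for i in range(10):\n    total += values[i] * weights[i]",
    "pass",
]
changes = [3.0, 1.0, 7.0, 0.0]

X = [embed(s) for s in snippets]
w, b = fit_linear_head(X, changes)
estimate = predict(embed("def sub(a, b):\n    return a - b"), w, b)
```

Keeping the encoder behind a single `embed` function means the placeholder can be replaced by a real code-embedding model without touching the regression head.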
Pages: 108-111
Page count: 4