Automation of Software Code Analysis Using Machine Learning Methods

Cited by: 0
Authors
Moshkin, V. S. [1]
Dyrnochkin, A. A. [1]
Yarushkina, N. G. [1]
Affiliations
[1] Ulyanovsk State Tech Univ, Ulyanovsk 432027, Russia
Keywords
software code review; machine learning;
DOI
10.1134/S1054661823030318
CLC number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
The paper describes an approach and a service for the intelligent analysis of Python source code. The service shortens code review by partially automating it. The FastText algorithm is used to obtain vector representations of source code texts. A pretrained transformer-based neural language model is used to infer a function's probable purpose in natural language. A classifier based on the gradient boosting algorithm is used to detect repeated headers. After a changeset is published to the remote Git repository, the service checks it and posts reports on errors and duplicates as comments on that changeset.
Pages: 417-424
Page count: 8
Related papers
13 in total
[1]  
Bojanowski P, 2017, Transactions of the Association for Computational Linguistics, V5, P135, DOI 10.1162/tacl_a_00051
[2]   ProtTrans: Toward Understanding the Language of Life Through Self-Supervised Learning [J].
Elnaggar, Ahmed ;
Heinzinger, Michael ;
Dallago, Christian ;
Rehawi, Ghalia ;
Wang, Yu ;
Jones, Llion ;
Gibbs, Tom ;
Feher, Tamas ;
Angerer, Christoph ;
Steinegger, Martin ;
Bhowmik, Debsindhu ;
Rost, Burkhard .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (10) :7112-7127
[3]   Greedy function approximation: A gradient boosting machine [J].
Friedman, JH .
ANNALS OF STATISTICS, 2001, 29 (05) :1189-1232
[4]   Will they like this? Evaluating Code Contributions With Language Models [J].
Hellendoorn, Vincent J. ;
Devanbu, Premkumar T. ;
Bacchelli, Alberto .
12TH WORKING CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR 2015), 2015, :157-167
[5]   On the Naturalness of Software [J].
Hindle, Abram ;
Barr, Earl T. ;
Gabel, Mark ;
Su, Zhendong ;
Devanbu, Premkumar .
COMMUNICATIONS OF THE ACM, 2016, 59 (05) :122-131
[6]  
Hu XY, 2019, NEW FRONT EDUC RES, P1, DOI [10.1007/978-981-13-8203-1_1, 10.1007/s10664-019-09730-9]
[7]   Cloning considered harmful considered harmful: patterns of cloning in software [J].
Kapser, Cory J. ;
Godfrey, Michael W. .
EMPIRICAL SOFTWARE ENGINEERING, 2008, 13 (06) :645-692
[8]  
Mikolov T, 2013, Arxiv, DOI arXiv:1301.3781
[9]  
Moshkin V. S., 2014, 14 NAT C ART INT INT, V3, P173
[10]  
Pennington J., 2014, Proceedings of the Empirical Methods in Natural Language Processing EMNLP 2014, DOI 10.3115/v1/D14-1162