Mining Common Syntactic Patterns Used by Java Programmers

Cited by: 2
Authors
Losada, Alvaro [1 ]
Facundo, Guillermo [1 ]
Garcia, Miguel [1 ]
Ortin, Francisco [1 ]
Affiliations
[1] Univ Oviedo, Comp Sci Dept, C Federico Garcia Lorca 18, Oviedo 33007, Spain
Keywords
Java; Syntactics; Software development management; Software; IEEE transactions; Data mining; Codes; Syntactic patterns; rule mining; Abstract Syntax Trees; association rules;
DOI
10.1109/TLA.2022.9693559
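Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract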
Open-source code repositories provide massive amounts of data in the form of programs, which have been used to develop different tools. This kind of work belongs to the active research fields of Big Code and Mining Software Repositories. Although different machine learning works already classify the syntactic constructs used by programmers, there are no reports about the most common syntactic patterns used by Java programmers. In this article, we describe a system we built to provide such a report. Our system retrieves the syntactic patterns used by Java programmers, distinguishing those used by experts from those used by beginners. We also present the anomalies found in the usage of different syntactic constructs. We modify the OpenJDK compiler to extract the syntactic information included in its Abstract Syntax Tree (AST), define a mechanism to translate ASTs into n-dimensional vectors, combine the information of different syntactic constructs to build heterogeneous patterns, and apply the Frequent Pattern Growth algorithm to mine the syntactic patterns as association rules. The mined patterns can express hierarchical subpatterns connected to one another, providing a high level of expressiveness.
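The mining step named in the abstract (syntactic features in, association rules out) can be illustrated with a small sketch. This is not the authors' system: the feature names and the four sample records are hypothetical, and a brute-force enumeration of frequent itemsets stands in for the Frequent Pattern Growth algorithm, which produces the same frequent itemsets more efficiently on large data.

```python
from itertools import combinations

# Hypothetical records: each set lists features observed in one AST node.
# The feature names are illustrative only; they are not taken from the paper.
transactions = [
    {"node=method", "visibility=public", "returns=void", "author=beginner"},
    {"node=method", "visibility=public", "returns=void", "author=beginner"},
    {"node=method", "visibility=private", "returns=int", "author=expert"},
    {"node=method", "visibility=public", "returns=int", "author=expert"},
]


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support.

    Brute-force enumeration by itemset size (a stand-in for FP-Growth).
    """
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(itemset <= t for t in transactions) / n
            if support >= min_support:
                result[itemset] = support
                found = True
        if not found:
            # No frequent itemset of this size, so no larger one can exist.
            break
    return result


def association_rules(itemsets, min_confidence):
    """Derive (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, support in itemsets.items():
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                a = frozenset(antecedent)
                # Every subset of a frequent itemset is frequent, so a is present.
                confidence = support / itemsets[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, support, confidence))
    return rules


if __name__ == "__main__":
    sets = frequent_itemsets(transactions, min_support=0.5)
    for a, c, s, conf in association_rules(sets, min_confidence=0.9):
        print(sorted(a), "->", sorted(c), f"support={s:.2f} confidence={conf:.2f}")
```

In this toy data, a rule whose consequent is an author-level feature corresponds loosely to the expert-versus-beginner distinction the abstract mentions; the thresholds 0.5 and 0.9 are arbitrary choices for the example.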
Pages: 753-762
Page count: 10