Leveraging structural properties of source code graphs for just-in-time bug prediction

Cited by: 11
Authors
Nadim, Md [1]
Mondal, Debajyoti [1]
Roy, Chanchal K. [1]
Affiliations
[1] Univ Saskatchewan, Saskatoon, SK, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Source code visualization; Graph representation; Graph attribute; Machine learning models; Feature selection; Classification; SOFTWARE CHANGES;
DOI
10.1007/s10515-022-00326-0
Chinese Library Classification number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
Data visualization is most commonly used to reduce complexity and aid understanding. A graph is one of the most widely used representations for relational data, because it gives a simplified view of data that is hard to comprehend in textual form. In this study, we propose a methodology that uses the relational properties of source code, expressed as a graph, for Just-in-Time (JIT) bug prediction across the revisions of a software system during its evolution and maintenance. We present a method that converts the source code of commit patches into an equivalent graph representation, which we name the Source Code Graph (SCG). To understand and compare multiple source code graphs, we extract several structural properties of these graphs, such as the density and the number of cycles, nodes, and edges, among others. We then use the attribute values of those SCGs to visualize and detect buggy software commits. We process more than 246 K software commits from 12 subject systems in this investigation. Our investigation of these 12 open-source software projects, written in C++ and Java, shows that combining the features from SCG with the conventional features used in similar studies increases the performance of Machine Learning (ML) based buggy commit detection models. The increase in F1 scores for predicting buggy and non-buggy commits is statistically significant under the Wilcoxon Signed Rank Test. Since SCG-based feature values represent the style or structural properties of source code changes in the software system, this result suggests that carefully maintaining source code style and structure is important for keeping a software system bug-free.
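As a rough illustration of the kind of structural properties the abstract lists (nodes, edges, density, number of cycles), the sketch below computes them for a small undirected graph. It is not the authors' SCG construction, which the abstract does not specify: the function name `graph_attributes`, the toy edge list, and the use of the count of independent cycles (edges - nodes + connected components) as the cycle measure are all assumptions made for this example.

```python
from collections import defaultdict


def graph_attributes(edges):
    """Compute basic structural attributes of an undirected simple graph.

    `edges` is an iterable of (u, v) pairs. Nodes are taken from the edge
    list only, so isolated nodes are not represented (an assumption of
    this sketch).
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops to keep the graph simple
        adjacency[u].add(v)
        adjacency[v].add(u)

    node_count = len(adjacency)
    # Each undirected edge appears in two adjacency sets.
    edge_count = sum(len(nbrs) for nbrs in adjacency.values()) // 2

    # Count connected components with an iterative depth-first search.
    seen = set()
    components = 0
    for start in adjacency:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)

    # Density of an undirected simple graph: 2E / (N * (N - 1)).
    if node_count > 1:
        density = 2 * edge_count / (node_count * (node_count - 1))
    else:
        density = 0.0

    return {
        "nodes": node_count,
        "edges": edge_count,
        "density": density,
        # Number of independent cycles (cyclomatic number): E - N + C.
        "independent_cycles": edge_count - node_count + components,
    }


# Hypothetical toy graph: a triangle a-b-c with one extra node d attached.
toy_edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
attributes = graph_attributes(toy_edges)
print(attributes)
```

A feature vector like this, computed per commit, is the sort of input that could be combined with conventional change metrics before training a classifier; how the paper actually derives and combines its features is described in the full text, not in this record.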
Pages: 30