CloneRipples: predicting change propagation between code clone instances by graph-based deep learning

Cited by: 0
Authors
Wu, Yijian [1, 2]
Chen, Yuan [1, 2]
Peng, Xin [1, 2]
Hu, Bin [1, 2]
Wang, Xiaochen [1, 2]
Fu, Baiqiang [1, 2]
Zhao, Wenyun [1, 2]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai, Peoples R China
[2] Fudan Univ, Shanghai Key Lab Data Sci, Shanghai, Peoples R China
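Funding
National Natural Science Foundation of China
Keywords
Code clone; Change propagation; Consistent changes; Clone dataset; CONSISTENCY; PRONENESS
DOI
10.1007/s10664-024-10567-0
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract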
Code clones are recognized as a code smell that may require additional effort to change multiple clone instances simultaneously during software maintenance. To alleviate the quality threats caused by inconsistent changes to clone instances, it is essential to decide accurately and efficiently whether a change should be propagated between code clone instances. Our exploratory study revealed that a clone class can have both propagation-required and propagation-free changes, so fine-grained change propagation decisions are required. Based on these findings, we propose a graph-based deep learning approach to predict the change propagation requirements of clone instances. The model employs a Relational Graph Convolutional Network (R-GCN) to predict the clone change propagation requirement. To evaluate our approach, we construct a dataset that includes 24,672 pairs of matched changes and 38,041 non-matched changes from 51 open-source Java projects. Experimental results show that the approach achieves high precision (83.1%), recall (81.2%), and F1-score (82.1%). We implemented an IntelliJ IDEA tool called CloneRipples to help developers decide seamlessly, within the development environment, whether a change needs to be propagated between code clone instances. Manual inspection identified opportunities to purify the dataset by rectifying the labels of non-matched changes. Extended experiments with various data purification strategies reveal feasible ways to improve prediction effectiveness and generality.
Pages: 31