CloneCognition: Machine Learning Based Code Clone Validation Tool

Cited by: 12
Authors
Mostaeen, Golam [1]
Svajlenko, Jeffrey [1]
Roy, Banani [1]
Roy, Chanchal K. [1]
Schneider, Kevin A. [1]
Affiliation
[1] Univ Saskatchewan, Saskatoon, SK, Canada
Source
ESEC/FSE'2019: PROCEEDINGS OF THE 2019 27TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING | 2019
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Code Clones; Validation; Machine Learning; Artificial Neural Network; Clone Management;
DOI
10.1145/3338906.3341182
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
A code clone is a pair of similar code fragments, within or between software systems. To detect every possible clone pair in a software system while handling complex code structures, clone detection tools heavily generalize the original source code. This generalization often returns code fragments that are only coincidentally similar and that users do not consider clones, so users must manually validate the reported candidate clones, which is often both time-consuming and challenging. In this paper, we propose a machine learning based tool, 'CloneCognition' (open source code: https://github.com/pseudoPixels/CloneCognition; video demonstration: https://www.youtube.com/watch?v=KYQjmdr8rsw), to automate the laborious manual validation process. The tool runs on top of any code clone detection tool to facilitate clone validation. It shows promising clone classification performance, with an accuracy of up to 87.4%. The tool also exhibits significant improvement in the results when compared with state-of-the-art techniques for code clone validation.
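The idea of validating a detector's output as a post-processing step can be sketched roughly as follows. This is only an illustration under stated assumptions, not the CloneCognition implementation: the function names (`tokens`, `pair_features`, `threshold_classifier`, `validate`), the two similarity features, and the 0.5 cutoff are all hypothetical, and a simple threshold rule stands in for the trained neural network that the keywords suggest the tool uses.

```python
import re


def tokens(code):
    """Split a code fragment into identifiers, numbers, and single symbols."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


def pair_features(frag1, frag2):
    """Hypothetical features for a candidate pair: token- and line-level overlap."""
    lines1 = [ln.strip() for ln in frag1.splitlines() if ln.strip()]
    lines2 = [ln.strip() for ln in frag2.splitlines() if ln.strip()]
    return [jaccard(tokens(frag1), tokens(frag2)), jaccard(lines1, lines2)]


def threshold_classifier(features, cutoff=0.5):
    """Stand-in for a trained model: accept when the mean feature value is high."""
    return sum(features) / len(features) >= cutoff


def validate(candidates, classify):
    """Keep only the candidate pairs that the classifier accepts as true clones."""
    return [(a, b) for a, b in candidates if classify(pair_features(a, b))]


if __name__ == "__main__":
    add = "int sum(int a, int b) {\n  return a + b;\n}"
    loop = "while (queue.isEmpty()) {\n  wait();\n}"
    # Candidate pairs as a clone detector might report them (made-up input).
    reported = [(add, add), (add, loop)]
    print(len(validate(reported, threshold_classifier)), "pair(s) kept")
```

Because `validate` only consumes candidate pairs, it is independent of whichever detector produced them, which mirrors the "runs on top of any clone detection tool" design described in the abstract.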
Pages: 1105-1109
Page count: 5