Definition, approaches, and analysis of code duplication detection (2006-2020): a critical review

Cited by: 10
Authors
Chen, Chang-Feng [1 ,3 ]
Zain, Azlan Mohd [2 ]
Zhou, Kai-Qing [3 ]
Affiliations
[1] Univ Teknol Malaysia, Fac Engn, Sch Comp, Skudai 80310, Johor, Malaysia
[2] Univ Teknol Malaysia, Big Data Ctr, Skudai 80310, Johor, Malaysia
[3] Jishou Univ, Coll Informat Sci & Engn, Jishou 416000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Code duplication; Code duplication detection; System literature review; PLAGIARISM DETECTION; CLONE DETECTION; SOFTWARE SYSTEMS; ACCURATE; SIMILARITY; SEARCH;
DOI
10.1007/s00521-022-07707-2
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Code duplication detection is the task of finding similar code in software development, and addressing it is important for software engineers. This paper presents a critical review of previous work on code duplication for code clone and plagiarism detection. The review has five main parts. First, a systematic literature review is conducted to confirm the selected articles. Second, different code duplication approaches are critically reviewed across three phases: processing, detection, and decision. Third, a statistical analysis of the number of reviewed articles shows the trends and hot topics in code duplication research, and a quantitative analysis of the different approaches shows their effectiveness. Fourth, the advantages and disadvantages of the different approaches and techniques are summarized and discussed. Finally, the conclusions of the review are summarized and future research directions for code duplication are described.
Pages: 20507-20537
Page count: 31