CCAligner: a token based large-gap clone detector

Cited by: 98
Authors
Wang, Pengcheng [1]
Svajlenko, Jeffrey [2]
Wu, Yanzhao [1]
Xu, Yun [1, 3]
Roy, Chanchal K. [2]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci, Hefei, Anhui, Peoples R China
[2] Univ Saskatchewan, Dept Comp Sci, Saskatoon, SK, Canada
[3] Key Lab High Performance Comp, Hefei, Anhui, Peoples R China
Source
PROCEEDINGS 2018 IEEE/ACM 40TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE) | 2018
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Clone Detection; Large-gap Clone; Evaluation; SOFTWARE; MUTATION
DOI
10.1145/3180155.3180179
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Copying code and then pasting it with a large number of edits is a common activity in software development, and the pasted code is a complicated kind of Type-3 clone. Because of the large number of edits, we call such a clone a large-gap clone. Large-gap clones can reflect the extension of code, such as changes and improvements. Existing state-of-the-art clone detectors suffer from several limitations in detecting large-gap clones. In this paper, we propose a tool, CCAligner, that detects large-gap clones by matching code windows while allowing an edit distance of e. In our approach, a novel e-mismatch index is designed, and an asymmetric similarity coefficient is used as the similarity measure. We thoroughly evaluate CCAligner both for large-gap clone detection and for general Type-1, Type-2 and Type-3 clone detection. The results show that CCAligner outperforms competing tools in large-gap clone detection and has the best execution time on a 10 MLOC input, with good precision and recall in general Type-1 to Type-3 clone detection. Compared with existing state-of-the-art tools, CCAligner is the best-performing large-gap clone detection tool and remains competitive with the best clone detectors in general Type-1, Type-2 and Type-3 clone detection.
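The abstract names three ingredients: code windows, matching that tolerates e edits, and an asymmetric similarity coefficient. The Python sketch below is one possible reading of how these could fit together, not the CCAligner implementation. Everything in it is an assumption made for illustration: the function names, the parameters `q` (window size in lines) and `e`, the rule that two windows match when they share an ordered subsequence of q - e lines, and the definition of the coefficient as the larger of the two per-fragment match ratios.

```python
from itertools import combinations


def windows(lines, q):
    """Return every sliding window of q consecutive lines (hypothetical helper)."""
    return [tuple(lines[i:i + q]) for i in range(len(lines) - q + 1)]


def window_keys(window, e):
    """Index keys for one window: each ordered subsequence with e lines dropped."""
    return set(combinations(window, len(window) - e))


def asymmetric_similarity(a_lines, b_lines, q=3, e=1):
    """Assumed coefficient: the larger fraction of matched windows on either side."""
    a_wins, b_wins = windows(a_lines, q), windows(b_lines, q)
    if not a_wins or not b_wins:
        return 0.0  # a fragment shorter than one window cannot match
    a_keys = set().union(*(window_keys(w, e) for w in a_wins))
    b_keys = set().union(*(window_keys(w, e) for w in b_wins))
    # A window counts as matched if any of its keys occurs in the other fragment.
    a_hit = sum(1 for w in a_wins if window_keys(w, e) & b_keys)
    b_hit = sum(1 for w in b_wins if window_keys(w, e) & a_keys)
    return max(a_hit / len(a_wins), b_hit / len(b_wins))


original = ["a", "b", "c", "d"]
# The copy has one inserted line ("X") and a long unrelated tail (the "gap").
edited = ["a", "b", "X", "c", "d", "p", "q", "r", "s"]

tolerant = asymmetric_similarity(original, edited, q=3, e=1)
strict = asymmetric_similarity(original, edited, q=3, e=0)
print(tolerant, strict)
```

Under these assumptions, taking the maximum of the two ratios means a short original that is fully covered by a heavily extended copy can still score highly, whereas requiring exact window matches (`e=0`) misses the pair in this example.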
Pages: 1066-1077
Number of pages: 12