CCFinder: A multilinguistic token-based code clone detection system for large scale source code

Cited by: 939
Authors
Kamiya, T [1]
Kusumoto, S [1]
Inoue, K [1]
Affiliations
[1] Osaka Univ, Grad Sch Engn Sci, Funct & Configurat Grp, RPRESTO,JST, Toyonaka, Osaka 5608531, Japan
Keywords
code clone; duplicated code; CASE tool; metrics; maintenance
DOI
10.1109/TSE.2002.1019480
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
A code clone is a code portion in source files that is identical or similar to another. Since code clones are believed to reduce the maintainability of software, several code clone detection techniques and tools have been proposed. This paper proposes a new clone detection technique, which consists of the transformation of input source text and a token-by-token comparison. For its implementation with several useful optimization techniques, we have developed a tool, named CCFinder, which extracts code clones in C, C++, Java, COBOL, and other source files. As well, metrics for the code clones have been developed. In order to evaluate the usefulness of CCFinder and metrics, we conducted several case studies where we applied the new tool to the source code of JDK, FreeBSD, NetBSD, Linux, and many other systems. As a result, CCFinder has effectively found clones and the metrics have been able to effectively identify the characteristics of the systems. In addition, we have compared the proposed technique with other clone detection techniques.
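
The two steps named in the abstract, transformation of the source text followed by a token-by-token comparison, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from CCFinder itself: the function names `normalize` and `find_clones`, the tiny keyword set, the placeholder tokens `$id` and `$num`, and the `min_len` threshold are all hypothetical. The abstract does not specify the matching algorithm, so a simple quadratic dynamic-programming comparison is used here as a stand-in.

```python
import re

# Hypothetical, deliberately tiny lexer: identifiers, integer literals, or any
# single non-space character. A real tool would use per-language lexers.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")
KEYWORDS = {"if", "else", "for", "while", "return", "int"}  # illustrative subset


def normalize(source):
    """Transformation step: tokenize, then replace identifiers and numbers
    with placeholder tokens so that renamed copies compare as equal."""
    tokens = []
    for tok in TOKEN_RE.findall(source):
        if tok in KEYWORDS:
            tokens.append(tok)
        elif re.match(r"[A-Za-z_]", tok):
            tokens.append("$id")
        elif tok.isdigit():
            tokens.append("$num")
        else:
            tokens.append(tok)
    return tokens


def find_clones(tokens_a, tokens_b, min_len=5):
    """Comparison step: return (start_a, start_b, length) for every maximal
    run of equal tokens that is at least min_len tokens long."""
    clones = []
    # prev[j] holds the length of the common run ending at a[i-2], b[j-1].
    prev = [0] * (len(tokens_b) + 1)
    for i in range(1, len(tokens_a) + 1):
        cur = [0] * (len(tokens_b) + 1)
        for j in range(1, len(tokens_b) + 1):
            if tokens_a[i - 1] == tokens_b[j - 1]:
                cur[j] = prev[j - 1] + 1
                # The run is maximal when it cannot be extended to the right.
                at_end = i == len(tokens_a) or j == len(tokens_b)
                if (at_end or tokens_a[i] != tokens_b[j]) and cur[j] >= min_len:
                    n = cur[j]
                    clones.append((i - n, j - n, n))
        prev = cur
    return clones


if __name__ == "__main__":
    a = "int sum = 0; for (i = 0; i < n; i++) sum = sum + a[i];"
    b = "int total = 0; for (k = 0; k < m; k++) total = total + b[k];"
    for clone in find_clones(normalize(a), normalize(b), min_len=10):
        print(clone)
```

Because identifiers and literals are replaced by placeholders before comparison, the two loops in the usage example are reported as a clone pair even though every variable name differs. The quadratic comparison is only practical for small inputs; it is chosen here for readability.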
Pages: 654-670
Page count: 17