Software Plagiarism Detection: A Graph-based Approach

Cited by: 43
Authors
Chae, Dong-Kyu [1]
Ha, Jiwoon [1]
Kim, Sang-Wook [1]
Kang, BooJoong [2]
Im, Eul Gyu [1]
Affiliations
[1] Hanyang Univ, Dept Comp & Software, Seoul, South Korea
[2] Hanyang Univ, Dept Elect & Comp Engn, Seoul, South Korea
Source
PROCEEDINGS OF THE 22ND ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM'13) | 2013
Funding
National Research Foundation of Singapore;
Keywords
Software Plagiarism; Binary Analysis; Graph; Similarity;
DOI
10.1145/2505515.2507848
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
As plagiarism of software increases rapidly, there is a growing need for software plagiarism detection systems. In this paper, we propose a software plagiarism detection system using an API-labeled control flow graph (A-CFG) that abstracts the functionalities of a program. The A-CFG reflects both the sequence and the frequency of APIs, whereas previous work rarely considers the two together. To compare a pair of A-CFGs in a scalable way, we use random walk with restart (RWR), which computes an importance score for each node in a graph. With RWR, we can generate a single score vector for an A-CFG and compare A-CFGs by comparing their score vectors. Extensive evaluations on a set of Windows applications demonstrate the effectiveness and the scalability of our proposed system compared with existing methods.
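
The comparison idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the uniform restart vector, the restart probability of 0.15, the handling of nodes without successors, the aggregation of node scores per API label, the use of cosine similarity, and the toy graphs are all assumptions made for illustration. All function and variable names are hypothetical.

```python
import math

def rwr_scores(nodes, edges, restart=0.15, iters=100):
    """Random walk with restart on a directed graph.

    Assumptions (not from the paper): uniform restart vector, and a node
    with no outgoing edge spreads its mass uniformly over all nodes.
    """
    n = len(nodes)
    index = {node: i for i, node in enumerate(nodes)}
    successors = [[] for _ in range(n)]
    for src, dst in edges:
        successors[index[src]].append(index[dst])

    scores = [1.0 / n] * n
    for _ in range(iters):
        nxt = [restart / n] * n                  # restart mass
        for i, targets in enumerate(successors):
            share = (1.0 - restart) * scores[i]  # mass that keeps walking
            if targets:
                for j in targets:
                    nxt[j] += share / len(targets)
            else:
                for j in range(n):
                    nxt[j] += share / n
        scores = nxt
    return scores

def api_score_vector(labels, edges, vocabulary):
    """Sum node scores per API label over a shared API vocabulary (assumed scheme)."""
    nodes = list(labels)
    scores = rwr_scores(nodes, edges)
    position = {api: k for k, api in enumerate(vocabulary)}
    vector = [0.0] * len(vocabulary)
    for node, score in zip(nodes, scores):
        vector[position[labels[node]]] += score
    return vector

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Toy graphs: node id -> API label, plus control-flow edges (all hypothetical).
vocabulary = ["CreateFile", "ReadFile", "CloseHandle", "socket", "connect", "send"]
labels_a = {0: "CreateFile", 1: "ReadFile", 2: "ReadFile", 3: "CloseHandle"}
edges_a = [(0, 1), (1, 2), (2, 1), (2, 3)]
labels_b = dict(labels_a)                        # a verbatim copy of program A
edges_b = list(edges_a)
labels_c = {0: "socket", 1: "connect", 2: "send", 3: "CloseHandle"}
edges_c = [(0, 1), (1, 2), (2, 3)]

vec_a = api_score_vector(labels_a, edges_a, vocabulary)
vec_b = api_score_vector(labels_b, edges_b, vocabulary)
vec_c = api_score_vector(labels_c, edges_c, vocabulary)
sim_copy = cosine(vec_a, vec_b)
sim_other = cosine(vec_a, vec_c)
```

In this sketch each graph is reduced to one fixed-length vector, so a pairwise comparison costs time linear in the vocabulary size instead of requiring graph matching. An API that occurs often or sits on frequently visited paths accumulates more score, which is one way a single vector can carry both frequency and ordering information.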
Pages: 1577-1580
Number of pages: 4