VUDDY: A Scalable Approach for Vulnerable Code Clone Discovery

Cited by: 255
Authors
Kim, Seulbae [1]
Woo, Seunghoon [1]
Lee, Heejo [1]
Oh, Hakjoo [1]
Affiliations
[1] Korea Univ, Dept Comp Sci & Engn, Seoul, South Korea
Source
2017 IEEE SYMPOSIUM ON SECURITY AND PRIVACY (SP) | 2017
Keywords
SOFTWARE; PATTERNS;
DOI
10.1109/SP.2017.62
Chinese Library Classification (CLC) code
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The ecosystem of open source software (OSS) has been growing considerably in size. In addition, code clones (code fragments that are copied and pasted within or between software systems) are also proliferating. Although code cloning may expedite the process of software development, it often critically affects the security of software because vulnerabilities and bugs can easily be propagated through code clones. These vulnerable code clones are increasing in conjunction with the growth of OSS, potentially contaminating many systems. Although researchers have attempted to detect code clones for decades, most of these attempts fail to scale to the size of the ever-growing OSS code base. The lack of scalability prevents software developers from readily managing code clones and associated vulnerabilities. Moreover, most existing clone detection techniques focus overly on merely detecting clones, and this impairs their ability to accurately find "vulnerable" clones. In this paper, we propose VUDDY, an approach for the scalable detection of vulnerable code clones, which is capable of detecting security vulnerabilities in large software programs efficiently and accurately. Its extreme scalability is achieved by leveraging function-level granularity and a length-filtering technique that reduces the number of signature comparisons. This efficient design enables VUDDY to preprocess a billion lines of code in 14 hours and 17 minutes, after which it requires a few seconds to identify code clones. In addition, we designed a security-aware abstraction technique that renders VUDDY resilient to common modifications in cloned code, while preserving the vulnerable conditions even after the abstraction is applied. This extends the scope of VUDDY to identifying variants of known vulnerabilities, with high accuracy.
In this study, we describe its principles and evaluate its efficacy and effectiveness by comparing it with existing mechanisms and presenting the vulnerabilities it detected. VUDDY outperformed four state-of-the-art code clone detection techniques in terms of both scalability and accuracy, and proved its effectiveness by detecting zero-day vulnerabilities in widely used software systems, such as Apache HTTPD and Ubuntu OS Distribution.
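The two scalability ideas named in the abstract, function-level granularity and length filtering, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the normalization rules (dropping C-style comments and whitespace, lowercasing), the choice of MD5 as the digest, and all function names below are assumptions made for illustration, and the security-aware abstraction step is omitted entirely.

```python
import hashlib
import re
from collections import defaultdict


def normalize(body):
    """Assumed normalization: drop C-style comments and all whitespace, lowercase."""
    body = re.sub(r"/\*.*?\*/", "", body, flags=re.DOTALL)  # block comments
    body = re.sub(r"//[^\n]*", "", body)                    # line comments
    return re.sub(r"\s+", "", body).lower()


def fingerprint(body):
    """Return (length, digest) for one function body (function-level granularity)."""
    norm = normalize(body)
    # MD5 is an illustrative choice of digest, not a claim about the paper.
    return len(norm), hashlib.md5(norm.encode("utf-8")).hexdigest()


def build_index(vulnerable_functions):
    """Group digests of known-vulnerable functions by normalized length."""
    index = defaultdict(set)
    for body in vulnerable_functions:
        length, digest = fingerprint(body)
        index[length].add(digest)
    return index


def find_clones(index, target_functions):
    """Length filter first: only digests in the same-length bucket are compared."""
    hits = []
    for name, body in target_functions.items():
        length, digest = fingerprint(body)
        bucket = index.get(length)  # .get() avoids creating empty buckets
        if bucket is not None and digest in bucket:
            hits.append(name)
    return hits
```

Because a target function is only compared against signatures in the bucket for its own length, most candidates are rejected by a single dictionary lookup, which is the intuition behind "reduces the number of signature comparisons". A function that differs only in comments or layout from an indexed one is reported as a clone; any other change is missed in this sketch, since no abstraction is applied.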
Pages: 595-614
Page count: 20
Related papers
40 in total
[1]  
Appleby A., 2008, MURMURHASH 2 0
[2]  
BAKER BS, 1995, SECOND WORKING CONFERENCE ON REVERSE ENGINEERING, PROCEEDINGS, P86, DOI 10.1109/WCRE.1995.514697
[3]   Clone detection using abstract syntax trees [J].
Baxter, ID ;
Yahin, A ;
Moura, L ;
Sant'Anna, M ;
Bier, L .
INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE, PROCEEDINGS, 1998, :368-377
[4]   Comparison and evaluation of clone detection tools [J].
Bellon, Stefan ;
Koschke, Rainer ;
Antoniol, Giuliano ;
Krinke, Jens ;
Merlo, Ettore .
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2007, 33 (09) :577-591
[5]  
Dean TR, 2003, CASCON '03, P266
[6]  
Godfrey MW, 2000, PROC IEEE INT CONF S, P131, DOI 10.1109/ICSM.2000.883030
[7]   ReDeBug: Finding Unpatched Code Clones in Entire OS Distributions [J].
Jang, Jiyong ;
Agrawal, Abeer ;
Brumley, David .
2012 IEEE SYMPOSIUM ON SECURITY AND PRIVACY (SP), 2012, :48-62
[8]  
Jiang LX, 2007, PROC INT CONF SOFTW, P96
[9]  
Jiang Lingxiao, 2007, P THE 6 JOINT M EUR, P55
[10]   CCFinder: A multilinguistic token-based code clone detection system for large scale source code [J].
Kamiya, T ;
Kusumoto, S ;
Inoue, K .
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2002, 28 (07) :654-670