SootDiff: Bytecode Comparison Across Different Java Compilers

Cited by: 12
Authors
Dann, Andreas [1]
Hermann, Ben [1]
Bodden, Eric [1, 2]
Affiliations
[1] Paderborn Univ, Heinz Nixdorf Inst, Paderborn, Germany
[2] Fraunhofer IEM, Paderborn, Germany
Source
SOAP'19: PROCEEDINGS OF THE 8TH ACM SIGPLAN INTERNATIONAL WORKSHOP ON STATE OF THE ART IN PROGRAM ANALYSIS | 2019
Keywords
Intermediate Representation; Code Clone Detection; Static Analysis
DOI
10.1145/3315568.3329966
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Different Java compilers and compiler versions, e.g., javac or ecj, produce different bytecode from the same source code. This makes it hard to verify whether the bytecode of an open-source library really matches the provided source code. Moreover, it prevents one from detecting which open-source libraries have been re-compiled and rebundled into a single jar, which is a common way to distribute an application. Such rebundling is problematic because it prevents one from checking whether the jar file contains open-source libraries with known vulnerabilities. To cope with these problems, we propose the tool SOOTDIFF, which uses Soot's intermediate representation Jimple, in combination with code clone detection techniques, to reduce dissimilarities introduced by different compilers and to identify clones. Our results show that SOOTDIFF successfully identifies clones in 102 of 144 cases, whereas bytecode comparison succeeds in only 58 cases.
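
The general idea of comparing code in a normalized intermediate form, rather than as raw compiler output, can be illustrated with a toy sketch. This is not SOOTDIFF's algorithm and does not use Soot or real Jimple: the three-address statements, the `$`-prefixed local naming, and the `normalize` helper are all assumptions made up for illustration. It only shows how canonical renaming of locals can make two superficially different listings compare as equal.

```python
import re

def normalize(statements):
    """Rename '$'-prefixed locals to v0, v1, ... in order of first appearance.

    Hypothetical helper for a toy IR; real Jimple normalization is more involved.
    """
    mapping = {}

    def rename(match):
        name = match.group(0)
        if name not in mapping:
            mapping[name] = f"v{len(mapping)}"
        return mapping[name]

    return [re.sub(r"\$\w+", rename, stmt) for stmt in statements]

# Two made-up listings of the same method, as two compilers might name locals.
from_compiler_a = ["$i0 = $p0 + 1", "$i1 = $i0 * 2", "return $i1"]
from_compiler_b = ["$t3 = $arg + 1", "$t7 = $t3 * 2", "return $t7"]

raw_equal = from_compiler_a == from_compiler_b
normalized_equal = normalize(from_compiler_a) == normalize(from_compiler_b)
print(raw_equal, normalized_equal)
```

In this sketch the raw listings differ textually, while their normalized forms are identical, which is the kind of compiler-introduced dissimilarity a normalized comparison aims to remove.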
Pages: 14-19
Page count: 6