ROMEO: A binary vulnerability detection dataset for exploring Juliet through the lens of assembly language

Cited by: 2
Authors
Brust, Clemens-Alexander [1]
Sonnekalb, Tim [1]
Gruner, Bernd [1]
Affiliations
[1] German Aerosp Ctr DLR, Inst Data Sci, Malzerstr 3-5, D-07745 Jena, Germany
Keywords
C++ (programming language) - Classification (of information) - Codes (symbols) - Function evaluation - Program compilers - Statistical tests
DOI
10.1016/j.cose.2023.103165
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Context: Automatic vulnerability detection on C/C++ source code has benefitted from the introduction of machine learning to the field, with many recent publications targeting this combination. In contrast, assembly language or machine code artifacts receive less attention, although there are compelling reasons to study them. They are more representative of what is executed, more easily incorporated in dynamic analysis, and in the case of closed-source code, there is no alternative.
Objective: We evaluate the representative capability of assembly language compared to C/C++ source code for vulnerability detection. Furthermore, we investigate the role of call graph context in detecting function-spanning vulnerabilities. Finally, we verify whether compiling a benchmark dataset compromises an experiment's soundness by inadvertently leaking label information.
Method: We propose ROMEO, a publicly available, reproducible and reusable binary vulnerability detection benchmark dataset derived from the synthetic Juliet test suite. Alongside, we introduce a simple text-based assembly language representation that includes context for function-spanning vulnerability detection and semantics to detect high-level vulnerabilities. It is constructed by disassembling the .text segment of the respective binaries.
Results: We evaluate an x86 assembly language representation of the compiled dataset, combined with an off-the-shelf classifier. It compares favorably to state-of-the-art methods, including those operating on the full C/C++ code. Including context information using the call graph improves detection of function-spanning vulnerabilities. There is no label information leaked during the compilation process.
Conclusion: Performing vulnerability detection on a compiled program instead of the source code is a worthwhile tradeoff. While certain information is lost, e.g., comments and certain identifiers, other valuable information is gained, e.g., about compiler optimizations.
(c) 2023 Elsevier Ltd. All rights reserved.
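The idea of a text-based assembly representation with call graph context can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy instruction lists, the `depth` parameter, and the output layout are all assumptions made for illustration. It assumes each function has already been disassembled into a list of instruction strings, derives direct call targets from `call` instructions, and appends the bodies of callees to the text of the function under inspection.

```python
import re
from typing import Dict, List, Set

# Hypothetical toy input: function name -> disassembled x86 instructions.
# In practice such lines would come from disassembling the .text segment.
FUNCTIONS: Dict[str, List[str]] = {
    "bad": ["push rbp", "mov rbp, rsp", "call bad_sink", "pop rbp", "ret"],
    "bad_sink": ["push rbp", "mov rbp, rsp", "call strcpy", "pop rbp", "ret"],
    "good": ["push rbp", "mov rbp, rsp", "pop rbp", "ret"],
}

# Matches direct calls to a named symbol, e.g. "call bad_sink".
CALL_RE = re.compile(r"^call\s+(\w+)$")


def callees(instructions: List[str]) -> List[str]:
    """Return direct call targets in order of appearance."""
    targets = []
    for ins in instructions:
        match = CALL_RE.match(ins.strip())
        if match:
            targets.append(match.group(1))
    return targets


def with_context(name: str, functions: Dict[str, List[str]], depth: int = 1) -> str:
    """Render `name` followed by callees reachable within `depth` calls."""
    seen: Set[str] = set()
    parts: List[str] = []

    def visit(fn: str, remaining: int) -> None:
        # Skip repeats and external symbols (e.g. libc) with no known body.
        if fn in seen or fn not in functions:
            return
        seen.add(fn)
        parts.append(f"{fn}:\n" + "\n".join(functions[fn]))
        if remaining > 0:
            for target in callees(functions[fn]):
                visit(target, remaining - 1)

    visit(name, depth)
    return "\n\n".join(parts)


if __name__ == "__main__":
    print(with_context("bad", FUNCTIONS, depth=1))
```

With `depth=0` only the function itself is rendered, which corresponds to a representation without call graph context. The resulting string could then be passed to any text classifier.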
Pages: 9