OSPREY: Recovery of Variable and Data Structure via Probabilistic Analysis for Stripped Binary

Cited by: 21
Authors
Zhang, Zhuo [1]
Ye, Yapeng [1]
You, Wei [2]
Tao, Guanhong [1]
Lee, Wen-chuan [1]
Kwon, Yonghwi [3]
Aafer, Yousra [4]
Zhang, Xiangyu [1]
Affiliations
[1] Purdue Univ, W Lafayette, IN 47907 USA
[2] Renmin Univ China, Beijing, Peoples R China
[3] Univ Virginia, Charlottesville, VA 22903 USA
[4] Univ Waterloo, Waterloo, ON, Canada
Source
2021 IEEE SYMPOSIUM ON SECURITY AND PRIVACY, SP | 2021
Funding
National Natural Science Foundation of China;
Keywords
DOI
10.1109/SP40001.2021.00051
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Recovering variables and data structure information from stripped binary is a prominent challenge in binary program analysis. While various state-of-the-art techniques are effective in specific settings, such effectiveness may not generalize. This is mainly because the problem is inherently uncertain due to the information loss in compilation. Most existing techniques are deterministic and lack a systematic way of handling such uncertainty. We propose a novel probabilistic technique for variable and structure recovery. Random variables are introduced to denote the likelihood of an abstract memory location having various types and structural properties such as being a field of some data structure. These random variables are connected through probabilistic constraints derived through program analysis. Solving these constraints produces the posterior probabilities of the random variables, which essentially denote the recovery results. Our experiments show that our technique substantially outperforms a number of state-of-the-art systems, including IDA, Ghidra, Angr, and Howard. Our case studies demonstrate the recovered information improves binary code hardening and binary decompilation.
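The idea of connecting random variables through probabilistic constraints and reading off posterior probabilities can be illustrated with a toy sketch. This is not the paper's inference procedure. The variable names (`a_is_ptr`, `b_is_ptr`), the two hints, and their confidence values are hypothetical and chosen only for illustration. The sketch enumerates every assignment, weights it by how well it agrees with each soft constraint, and normalizes to get marginals.

```python
from itertools import product


def posterior_marginals(variables, constraints):
    """Brute-force P(var = True) for boolean variables under soft constraints.

    Each constraint is a pair (predicate, p): an assignment that satisfies
    the predicate is weighted by p, any other assignment by 1 - p.
    """
    totals = {v: 0.0 for v in variables}
    normalizer = 0.0
    for values in product([False, True], repeat=len(variables)):
        world = dict(zip(variables, values))
        weight = 1.0
        for predicate, p in constraints:
            weight *= p if predicate(world) else 1.0 - p
        normalizer += weight
        for v in variables:
            if world[v]:
                totals[v] += weight
    return {v: totals[v] / normalizer for v in variables}


# Hypothetical hints about two abstract memory locations a and b.
variables = ["a_is_ptr", "b_is_ptr"]
constraints = [
    # Assumed hint: a is dereferenced, so it is likely a pointer.
    (lambda w: w["a_is_ptr"], 0.8),
    # Assumed hint: a is copied to b, so they likely share a type.
    (lambda w: w["a_is_ptr"] == w["b_is_ptr"], 0.9),
]

marginals = posterior_marginals(variables, constraints)
print(marginals)
```

With these assumed numbers the marginal for `a_is_ptr` is about 0.80 and the marginal for `b_is_ptr` is about 0.74: the evidence about `a` propagates to `b` through the copy constraint, but with reduced confidence. Exhaustive enumeration is exponential in the number of variables, so it only serves to show what the posterior means, not how it would be computed at scale.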
Pages: 813-832
Page count: 20