Input-Output Example-Guided Data Deobfuscation on Binary

Cited by: 3
Authors
Zhao, Yujie [1, 2]
Tang, Zhanyong [2]
Ye, Guixin [2]
Gong, Xiaoqing [2]
Fang, Dingyi [2]
Affiliations
[1] Shaanxi Normal Univ, Dept Informationizat Construct & Management, Xian 710119, Peoples R China
[2] Northwest Univ, Sch Comp Sci & Technol, Xian 710127, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1155/2021/4646048
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Malicious software commonly uses data obfuscation to evade detection and hinder reverse engineering. To analyze such malware, the obfuscation must first be removed so that the program is restored to a more readily understandable form (deobfuscation). Deobfuscation based on program synthesis is attractive because it treats the target program as a black box: the task becomes finding the shortest instruction sequence that reproduces the input-output behavior of the target program. Existing work has two limitations: it assumes that the obfuscated code snippets in the target program are already known, and it relies on a stochastic search algorithm, which results in low efficiency. In this paper, we propose a fine-grained, machine-learning-based obfuscation detection method for locating obfuscated code snippets, and we combine program synthesis with a heuristic search algorithm, Nested Monte Carlo Search. We applied a prototype implementation of these ideas to data obfuscation produced by different tools, including OLLVM and Tigress. The experimental results suggest that the approach is highly effective at locating and deobfuscating binaries with data obfuscation, with an accuracy of at least 90.34%. Compared with the state-of-the-art deobfuscation technique, our approach improves efficiency by 75% and increases the success rate by 5%.
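The black-box formulation in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it swaps the paper's Nested Monte Carlo Search for plain exhaustive enumeration in order of expression size, and the `obfuscated` function, the operator set `OPS`, and the helper names are all hypothetical choices made for illustration. The idea shown is only that input-output samples of an obfuscated snippet can guide a search for the shortest expression with the same behavior.

```python
import itertools
import random

MASK = 0xFFFFFFFF  # model 32-bit machine arithmetic


def obfuscated(x, y):
    # Hypothetical obfuscated snippet: a mixed Boolean-arithmetic
    # encoding that computes x + y (mod 2**32). Treated as a black box.
    return ((x ^ y) + 2 * (x & y)) & MASK


# Assumed candidate instruction set for the synthesizer.
OPS = {
    "+": lambda a, b: (a + b) & MASK,
    "-": lambda a, b: (a - b) & MASK,
    "&": lambda a, b: a & b,
    "|": lambda a, b: a | b,
    "^": lambda a, b: a ^ b,
}


def expressions(size):
    """Yield (text, function) pairs for every expression with `size` leaves."""
    if size == 1:
        yield "x", lambda x, y: x
        yield "y", lambda x, y: y
        return
    for left_size in range(1, size):
        lefts = list(expressions(left_size))
        rights = list(expressions(size - left_size))
        for (lt, lf), (rt, rf) in itertools.product(lefts, rights):
            for sym, op in OPS.items():
                # Default arguments bind the current sub-expressions.
                yield (
                    f"({lt} {sym} {rt})",
                    lambda x, y, lf=lf, rf=rf, op=op: op(lf(x, y), rf(x, y)),
                )


def synthesize(target, samples, max_size=3):
    """Return the first (hence smallest) expression matching all I/O samples."""
    outputs = [target(x, y) for x, y in samples]
    for size in range(1, max_size + 1):
        for text, fn in expressions(size):
            if all(fn(x, y) == out for (x, y), out in zip(samples, outputs)):
                return text
    return None


rng = random.Random(0)
samples = [(rng.getrandbits(32), rng.getrandbits(32)) for _ in range(20)]
result = synthesize(obfuscated, samples)
print(result)
```

Exhaustive enumeration grows exponentially with expression size, which is the kind of cost that motivates a guided search such as the one the abstract describes. A match on finitely many samples also only suggests equivalence; it does not prove it.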
Pages: 16