Boosting Fuzzer Efficiency: An Information Theoretic Perspective

被引:52
作者
Bohme, Marcel [1 ]
Manes, Valentin J. M. [2 ]
Cha, Sang Kil [2 ]
机构
[1] Monash Univ, Clayton, Vic, Australia
[2] Korea Adv Inst Sci & Technol, CSRC, Daejeon, South Korea
来源
PROCEEDINGS OF THE 28TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '20) | 2020年
基金
澳大利亚研究理事会;
关键词
software testing; fuzzing; efficiency; information theory; entropy;
D O I
10.1145/3368089.3409748
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
In this paper, we take the fundamental perspective of fuzzing as a learning process. Suppose before fuzzing, we know nothing about the behaviors of a program P: What does it do? Executing the first test input, we learn how P behaves for this input. Executing the next input, we either observe the same or discover a new behavior. As such, each execution reveals "some amount" of information about P's behaviors. A classic measure of information is Shannon's entropy. Measuring entropy allows us to quantify how much is learned from each generated test input about the behaviors of the program. Within a probabilistic model of fuzzing, we show how entropy also measures fuzzer efficiency. Specifically, it measures the general rate at which the fuzzer discovers new behaviors. Intuitively, efficient fuzzers maximize information. From this information theoretic perspective, we develop ENTROPIC, an entropy-based power schedule for greybox fuzzing which assigns more energy to seeds that maximize information. We implemented ENTROPIC into the popular greybox fuzzer LIBFUZZER. Our experiments with more than 250 open-source programs (60 million LoC) demonstrate a substantially improved efficiency and confirm our hypothesis that an efficient fuzzer maximizes information. ENTROPIC has been independently evaluated and invited for integration into main-line LIBFUZZER. ENTROPIC now runs on more than 25,000 machines fuzzing hundreds of security-critical software systems simultaneously and continuously.
引用
收藏
页码:678 / 689
页数:12
相关论文
共 45 条
  • [1] Aarya Abhishek, 2019, Open sourcing ClusterFuzz
  • [2] Alshahwan N., 2014, ISSTA 2014, P181, DOI 10.1145/2610384.2610413
  • [3] Exploiting the Saturation Effect in Automatic Random Testing of Android Applications
    Amalfitano, Domenico
    Amatucci, Nicola
    Fasolino, Anna Rita
    Tramontana, Porfirio
    Kowalczyk, Emily
    Memon, Atif M.
    [J]. 2ND ACM INTERNATIONAL CONFERENCE ON MOBILE SOFTWARE ENGINEERING AND SYSTEMS MOBILESOFT 2015, 2015, : 33 - 43
  • [4] A Practical Guide for Using Statistical Tests to Assess Randomized Algorithms in Software Engineering
    Arcuri, Andrea
    Briand, Lionel
    [J]. 2011 33RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE), 2011, : 1 - 10
  • [5] IJON: Exploring Deep State Spaces via Fuzzing
    Aschermann, Cornelius
    Schumilo, Sergej
    Abbasi, Ali
    Holz, Thorsten
    [J]. 2020 IEEE SYMPOSIUM ON SECURITY AND PRIVACY (SP 2020), 2020, : 1597 - 1612
  • [6] A Probabilistic Analysis of the Efficiency of Automated Software Testing
    Boehme, Marcel
    Paul, Soumya
    [J]. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2016, 42 (04) : 345 - 360
  • [7] Bohme M., 2020, P 28 ACM JOINT M EUR, P713
  • [8] Coverage-Based Greybox Fuzzing as Markov Chain
    Bohme, Marcel
    Van-Thuan Pham
    Roychoudhury, Abhik
    [J]. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2019, 45 (05) : 489 - 506
  • [9] STADS: Software Testing as Species Discovery
    Bohme, Marcel
    [J]. ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2018, 27 (02)
  • [10] Observability analysis and active control for airborne SLAM
    Bryson, Mitch
    Sukkarieh, Salah
    [J]. IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS, 2008, 44 (01) : 261 - 280