Graphuzz: Data-driven Seed Scheduling for Coverage-guided Greybox Fuzzing

Cited by: 0
Authors
Xu, Hang [1]
Chen, Liheng [2]
Gan, Shuitao [3]
Zhang, Chao [3]
Li, Zheming [3]
Ji, Jiangan [4]
Chen, Baojian [2]
Hu, Fan [1]
Affiliations
[1] Ministry of Education, Key Laboratory of Cyberspace Security, Zhengzhou, Henan, People's Republic of China
[2] University of Chinese Academy of Sciences, Beijing, People's Republic of China
[3] Tsinghua University, Beijing, People's Republic of China
[4] Information Engineering University, Zhengzhou, People's Republic of China
Keywords
Fuzzing; seed scheduling; graph neural network
DOI
10.1145/3664603
Chinese Library Classification
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Seed scheduling is a critical step of greybox fuzzing: it assigns different weights to seed test cases during seed selection and significantly impacts the efficiency of fuzzing. Existing seed scheduling strategies rely on manually designed models to estimate the potential of seeds and determine their weights. These models fail to capture the rich information in a seed and its execution, so their estimates of seed potential are suboptimal. In this article, we introduce a new seed scheduling solution, Graphuzz, for coverage-guided greybox fuzzing, which uses deep learning models to estimate the potential of seeds and works in a data-driven way. Specifically, we propose an extended control flow graph called e-CFG to represent the control-flow and data-flow features of a seed's execution, in a form that graph neural networks (GNNs) can process to estimate a seed's potential. We evaluate each seed's code coverage increment and use it as the label to train the GNN model. Further, we propose a self-attention mechanism to enhance the GNN model so that it can capture overlooked features. We have implemented a prototype of Graphuzz based on the baseline fuzzer AFLplusplus. The evaluation results show that our model can estimate the potential of seeds and generalizes robustly to different targets. Furthermore, the evaluation using 12 benchmarks from FuzzBench shows that Graphuzz outperforms AFLplusplus, the state-of-the-art seed scheduling solution K-Scheduler, and other coverage-guided fuzzers in terms of code coverage, and the evaluation using 8 benchmarks from Magma shows that Graphuzz outperforms the baseline fuzzer AFLplusplus and state-of-the-art solutions in terms of bug detection.
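The scheduling idea in the abstract (estimate each seed's potential, then weight seed selection by that estimate) can be illustrated with a minimal Python sketch. This is not Graphuzz's implementation: the GNN over the e-CFG is replaced by a plain number per seed, and every name here (`Seed`, `predicted_gain`, `selection_weights`, `pick_seed`, `coverage_increment`) is hypothetical. The label function assumes coverage is tracked as a set of edge identifiers, which the abstract does not specify.

```python
import random
from dataclasses import dataclass


@dataclass
class Seed:
    name: str
    # Hypothetical stand-in for a model's output: the estimated
    # coverage gain from fuzzing this seed.
    predicted_gain: float


def coverage_increment(edges_before, edges_after):
    """Training-label sketch: count edges covered after but not before."""
    return len(set(edges_after) - set(edges_before))


def selection_weights(seeds, floor=1e-3):
    """Convert predicted gains into normalized selection probabilities."""
    # Clamp at a small floor (an assumption) so no seed is starved entirely.
    raw = [max(s.predicted_gain, floor) for s in seeds]
    total = sum(raw)
    return [r / total for r in raw]


def pick_seed(seeds, rng):
    """Sample one seed with probability proportional to its weight."""
    return rng.choices(seeds, weights=selection_weights(seeds), k=1)[0]


if __name__ == "__main__":
    queue = [Seed("a", 3.0), Seed("b", 1.0)]
    chosen = pick_seed(queue, random.Random(0))
    print(chosen.name, selection_weights(queue))
```

Sampling by weight, rather than always taking the top-scored seed, keeps low-scored seeds reachable when the estimate is wrong; that is a design choice of this sketch, not a claim about the paper.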
Pages: 36
Related papers
50 in total
  • [41] MalFuzz: Coverage-guided fuzzing on deep learning-based malware classification model
    Liu, Yuying
    Yang, Pin
    Jia, Peng
    He, Ziheng
    Luo, Hairu
    PLOS ONE, 2022, 17 (09)
  • [42] Just Fuzz It: Solving Floating-Point Constraints using Coverage-Guided Fuzzing
    Liew, Daniel
    Cadar, Cristian
    Donaldson, Alastair F.
    Stinnett, J. Ryan
    ESEC/FSE'2019: PROCEEDINGS OF THE 2019 27TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING, 2019, : 521 - 532
  • [43] Fuzzing JavaScript Interpreters with Coverage-Guided Reinforcement Learning for LLM-Based Mutation
    Eom, Jueon
    Jeong, Seyeon
    Kwon, Taekyoung
    ISSTA 2024 - Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, : 1656 - 1668
  • [44] CGFuzzer: A Fuzzing Approach Based on Coverage-Guided Generative Adversarial Networks for Industrial IoT Protocols
    Yu, Zhenhua
    Wang, Haolu
    Wang, Dan
    Li, Zhiwu
    Song, Houbing
    IEEE INTERNET OF THINGS JOURNAL, 2022, 9 (21) : 21607 - 21619
  • [45] ARM-AFL: Coverage-Guided Fuzzing Framework for ARM-Based IoT Devices
    Fan, Rong
    Pan, Jianfeng
    Huang, Shaomang
    APPLIED CRYPTOGRAPHY AND NETWORK SECURITY WORKSHOPS, ACNS 2020, 2020, 12418 : 239 - 254
  • [46] signatr: A Data-Driven Fuzzing Tool for R
    Turcotte, Alexi
    Donat-Bouillud, Pierre
    Krikava, Filip
    Vitek, Jan
    PROCEEDINGS OF THE 15TH ACM SIGPLAN INTERNATIONAL CONFERENCE ON SOFTWARE LANGUAGE ENGINEERING, SLE 2022, 2022, : 216 - 221
  • [47] CAGFuzz: Coverage-Guided Adversarial Generative Fuzzing Testing for Image-Based Deep Learning Systems
    Zhang, Pengcheng
    Ren, Bin
    Dong, Hai
    Dai, Qiyin
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2022, 48 (11) : 4630 - 4646
  • [48] Fuzzing Approach of Clustering Analysis-driven in Seed Scheduling
    Zhang W.
    Chen J.-F.
    Cai S.-H.
    Zhang C.
    Liu Y.-S.
    Ruan Jian Xue Bao/Journal of Software, 2024, 35 (07) : 3141 - 3161
  • [49] Data-driven appointment scheduling
    Fiems, Dieter
    PROCEEDINGS OF THE 12TH EAI INTERNATIONAL CONFERENCE ON PERFORMANCE EVALUATION METHODOLOGIES AND TOOLS (VALUETOOLS 2019), 2019, : 3 - 3
  • [50] Data-Driven Batch Scheduling
    Bent, John
    Denehy, Timothy E.
    Livny, Miron
    Arpaci-Dusseau, Andrea C.
    Arpaci-Dusseau, Remzi H.
    DADC 2009: SECOND INTERNATIONAL WORKSHOP ON DATA AWARE DISTRIBUTED COMPUTING, 2009, : 1 - 10