How Effective Are They? Exploring Large Language Model Based Fuzz Driver Generation

Cited by: 0
Authors
Zhang, Cen [1 ]
Zheng, Yaowen [1 ]
Bai, Mingqiang [2 ,3 ]
Li, Yeting [2 ,3 ]
Ma, Wei [1 ]
Xie, Xiaofei [4 ]
Li, Yuekang [5 ]
Sun, Limin [2 ,3 ]
Liu, Yang [1 ]
Affiliations
[1] Nanyang Technol Univ, Singapore, Singapore
[2] Chinese Acad Sci, IIE, Beijing, Peoples R China
[3] UCAS, Sch Cyber Secur, Beijing, Peoples R China
[4] Singapore Management Univ, Singapore, Singapore
[5] Univ New South Wales, Sydney, NSW, Australia
Source
PROCEEDINGS OF THE 33RD ACM SIGSOFT INTERNATIONAL SYMPOSIUM ON SOFTWARE TESTING AND ANALYSIS, ISSTA 2024 | 2024
Funding
National Research Foundation, Singapore
Keywords
Fuzz Driver Generation; Fuzz Testing; Large Language Model;
DOI
10.1145/3650212.3680355
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Fuzz drivers are essential for library API fuzzing. However, automatically generating fuzz drivers is a complex task, as it demands the creation of high-quality, correct, and robust API usage code. An LLM-based (Large Language Model) approach to generating fuzz drivers is a promising area of research. Unlike traditional program-analysis-based generators, this text-based approach is more generalized and capable of harnessing a variety of API usage information, resulting in code that is friendly to human readers. However, there is still a lack of understanding of the fundamental issues in this direction, such as its effectiveness and potential challenges. To bridge this gap, we conducted the first in-depth study targeting the important issues of using LLMs to generate effective fuzz drivers. Our study features a curated dataset with 86 fuzz driver generation questions from 30 widely used C projects. Six prompting strategies are designed and tested across five state-of-the-art LLMs with five different temperature settings. In total, our study evaluated 736,430 generated fuzz drivers, with 0.85 billion token costs ($8,000+ charged tokens). Additionally, we compared the LLM-generated drivers against those utilized in industry, conducting extensive fuzzing experiments (3.75 CPU-years). Our study uncovered that: 1) While LLM-based fuzz driver generation is a promising direction, it still encounters several obstacles towards practical application; 2) LLMs face difficulties in generating effective fuzz drivers for APIs with intricate specifics. Three featured design choices of prompt strategies can be beneficial: issuing repeat queries, querying with examples, and employing an iterative querying process; 3) While LLM-generated drivers can yield fuzzing outcomes that are on par with those used in industry, there are substantial opportunities for enhancement, such as extending contained API usage or integrating semantic oracles to facilitate logical bug detection. Our insights have been implemented to improve the OSS-Fuzz-Gen project, facilitating practical fuzz driver generation in industry.
Pages: 1223-1235
Page count: 13