CoCoFuzzing: Testing Neural Code Models With Coverage-Guided Fuzzing

Cited by: 4
Authors
Wei, Moshi [1]
Huang, Yuchao [2]
Yang, Jinqiu [3]
Wang, Junjie [2]
Wang, Song [1]
Affiliations
[1] York Univ, Toronto, ON M3J 1P3, Canada
[2] Chinese Acad Sci, Inst Software, Beijing 100045, Peoples R China
[3] Concordia Univ, Montreal, PQ H3G 1M8, Canada
Keywords
Codes; Testing; Fuzzing; Neurons; Software; Biological neural networks; Task analysis; Code model; deep learning (DL); fuzzy logic; language model; robustness
DOI
10.1109/TR.2022.3208239
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Deep learning (DL)-based code processing models have demonstrated good performance on tasks such as method name prediction, program summarization, and comment generation. Despite these advances, DL models are frequently susceptible to adversarial attacks, which threaten their robustness and generalizability by causing them to misclassify unexpected inputs. Numerous DL testing approaches have been proposed to address this issue; however, they primarily target DL applications in domains such as image, audio, and text analysis, and cannot be directly applied to neural models for code because of the unique properties of programs. In this article, we propose a coverage-based fuzzing framework, CoCoFuzzing, for testing DL-based code processing models. In particular, we first propose 10 mutation operators that automatically generate valid, semantics-preserving source code examples as tests, followed by a neuron coverage (NC)-based approach for guiding test generation. We evaluate CoCoFuzzing on three state-of-the-art neural code models: NeuralCodeSum, CODE2SEQ, and CODE2VEC. Our experimental results indicate that CoCoFuzzing can generate valid, semantics-preserving source code examples for testing the robustness and generalizability of these models and for improving NC. Furthermore, these tests can be used for adversarial retraining to improve the performance of neural code models.
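The coverage-guided loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list the 10 mutation operators or define how NC is measured, so the dead-declaration operator, the activation threshold, and the stand-in model (`toy_model_activations`) below are all assumptions chosen for illustration. Neuron coverage is taken here in its common form, the set of neurons whose activation exceeds a threshold, and a mutant is kept only if it activates a neuron not yet covered.

```python
import random


def neuron_coverage(activations, threshold=0.5):
    """Return the indices of neurons whose activation exceeds the threshold.

    Assumption: a simple threshold-based notion of neuron coverage.
    """
    return {i for i, a in enumerate(activations) if a > threshold}


def insert_dead_declaration(code, rng):
    """Hypothetical semantics-preserving mutation for a Java-like method:
    declare an unused local variable right after the first opening brace."""
    name = f"unused_{rng.randrange(10_000)}"
    head, brace, body = code.partition("{")
    if not brace:  # no method body found, leave the code unchanged
        return code
    return f"{head}{{ int {name} = 0;{body}"


def toy_model_activations(code, n_neurons=16):
    """Stand-in for a real neural code model (assumption): derives
    deterministic pseudo-activations from the program text."""
    seed = sum(ord(c) * (i + 1) for i, c in enumerate(code))
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_neurons)]


def fuzz(seed_code, activations_fn, iterations=20, rng_seed=0):
    """Keep only the mutants that activate previously uncovered neurons."""
    rng = random.Random(rng_seed)
    covered = neuron_coverage(activations_fn(seed_code))
    kept = []
    current = seed_code
    for _ in range(iterations):
        mutant = insert_dead_declaration(current, rng)
        newly_covered = neuron_coverage(activations_fn(mutant)) - covered
        if newly_covered:
            covered |= newly_covered
            kept.append(mutant)
            current = mutant  # continue mutating from the useful mutant
    return kept, covered


if __name__ == "__main__":
    tests, covered = fuzz("int f() { return 1; }", toy_model_activations)
    print(f"kept {len(tests)} mutants, {len(covered)} neurons covered")
```

In a real setting, `toy_model_activations` would be replaced by a function that runs the model under test and returns its recorded neuron activations, and the kept mutants would serve as robustness tests or as additional data for adversarial retraining.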
Pages: 1276-1289
Number of pages: 14