CoCoFuzzing: Testing Neural Code Models With Coverage-Guided Fuzzing

Cited by: 4

Authors:

Wei, Moshi [1]

Huang, Yuchao [2]

Yang, Jinqiu [3]

Wang, Junjie [2]

Wang, Song [1]

Affiliations:

[1] York Univ, Toronto, ON M3J 1P3, Canada

[2] Chinese Acad Sci, Inst Software, Beijing 100045, Peoples R China

[3] Concordia Univ, Montreal, PQ H3G 1M8, Canada

Source:

IEEE TRANSACTIONS ON RELIABILITY | 2023 / Vol. 72 / No. 3

Keywords:

Codes; Testing; Fuzzing; Neurons; Software; Biological neural networks; Task analysis; Code model; deep learning (DL); fuzzy logic; language model; robustness;

DOI:

10.1109/TR.2022.3208239

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Deep learning (DL)-based code processing models have demonstrated good performance for tasks such as method name prediction, program summarization, and comment generation. However, despite these advances, DL models are frequently susceptible to adversarial attacks, which pose a significant threat to the robustness and generalizability of these models by causing them to misclassify unexpected inputs. To address this issue, numerous DL testing approaches have been proposed; however, these approaches primarily target DL applications in domains such as image, audio, and text analysis, and cannot be directly applied to neural models for code because of the unique properties of programs. In this article, we propose a coverage-based fuzzing framework, CoCoFuzzing, for testing DL-based code processing models. In particular, we first propose 10 mutation operators to automatically generate valid, semantics-preserving source code examples as tests, followed by a neuron coverage (NC)-based approach for guiding the generation of tests. The performance of CoCoFuzzing is evaluated using three state-of-the-art neural code models, i.e., NeuralCodeSum, CODE2SEQ, and CODE2VEC. Our experimental results indicate that CoCoFuzzing can generate valid, semantics-preserving source code examples for testing the robustness and generalizability of these models and for increasing NC. Furthermore, these tests can be used for adversarial retraining to improve the performance of neural code models.
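The general idea of NC-guided test selection mentioned in the abstract can be illustrated with a minimal Python sketch. This record does not give the paper's actual mutation operators, activation threshold, or selection criterion, so everything here is an assumption for illustration: the function names, the 0.5 threshold, and `toy_activate`, a hypothetical stand-in for reading neuron activations from a real model. The sketch keeps a mutated program only if it activates at least one neuron that earlier inputs did not.

```python
from typing import Callable, List, Sequence, Set, Tuple


def covered_neurons(activations: Sequence[float], threshold: float = 0.5) -> Set[int]:
    """Indices of neurons whose activation exceeds the (assumed) threshold."""
    return {i for i, a in enumerate(activations) if a > threshold}


def neuron_coverage(covered: Set[int], total: int) -> float:
    """Fraction of all neurons that have been activated at least once."""
    return len(covered) / total if total else 0.0


def select_by_coverage(
    seed: str,
    mutants: List[str],
    activate: Callable[[str], Sequence[float]],
    threshold: float = 0.5,
) -> Tuple[List[str], Set[int]]:
    """Keep only mutants that activate neurons not covered so far."""
    covered = covered_neurons(activate(seed), threshold)
    kept = []
    for mutant in mutants:
        new = covered_neurons(activate(mutant), threshold) - covered
        if new:
            kept.append(mutant)
            covered |= new
    return kept, covered


def toy_activate(program: str) -> List[float]:
    # Hypothetical stand-in for a real model: one "neuron" per keyword,
    # firing when the keyword appears anywhere in the program text.
    keywords = ["if", "while", "for", "return"]
    return [1.0 if k in program else 0.0 for k in keywords]


seed = "int f(int x) { return x; }"
# Hand-written mutants that leave the behavior of f unchanged (illustrative only).
mutants = [
    "int f(int x) { if (false) { } return x; }",     # unreachable branch
    "int f(int x) { return x; ; }",                  # no new keyword
    "int f(int x) { while (false) { } return x; }",  # unreachable loop
]
kept, covered = select_by_coverage(seed, mutants, toy_activate)
print(kept, neuron_coverage(covered, 4))
```

In a real setting, `toy_activate` would be replaced by a hook that records layer activations of the model under test, and the mutants would come from automated, semantics-preserving transformations rather than hand-written strings.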


Pages: 1276-1289

Page count: 14
