Compiler Fuzzing Test Case Generation with Feed-forward Neural Network

Cited by: 0
Authors
Xu H.-R. [1]
Wang Y.-J. [1]
Huang Z.-J. [2]
Xie P.-D. [1]
Fan S.-H. [1]
Affiliations
[1] College of Computer Science and Technology, National University of Defense Technology, Changsha
[2] Institute of System Engineering, Academy of Military Sciences, Beijing
Source
Ruan Jian Xue Bao/Journal of Software | 2022 / Vol. 33 / No. 06
Keywords
Abstract syntax network; Compiler fuzzing; Deep learning; Feed-forward neural network; Software defect
DOI
10.13328/j.cnki.jos.006565
CLC number
Subject classification code
Abstract
Compiler fuzzing is a commonly used technique for testing the functionality and safety of compilers. A fuzzer must produce grammatically correct test cases in order to exercise the deeper parts of the compiler. Recently, deep learning methods based on recurrent neural networks have been introduced into the test case generation process. To address the problems of insufficient grammatical accuracy and low generation efficiency in test case generation, a method for generating compiler fuzzing test cases based on feed-forward neural networks is proposed, and the prototype tool FAIR is designed and implemented. Unlike methods that learn from token sequences, FAIR extracts code fragments from the abstract syntax tree and uses a self-attention-based feed-forward neural network to capture the grammatical associations between code fragments. After learning a generative model of the programming language, FAIR automatically produces diverse test cases. Experimental results show that FAIR outperforms its competitors in both the grammatical accuracy of the generated test cases and generation efficiency. The proposed method significantly improves the ability to detect compiler defects and has detected 20 software defects in GCC and LLVM. In addition, the method is readily portable: FAIR-JS, a simple port, has detected 2 defects in a JavaScript engine. © Copyright 2022, Institute of Software, the Chinese Academy of Sciences. All rights reserved.
Pages: 1996-2011
Page count: 15