Software Vulnerability Detection Method Based on Code Property Graph and Bi-GRU

Cited by: 0
Authors
Xiao T. [1]
Guan J. [1]
Jian S. [1]
Ren Y. [1]
Zhang J. [1]
Li B. [1]
Affiliation
[1] College of Computer Science and Technology, National University of Defense Technology, Changsha
Source
Jisuanji Yanjiu yu Fazhan/Computer Research and Development | 2021, Vol. 58, No. 08
Funding
National Natural Science Foundation of China;
Keywords
Bi-GRU; Code property graph; Code representation; Machine learning; Vulnerability detection;
DOI
10.7544/issn1000-1239.2021.20210297
Abstract
As software grows larger and more complex, vulnerable code takes increasingly diverse forms. Traditional vulnerability detection methods cannot keep up with this diversity because they depend heavily on human involvement and are weak at detecting unknown vulnerabilities. To improve the detection of unknown vulnerabilities, many machine learning methods have been applied to software vulnerability detection. However, because existing code representations lose much of the syntactic and semantic information, their false positive and false negative rates remain high. To address this issue, a software vulnerability detection method based on the code property graph and Bi-GRU is proposed. The method extracts the abstract syntax tree sequence and the control flow graph sequence from the code property graph of a function and uses them as the function's representation, which reduces the information lost in code representation. The method also uses Bi-GRU to build the feature extraction model, which improves feature extraction for vulnerable code. Experimental results show that, compared with the representation method based on the abstract syntax tree, this method improves accuracy and recall by 35% and 22%, respectively. It also improves vulnerability detection on a real-world data set that mixes source code from multiple software projects, and effectively reduces the false positive and false negative rates. © 2021, Science Press. All rights reserved.
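The pipeline outlined in the abstract (token sequences fed to a bidirectional GRU that produces a feature vector) can be illustrated with a minimal pure-Python sketch. Everything concrete here is an assumption for illustration and is not taken from the paper: the token names in `ast_seq` and `cfg_seq`, the choice to simply concatenate the two sequences, the embedding and hidden sizes, the random untrained weights, and the single logistic output unit. It shows only the forward pass, not training.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """One GRU direction with randomly initialised (untrained) weights."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One (W, U, b) triple each for update gate z, reset gate r, candidate n.
        self.params = {
            gate: (rand_matrix(hidden_size, input_size, rng),
                   rand_matrix(hidden_size, hidden_size, rng),
                   [0.0] * hidden_size)
            for gate in ("z", "r", "n")
        }

    def _affine(self, gate, x, h):
        W, U, b = self.params[gate]
        return [a + c + d for a, c, d in zip(matvec(W, x), matvec(U, h), b)]

    def step(self, x, h):
        z = [sigmoid(v) for v in self._affine("z", x, h)]
        r = [sigmoid(v) for v in self._affine("r", x, h)]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(v) for v in self._affine("n", x, reset_h)]
        # The new state interpolates between the candidate and the old state.
        return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

    def run(self, seq):
        h = [0.0] * self.hidden_size
        for x in seq:
            h = self.step(x, h)
        return h


def bigru_encode(seq, fwd, bwd):
    # Concatenate the final states of the forward and backward passes.
    return fwd.run(seq) + bwd.run(list(reversed(seq)))


rng = random.Random(0)

# Hypothetical token sequences standing in for the AST and CFG sequences.
ast_seq = ["FunctionDef", "Parameter", "CallExpression", "Identifier"]
cfg_seq = ["ENTRY", "CallExpression", "Condition", "EXIT"]
tokens = ast_seq + cfg_seq  # assumption: plain concatenation

vocab = {t: i for i, t in enumerate(sorted(set(tokens)))}
emb_dim, hidden = 8, 6  # illustrative sizes
table = rand_matrix(len(vocab), emb_dim, rng)  # random embedding table
embedded = [table[vocab[t]] for t in tokens]

fwd = GRUCell(emb_dim, hidden, rng)
bwd = GRUCell(emb_dim, hidden, rng)
features = bigru_encode(embedded, fwd, bwd)

# Assumed logistic output unit turning the features into a score in (0, 1).
w_out = [rng.uniform(-0.1, 0.1) for _ in range(2 * hidden)]
score = sigmoid(sum(w * f for w, f in zip(w_out, features)))
print(len(features), score)
```

Because the weights are random, the score carries no meaning; the point is only that reading the sequence in both directions yields a feature vector twice the hidden size, which a classifier can then consume.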
Pages: 1668-1685
Page count: 17