Survey of Source Code Bug Detection Based on Deep Learning

Authors
Deng X. [1, 2]
Ye W. [2]
Xie R. [2, 3]
Zhang S.-K. [2]
Affiliations
[1] School of Software and Microelectronics, Peking University, Beijing
[2] National Engineering Research Center for Software Engineering, Peking University, Beijing
[3] School of Electronics Engineering and Computer Science, Peking University, Beijing
Source
Ruan Jian Xue Bao/Journal of Software | 2023, Vol. 34, No. 2
Keywords
code representation; deep learning; vulnerability detection
DOI
10.13328/j.cnki.jos.006696
Abstract
Source code bug (vulnerability) detection is the process of determining whether program code contains unexpected behaviors. It is widely used in software engineering tasks such as software testing and maintenance, and plays a vital role in assuring software functionality and application security. Traditional vulnerability detection relies on program analysis, which usually demands strong domain knowledge and complex computation rules and suffers from state explosion. As a result, its detection performance is limited, and its false positive and false negative rates leave considerable room for improvement. In recent years, the vigorous growth of the open source community has produced massive amounts of data centered on open source code. In this context, the feature learning capability of deep learning makes it possible to automatically learn semantically rich code representations, offering a new approach to vulnerability detection. This survey collects recent high-quality papers in the field and systematically summarizes and explains current methods from two aspects: vulnerability code datasets and deep learning-based vulnerability detection models. Finally, it summarizes the main challenges facing research in this field and discusses possible directions for future research. © 2023 Chinese Academy of Sciences. All rights reserved.
Pages: 625-654
Page count: 29