Survey of Open-Source Software Defect Prediction Method

Cited by: 0
Authors
Tian X. [1, 2]
Chang J. [3]
Zhang C. [2]
Rong J. [2, 6]
Wang Z. [3]
Zhang G. [3]
Wang H. [1, 2]
Wu G. [1, 2, 4]
Hu J. [5]
Zhang Y. [1, 2, 6, 7]
Affiliations
[1] School of Cyber Engineering, Xidian University, Xi’an
[2] National Computer Network Intrusion Protection Center, University of Chinese Academy of Sciences, Beijing
[3] School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang
[4] Guangxi Key Laboratory of Cryptography and Information Security, Guilin University of Electronic Technology, Guilin, Guangxi
[5] Graduate School of Information, Production and Systems, Waseda University
[6] College of Cyberspace Security, Hainan University, Haikou
[7] Zhongguancun Laboratory, Beijing
Source
Jisuanji Yanjiu yu Fazhan/Computer Research and Development | 2023 / Vol. 60 / No. 07
Funding
National Natural Science Foundation of China
Keywords
deep learning; machine learning; metric; semantic and syntactic analysis; software defect prediction; vulnerability prediction
DOI
10.7544/issn1000-1239.202221046
Abstract
Open-source software defect prediction reduces software repair costs and improves product quality by finding defects early: it mines data from software history repositories, extracts either defect-related metrics or the syntactic and semantic features of the source code itself, and applies machine learning or deep learning methods. Vulnerability prediction mines software instance repositories, extracts and labels code modules, and predicts whether new code instances contain vulnerabilities, reducing the cost of vulnerability discovery and fixing. We survey and analyze the relevant literature in the field of software defect prediction from 2000 to December 2022. Taking machine learning and deep learning as the starting point, we organize prediction models into two types: those based on software metrics and those based on syntax and semantics. Based on these two types of models, we analyze the differences and connections between software defect prediction and vulnerability prediction. We then analyze in detail six frontier research issues: dataset sources and processing, code vector representation methods, pre-trained model improvement, deep learning model exploration, fine-grained prediction techniques, and model transfer between software defect prediction and vulnerability prediction. Finally, we point out future directions for software defect prediction. © 2023 Science Press. All rights reserved.
Pages: 1467-1488
Page count: 21