Software defect prediction with semantic and structural information of codes based on Graph Neural Networks

Cited by: 23
Authors
Zhou, Chunying [1]
He, Peng [1]
Zeng, Cheng [1]
Ma, Ju [1]
Affiliations
[1] Hubei University, School of Computer Science and Information Engineering, Wuhan, People's Republic of China
Funding
National Key Research and Development Program of China;
Keywords
Software defect prediction; Class Dependency Network; Convolutional Neural Network; Graph Convolutional Network;
DOI
10.1016/j.infsof.2022.107057
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Context: Most defect prediction methods rely on a set of traditional, manually designed static code metrics. However, using these hand-crafted features alone is impractical. Some researchers use a Convolutional Neural Network (CNN) to capture latent semantic information from the program's Abstract Syntax Trees (ASTs). In recent years, leveraging the dependency relationships between software modules to construct a software network, and using network embedding models to capture its structural information, has proven helpful in defect prediction. This paper takes semantic and structural information into account simultaneously and proposes a method called CGCN. Objective: This study aims to validate the feasibility and performance of the proposed method in software defect prediction. Method: ASTs and a Class Dependency Network (CDN) are first generated from the source code. For the ASTs, symbolic tokens are extracted and encoded into numerical vectors, which are then used as input to the CNN to capture the semantic information. For the CDN, a Graph Convolutional Network (GCN) is used to learn the structural information of the network automatically. Afterward, the learned semantic and structural information are combined with different weights. Finally, the learned features are concatenated with traditional hand-crafted features to train a classifier for more accurate defect prediction. Results: The proposed method outperforms state-of-the-art defect prediction models for both within-project prediction (including within-version and cross-version) and cross-project prediction on 21 open-source projects. In general, within-version prediction achieves the best performance among the three prediction tasks. Conclusion: The proposed method of combining semantic and structural information can improve the performance of software defect prediction.
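The structural branch and the fusion step described in the abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' CGCN implementation: the function names (`matmul`, `gcn_layer`, `build_features`), the mixing weight `alpha`, the toy dependency graph, and all numeric values are assumptions made for the example. The CNN branch is not implemented; `semantic` is a placeholder for its output. The graph layer uses the commonly cited symmetric-normalized propagation rule with self-loops and treats dependencies as undirected, which the abstract does not confirm.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adj)
    # Add self-loops so every class keeps its own features when aggregating.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalization by the degrees of both endpoints.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    hidden = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in hidden]


def build_features(semantic, structural, metrics, alpha):
    """Weight the two learned views, then append the hand-crafted metrics."""
    rows = []
    for sem, struct, met in zip(semantic, structural, metrics):
        # Assumed fusion form: a convex combination controlled by alpha.
        fused = [alpha * s + (1.0 - alpha) * t for s, t in zip(sem, struct)]
        rows.append(fused + list(met))
    return rows


# Toy class dependency network with three classes: A-B and B-C (hypothetical).
adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
node_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity, to keep the numbers easy to follow

# Placeholder for the CNN output over AST token vectors (not computed here).
semantic = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
# Placeholder hand-crafted metrics per class, e.g. size and complexity counts.
metrics = [[10.0, 2.0], [42.0, 7.0], [5.0, 1.0]]

structural = gcn_layer(adj, node_feats, weight)
features = build_features(semantic, structural, metrics, alpha=0.5)
```

Each row of `features` would then be passed to a classifier of choice. In practice, `weight` would be learned and `alpha` tuned, rather than fixed as in this sketch.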
Pages: 20
Related papers
50 in total
  • [31] Software Defect Prediction Using Augmented Bayesian Networks
    Muthukumaran, K.
    Srinivas, Suri
    Malapati, Aruna
    Neti, Lalita Bhanu Murthy
    PROCEEDINGS OF THE EIGHTH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND PATTERN RECOGNITION (SOCPAR 2016), 2018, 614 : 279 - 293
  • [32] An Approach to Software Defect Prediction Combining Semantic Features and Code Changes
    Tao, Chuanqi
    Wang, Tao
    Guo, Hongjing
    Zhang, Jingxuan
    INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, 2022, 32 (09) : 1345 - 1368
  • [33] Learning Semantic Features for Software Defect Prediction by Code Comments Embedding
    Huo, Xuan
    Yang, Yang
    Li, Ming
    Zhan, De-Chuan
    2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2018 : 1049 - 1054
  • [34] Visualization-Based Software Defect Prediction via Convolutional Neural Network with Global Self-Attention
    Qiu, Shaojian
    Wang, Shaosheng
    Tian, Xuhong
    Huang, Mengyang
    Huang, Qiong
    2022 IEEE 22ND INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY, QRS, 2022 : 189 - 198
  • [36] A link prediction method for Chinese financial event knowledge graph based on graph attention networks and convolutional neural networks
    Cheng, Haitao
    Wang, Ke
    Tan, Xiaoying
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 138
  • [37] Software Defect Prediction Using SMOTE and Artificial Neural Network
    Dipa, Wisnu Arya
    Sunindyo, Wikan Danar
    PROCEEDINGS OF 2021 INTERNATIONAL CONFERENCE ON DATA AND SOFTWARE ENGINEERING (ICODSE): DATA AND SOFTWARE ENGINEERING FOR SUPPORTING SUSTAINABLE DEVELOPMENT GOALS, 2021
  • [38] A new weighted naive Bayes method based on information diffusion for software defect prediction
    Ji, Haijin
    Huang, Song
    Wu, Yaning
    Hui, Zhanwei
    Zheng, Changyou
    SOFTWARE QUALITY JOURNAL, 2019, 27 (03) : 923 - 968
  • [39] Dictionary Learning Based Software Defect Prediction
    Jing, Xiao-Yuan
    Ying, Shi
    Zhang, Zhi-Wu
    Wu, Shan-Shan
    Liu, Jin
    36TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2014), 2014 : 414 - 423
  • [40] Ensemble learning based software defect prediction
    Dong, Xin
    Liang, Yan
    Miyamoto, Shoichiro
    Yamaguchi, Shingo
    JOURNAL OF ENGINEERING RESEARCH, 2023, 11 (04) : 377 - 391