bjCnet: A contrastive learning-based framework for software defect prediction

Cited by: 0
Authors
Han, Jiaxuan [1]
Huang, Cheng [1]
Liu, Jiayong [1]
Affiliations
[1] Sichuan Univ, Sch Cyber Sci & Engn, Chengdu 610207, Sichuan, Peoples R China
Keywords
Deep learning; Defect prediction; Transformer; Large language model; Contrastive learning
DOI
10.1016/j.cose.2024.104024
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Deep learning-based defect prediction aims to give practitioners reliable and practical tools for determining whether a region of code is defective. Compared with traditional code features, semantic features that neural networks extract automatically from source code better reflect the semantic differences between code snippets. However, some buggy code differs only slightly from clean code, which makes it hard for deep learning models to tell the two apart and leads to low defect prediction accuracy. In this paper, we propose bjCnet, a software defect prediction framework based on contrastive learning. It fine-tunes a pre-trained Transformer-based code large language model via a supervised contrastive learning network to achieve accurate defect prediction. We evaluate the prediction performance of bjCnet; the results show that its highest accuracy and F1-score are both 0.948, surpassing the state-of-the-art approaches selected for comparison.
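The abstract does not spell out the loss used by the supervised contrastive learning network. As a rough illustration only, the sketch below implements a generic supervised contrastive loss of the kind commonly used for this purpose: embeddings with the same label (buggy or clean) are pulled together and embeddings with different labels are pushed apart. It is not the authors' implementation. The function names, the temperature of 0.1, and the two-dimensional toy embeddings (standing in for the output of a code language model) are all assumptions made for this example.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (hypothetical helper, not from the paper).

    For each anchor, every other sample with the same label is a positive.
    The loss is the negative log-probability of the positives under a softmax
    over all other samples, averaged over anchors that have a positive.
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        # Temperature-scaled similarity of the anchor to every other sample.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        positives = [j for j in logits if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Numerically stable log-sum-exp for the softmax denominator.
        max_logit = max(logits.values())
        log_denom = max_logit + math.log(
            sum(math.exp(s - max_logit) for s in logits.values())
        )
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy embeddings: two near-identical pairs, assumed for illustration.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = supervised_contrastive_loss(toy_embeddings, [1, 1, 0, 0])
loss_mismatched = supervised_contrastive_loss(toy_embeddings, [1, 0, 1, 0])
print(loss_matching < loss_mismatched)
```

When the labels agree with the geometry of the embeddings the loss is close to zero, and when near-identical embeddings carry different labels the loss is large. That second case mirrors the situation the abstract describes, where buggy and clean code differ only slightly and the encoder has to be fine-tuned to separate them.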
Pages: 11