Code-aware fault localization with pre-training and interpretable machine learning

Cited by: 1
Authors
Zhang, Zhuo [1 ]
Li, Ya [2 ]
Yang, Sha [1 ]
Zhang, Zhanjun [3 ]
Lei, Yan [4 ]
Affiliations
[1] Guangzhou Coll Commerce, Sch Informat Technol & Engn, Guangzhou, Peoples R China
[2] Shanghai Jiao Tong Univ, Ningbo Artificial Intelligence Inst, Ningbo, Peoples R China
[3] Natl Univ Def Technol, Coll Comp, Changsha, Peoples R China
[4] Chongqing Univ, Sch Big Data & Software Engn, Chongqing, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
Fault localization; Pre-training; Interpretable machine learning;
DOI
10.1016/j.eswa.2023.121689
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Following the rapid development of deep learning, many studies in the field of fault localization (FL) have used deep learning to analyze statements' coverage information (i.e., executed or not executed) and test case results (i.e., failing or passing), and have shown a strong ability to identify suspicious statements potentially responsible for failures. However, these studies mainly attend to the binary information of test executions and do not incorporate code snippets and their inner relationships into the learning process. Furthermore, how a complex deep learning model for FL reaches a particular decision is not transparent. These drawbacks may limit the effectiveness of FL. Recently, graph-based pre-training techniques have substantially improved the state of the art in a variety of code-related tasks such as natural language code search, clone detection, code translation, and code refinement. Meanwhile, interpretable machine learning tackles the problem of non-transparency and enables learning models to explain or present their behavior to humans in an understandable way. In this paper, our insight is to leverage the promising learning ability of graph-based pre-training techniques to learn a model that incorporates code snippets and their inner relationships into fault localization, and then to use interpretable machine learning to localize faulty statements. Thus, we propose CodeAwareFL, a code-aware fault localization technique with pre-training and interpretable machine learning. Concretely, CodeAwareFL constructs a variety of code snippets by executing test cases. Next, CodeAwareFL uses the code snippets to extract propagation chains, which show how a set of variables interact with each other to cause a failure. After that, a graph-based pre-trained model is customized for fault localization: CodeAwareFL takes the code snippets and their corresponding propagation chains as inputs, with test results as labels, to conduct training. Finally, CodeAwareFL evaluates the suspiciousness of statements with interpretable machine learning techniques. In the experimental study, we choose 12 large programs for comparison. The results show that CodeAwareFL achieves promising results (e.g., 32.43% of faults are ranked within the top 5) and significantly outperforms 12 state-of-the-art baselines.
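To make the pipeline described in the abstract concrete, the sketch below mimics its shape on toy data: per-test statement coverage plus pass/fail labels train a classifier, whose learned weights are then read off as per-statement suspiciousness scores. Everything here is illustrative and assumed, not taken from the paper: the coverage matrix is invented, and scikit-learn's LogisticRegression stands in for both the customized graph-based pre-trained model and the interpretable-ML step; CodeAwareFL's actual implementation additionally feeds in code snippets and propagation chains.

```python
# Minimal sketch of a CodeAwareFL-style pipeline on toy data.
# Assumption (not from the paper): the graph-based pre-trained encoder is
# replaced by plain logistic regression over per-test coverage vectors, and
# "interpretable machine learning" is stood in for by reading the fitted
# coefficients as per-statement suspiciousness scores.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Rows: test cases; columns: statements (1 = executed, 0 = not executed).
coverage = np.array([
    [1, 1, 0, 1],   # test 0
    [1, 0, 1, 1],   # test 1
    [1, 1, 1, 0],   # test 2
    [0, 1, 1, 1],   # test 3
])
# Labels: 1 = failing test, 0 = passing test.
results = np.array([1, 0, 1, 0])

# Train a model mapping coverage to pass/fail (stand-in for fine-tuning the
# pre-trained graph model on code snippets + propagation chains).
model = LogisticRegression().fit(coverage, results)

# Interpret the model: a larger positive weight means executing that
# statement pushes the prediction toward "failing", so rank by weight.
suspiciousness = model.coef_[0]
for stmt in np.argsort(-suspiciousness):
    print(f"statement {stmt}: suspiciousness {suspiciousness[stmt]:+.3f}")
```

In the paper's setting, the classifier's inputs are learned representations of code snippets and propagation chains rather than raw coverage bits, and the attribution step uses interpretable-ML techniques rather than raw coefficients, but the ranking-by-contribution idea is the same.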
Pages: 13