Bug localization based on syntactical and semantic information of source code

Cited by: 1
Authors
Yan, Xuefeng [1, 2]
Cheng, Shasha [1]
Guo, Liqin [3]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci Technol, Nanjing 211106, Peoples R China
[2] Collaborat Innovat Ctr Novel Software Technol & In, Nanjing 211106, Peoples R China
[3] Beijing Inst Elect Syst Engn, State Key Lab Intelligent Mfg Syst Technol, Beijing 100854, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Computer bugs; Location awareness; Source coding; Syntactics; Semantics; Software; Natural languages; bug report; abstract syntax tree; code representation; software bug localization;
DOI
10.23919/JSEE.2023.000010
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Existing software bug localization models treat the source file as natural language, which loses the syntactical and structural information of the source file. A bug localization model based on the syntactical and semantic information of source code is proposed. First, the abstract syntax tree (AST) is divided by node category to obtain a sequence of statement trees, and each statement tree is encoded into a vector to capture lexical and syntactical knowledge at the statement level. Second, the source code is transformed into a vector representation by exploiting the sequential naturalness of the statements. As a result, the vanishing and exploding gradient problems caused by a large AST are avoided when an AST is used to represent the source code. Finally, the correlation between bug reports and source files is analyzed from three aspects (syntax, semantics, and text) to locate the buggy code. Experiments show that, compared with other standard models, the proposed model improves bug localization performance as measured by mean reciprocal rank (MRR), mean average precision (MAP), and Top N Rank.
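The three metrics named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the input shape (one ranked list of file paths per bug report, plus the set of files actually fixed), and the choice to divide average precision by the number of truly buggy files are assumptions made for illustration.

```python
from typing import Dict, List, Set


def reciprocal_rank(ranked: List[str], buggy: Set[str]) -> float:
    """Return 1 / rank of the first truly buggy file, or 0.0 if none is ranked."""
    for position, path in enumerate(ranked, start=1):
        if path in buggy:
            return 1.0 / position
    return 0.0


def average_precision(ranked: List[str], buggy: Set[str]) -> float:
    """Average precision@k over the ranks k where a buggy file appears.

    Assumption: the sum is divided by the number of truly buggy files,
    so buggy files missing from the ranking lower the score.
    """
    if not buggy:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, path in enumerate(ranked, start=1):
        if path in buggy:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(buggy)


def top_n_hit(ranked: List[str], buggy: Set[str], n: int) -> bool:
    """True if at least one buggy file is among the first n ranked files."""
    return any(path in buggy for path in ranked[:n])


def evaluate(
    rankings: Dict[str, List[str]],
    truth: Dict[str, Set[str]],
    n: int = 1,
) -> Dict[str, float]:
    """Aggregate MRR, MAP, and Top-N rate over all bug reports in `rankings`."""
    count = len(rankings)
    if count == 0:
        return {"mrr": 0.0, "map": 0.0, "top_n": 0.0}
    rr_total = 0.0
    ap_total = 0.0
    hit_total = 0
    for bug_id, ranked in rankings.items():
        buggy = truth[bug_id]
        rr_total += reciprocal_rank(ranked, buggy)
        ap_total += average_precision(ranked, buggy)
        hit_total += int(top_n_hit(ranked, buggy, n))
    return {
        "mrr": rr_total / count,
        "map": ap_total / count,
        "top_n": hit_total / count,
    }
```

Calling `evaluate(rankings, truth, n=10)` with a model's ranked output would give the Top 10 rate alongside MRR and MAP. Higher is better for all three values.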
Pages: 236-246
Number of pages: 11
Related papers
24 in total
  • [1] Bug Localization with Combination of Deep Learning and Information Retrieval
    An Ngoc Lam
    Anh Tuan Nguyen
    Hoan Anh Nguyen
    Nguyen, Tien N.
    [J]. 2017 IEEE/ACM 25TH INTERNATIONAL CONFERENCE ON PROGRAM COMPREHENSION (ICPC), 2017, : 218 - 229
  • [2] Leveraging textual properties of bug reports to localize relevant source files
    Gharibi, Reza
    Rasekh, Amir Hossein
    Sadreddini, Mohammad Hadi
    Fakhrahmad, Seyed Mostafa
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2018, 54 (06) : 1058 - 1076
  • [3] Hindle A, 2012, PROC INT CONF SOFTW, P837, DOI 10.1109/ICSE.2012.6227135
  • [4] A Memory-Related Vulnerability Detection Approach Based on Vulnerability Features
    Hu, Jinchang
    Chen, Jinfu
    Zhang, Lin
    Liu, Yisong
    Bao, Qihao
    Ackah-Arthur, Hilary
    Zhang, Chi
    [J]. TSINGHUA SCIENCE AND TECHNOLOGY, 2020, 25 (05) : 604 - 613
  • [5] Huo X., 2016, P 25 INT JOINT C ART, P1606
  • [6] Huo X, 2020, AAAI CONF ARTIF INTE, V34, P4223
  • [7] Huo X, 2017, PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, P1909
  • [8] Jayasundara V, 2019, Arxiv, DOI arXiv:1910.12306
  • [9] Just enough semantics: An information theoretic approach for IR-based software bug localization
    Khatiwada, Saket
    Tushev, Miroslav
    Mahmoud, Anas
    [J]. INFORMATION AND SOFTWARE TECHNOLOGY, 2018, 93 : 45 - 57
  • [10] Deep Learning With Customized Abstract Syntax Tree for Bug Localization
    Liang, Hongliang
    Sun, Lu
    Wang, Meilin
    Yang, Yuxing
    [J]. IEEE ACCESS, 2019, 7 : 116309 - 116320