Learning a graph-based classifier for fault localization

Cited by: 0
Authors
Hao ZHONG [1]
Hong MEI [1]
Affiliation
[1] Department of Computer Science and Engineering, Shanghai Jiao Tong University
Funding
National Key Research and Development Program of China;
Keywords
fault classifier; partial code analysis; bug fix analysis;
DOI
Not available
Chinese Library Classification (CLC) number
TP311.53; TP181 [automated reasoning, machine learning];
Subject classification code
081104; 0812; 081202; 0835; 1405;
Abstract
Since software emerged, locating software faults has been intensively researched, culminating in various approaches and tools that have been applied in real development. Despite the success of these developments, programmers still demand improved tools. Meanwhile, some programmers are reluctant to use any tools when locating faults in their development. The state of the art can naturally be improved by learning how programmers locate faults. The rapid development of open-source software has accumulated many bug fixes. A bug fix is a specific type of commit containing a set of buggy files and their corresponding fixed files, which reveal how programmers repair bugs. In principle, an automatic model can learn fault locations from bug fixes, but prior attempts to achieve this vision have been hindered by various technical challenges. For example, most bug fixes are not compilable after being checked out, which prevents most advanced static and dynamic tools from analyzing them. This paper proposes an approach called ClaFa that trains a graph-based fault classifier from bug fixes. ClaFa is built on a recent partial-code tool called Grapa, which enables the complete-code tool WALA to analyze partial programs. Once Grapa has built program dependency graphs from a bug fix, ClaFa compares the graph of the buggy code with the graph of the fixed code, locates the buggy nodes, and extracts various graph features of the buggy and clean nodes. Based on the extracted features, ClaFa trains a classifier that combines AdaBoost and decision tree learning. The trained ClaFa can predict whether a node of a program dependency graph is buggy or clean. We evaluate ClaFa on thousands of buggy files collected from four open-source projects: Aries, Mahout, Derby, and Cassandra. ClaFa achieves f-scores of approximately 80% on all projects.
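The pipeline described in the abstract (compare two dependency graphs, label nodes, extract node features, train a boosted tree classifier) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the graphs are hand-written dictionaries rather than Grapa or WALA output, nodes are matched by their statement text, the only features are in-degree and out-degree, and the classifier is a from-scratch AdaBoost over one-level decision trees (stumps). The abstract does not specify ClaFa's actual matching rule, features, or tree depth, and all function names are hypothetical.

```python
import math
from collections import Counter

def label_nodes(buggy_nodes, fixed_nodes):
    """Label a buggy-graph node +1 (buggy) if its statement text has no
    unmatched counterpart in the fixed graph, else -1 (clean)."""
    remaining = Counter(fixed_nodes.values())
    labels = {}
    for node_id, text in buggy_nodes.items():
        if remaining[text] > 0:
            remaining[text] -= 1
            labels[node_id] = -1
        else:
            labels[node_id] = 1
    return labels

def node_features(nodes, edges):
    """Toy graph features per node: [in-degree, out-degree]."""
    indeg = Counter(dst for _, dst in edges)
    outdeg = Counter(src for src, _ in edges)
    return {n: [indeg[n], outdeg[n]] for n in nodes}

def stump_predict(x, feature, threshold, polarity):
    """One-level decision tree: compare a single feature with a threshold."""
    return polarity if x[feature] > threshold else -polarity

def best_stump(X, y, w):
    """Search all (feature, threshold, polarity) for the lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for pol in (1, -1):
                err = sum(wi for wi, x, yi in zip(w, X, y)
                          if stump_predict(x, f, t, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, t, pol)
    return best

def train_adaboost(X, y, rounds=5):
    """Discrete AdaBoost: reweight samples so later stumps focus on mistakes."""
    w = [1.0 / len(X)] * len(X)
    model = []
    for _ in range(rounds):
        err, f, t, pol = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, f, t, pol))
        w = [wi * math.exp(-alpha * yi * stump_predict(x, f, t, pol))
             for wi, x, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Weighted vote of all stumps: +1 means buggy, -1 means clean."""
    score = sum(a * stump_predict(x, f, t, pol) for a, f, t, pol in model)
    return 1 if score > 0 else -1

# Hypothetical bug fix: the condition "x > 0" was changed to "x >= 0".
buggy_nodes = {0: "x = read()", 1: "if (x > 0)", 2: "y = 1", 3: "print(y)"}
fixed_nodes = {0: "x = read()", 1: "if (x >= 0)", 2: "y = 1", 3: "print(y)"}
edges = {(0, 1), (1, 2), (1, 3), (2, 3)}  # simplified dependency edges

labels = label_nodes(buggy_nodes, fixed_nodes)
feats = node_features(buggy_nodes, edges)
X = [feats[n] for n in buggy_nodes]
y = [labels[n] for n in buggy_nodes]
model = train_adaboost(X, y)
```

In this toy input only node 1 differs between the two graphs, so it is the single buggy training example; calling `predict(model, feats[n])` then classifies each node of the training graph. A realistic setup would pool nodes from many bug fixes and use far richer features than node degrees.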
Pages: 195-216
Page count: 22