CC2Vec: Distributed Representations of Code Changes

Cited by: 155
Authors
Hoang, Thong [1]
Kang, Hong Jin [1]
Lo, David [1]
Lawall, Julia [2]
Affiliations
[1] Singapore Management Univ, Singapore, Singapore
[2] Sorbonne Univ, Inria, LIP6, Paris, France
Source
2020 ACM/IEEE 42ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2020) | 2020
Funding
National Research Foundation, Singapore;
Keywords
NEURAL-NETWORKS;
DOI
10.1145/3377811.3380361
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Existing work on software patches often uses features specific to a single task. These approaches typically rely on manually identified features, so human effort is required to identify features for each task. In this work, we propose CC2Vec, a neural network model that learns a representation of code changes guided by their accompanying log messages, which represent the semantic intent of the code changes. CC2Vec models the hierarchical structure of a code change with the help of the attention mechanism and uses multiple comparison functions to identify the differences between the removed and added code. To evaluate whether CC2Vec can produce a distributed representation of code changes that is general and useful for multiple tasks on software patches, we use the vectors produced by CC2Vec for three tasks: log message generation, bug-fixing patch identification, and just-in-time defect prediction. In all three tasks, the models using CC2Vec outperform the state-of-the-art techniques.
Pages: 518-529
Page count: 12