A Neural Framework for Retrieval and Summarization of Source Code

Cited by: 62
Authors
Chen, Qingying [1]
Zhou, Minghui [1]
Affiliations
[1] Peking Univ, Sch Elect Engn & Comp Sci, Key Lab High Confidence Software Technol, Minist Educ, Beijing, Peoples R China
Source
PROCEEDINGS OF THE 2018 33RD IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE '18) | 2018
Funding
National Natural Science Foundation of China
Keywords
Code retrieval; code summarization; neural framework; SEARCH
DOI
10.1145/3238147.3240471
Chinese Library Classification
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Code retrieval and summarization are two tasks often employed by software developers to reuse code that spreads over online repositories. In this paper, we present a neural framework that allows bidirectional mapping between source code and natural language to improve these two tasks. Our framework, BVAE, is designed to have two Variational AutoEncoders (VAEs) to model bimodal data: C-VAE for source code and L-VAE for natural language. Both VAEs are trained jointly to reconstruct their input as much as possible with regularization that captures the closeness between the latent variables of code and description. BVAE could learn semantic vector representations for both code and description and generate completely new descriptions for arbitrary code snippets. We design two instance models of BVAE for retrieval and summarization tasks respectively and evaluate their performance on a benchmark which involves two programming languages: C and SQL. Experiments demonstrate BVAE's potential on the two tasks.
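The abstract describes two jointly trained VAEs whose latent variables are pulled together by a regularizer, and retrieval over the learned vectors. The following is a minimal sketch of that idea, not the authors' implementation. The abstract gives no formulas, so everything specific here is an assumption: the standard Gaussian prior, the squared distance between latent means as the closeness term, the weight `alpha`, cosine similarity for ranking, and all function names.

```python
import numpy as np

def kl_to_standard_normal(mu, logvar):
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions.
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar, axis=-1)

def joint_loss(rec_code, rec_text, mu_c, logvar_c, mu_l, logvar_l, alpha=1.0):
    # rec_code / rec_text: per-example reconstruction losses of the code VAE
    # and the language VAE (assumed to be computed elsewhere by the decoders).
    kl_c = kl_to_standard_normal(mu_c, logvar_c)
    kl_l = kl_to_standard_normal(mu_l, logvar_l)
    # Assumed closeness regularizer: squared distance between latent means
    # of a code snippet and its paired description.
    closeness = np.sum((mu_c - mu_l) ** 2, axis=-1)
    return float(np.mean(rec_code + rec_text + kl_c + kl_l + alpha * closeness))

def rank_by_cosine(query_vec, code_vecs):
    # Return indices of code vectors, most similar to the query first.
    q = query_vec / np.linalg.norm(query_vec)
    c = code_vecs / np.linalg.norm(code_vecs, axis=1, keepdims=True)
    return np.argsort(-(c @ q))
```

Under these assumptions, retrieval amounts to encoding a natural-language query into the shared latent space and calling `rank_by_cosine` against precomputed code vectors.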
Pages: 826-831
Number of pages: 6