From Code to Natural Language: Type-Aware Sketch-Based Seq2Seq Learning

Cited by: 4
Authors
Deng, Yuhang [1]
Huang, Hao [1]
Chen, Xu [1]
Liu, Zuopeng [2]
Wu, Sai [3]
Xuan, Jifeng [1]
Li, Zongpeng [1]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China
[2] Xiaomi Technol Co Ltd, Beijing, Peoples R China
[3] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou, Peoples R China
Source
DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2020), PT I | 2020 / Vol. 12112
Funding
National Key Research and Development Program of China;
Keywords
Code comment generation; Sketch; Attention mechanism;
DOI
10.1007/978-3-030-59410-7_25
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Code comment generation aims to translate existing source code into natural language explanations, giving developers who are unfamiliar with a piece of code an easy-to-understand description of its functionality. Existing approaches focus on summarizing multiple lines of code in a short text, but often cannot effectively explain a single line of code. In this paper, we propose an asynchronous learning model that learns code semantics and generates a fine-grained natural language explanation for each line of code. Unlike coarse-grained code comment generation, this fine-grained explanation helps developers understand the functionality line by line. The proposed model adopts a type-aware sketch-based sequence-to-sequence learning method, which incorporates the type of source code and a mask mechanism into a Long Short-Term Memory (LSTM) network in the encoding and decoding phases. We empirically compare the proposed model with state-of-the-art approaches on real data sets of Python source code and descriptions. Experimental results demonstrate that our model can outperform existing approaches on commonly used metrics for neural machine translation.
Pages: 352-368
Page count: 17