Function Call Graph Context Encoding for Neural Source Code Summarization

Cited by: 10
Authors
Bansal, Aakash [1, 2]
Eberhart, Zachary [1, 2]
Karas, Zachary [1, 2]
Huang, Yu [1, 2]
McMillan, Collin [1, 2]
Affiliations
[1] Univ Notre Dame, Dept Comp Sci & Engn, Notre Dame, IN 46556 USA
[2] Vanderbilt Univ, Dept Comp Sci, Nashville, TN USA
Keywords
Codes; Source coding; Context modeling; Decoding; Algorithms; Software engineering; Machine translation; Automatic documentation generation; Context-aware models; Neural networks; Source code summarization; Program comprehension; Software maintenance; Mental models; Information
DOI
10.1109/TSE.2023.3279774
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Source code summarization is the task of writing natural language descriptions of source code. The primary use of these descriptions is in documentation for programmers. Automatic generation of these descriptions is a high-value research target because of the time it costs programmers to write them by hand. In recent years, a confluence of software engineering and artificial intelligence research has made inroads into automatic source code summarization through applications of neural models of source code. However, an Achilles' heel of the vast majority of these approaches is that they tend to rely solely on the context provided by the source code being summarized. Empirical studies in program comprehension are quite clear that the information needed to describe code much more often resides in the context surrounding that code, in the form of the function call graph. In this paper, we present a technique for encoding this call graph context for neural models of code summarization. We implement our approach as a supplement to existing approaches and show a statistically significant improvement over them. In a human study with 20 programmers, we show that programmers perceive the generated summaries to be generally as accurate, readable, and concise as human-written summaries.
Pages: 4268-4281
Page count: 14