AST-Trans: Code Summarization with Efficient Tree-Structured Attention

Cited by: 44
Authors
Tang, Ze [1 ]
Shen, Xiaoyu [2 ]
Li, Chuanyi [1 ]
Ge, Jidong [1 ]
Huang, Liguo [3 ]
Zhu, Zhelin [1 ]
Luo, Bin [1 ]
Affiliations
[1] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing, Peoples R China
[2] Amazon, Alexa AI, Berlin, Germany
[3] Southern Methodist Univ, Dept Comp Sci, Dallas, TX USA
Source
2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2022) | 2022
Funding
National Natural Science Foundation of China; U.S. National Science Foundation
Keywords
tree-based neural network; source code summarization;
DOI
10.1145/3510003.3510224
CLC Number
TP31 [Computer Software]
Subject Classification Codes
081202; 0835
Abstract
Code summarization aims to generate brief natural language descriptions for source code. The state-of-the-art approaches follow a transformer-based encoder-decoder architecture. Since source code is highly structured and follows strict grammars, its Abstract Syntax Tree (AST) is widely used to encode structural information. However, ASTs are much longer than the corresponding source code. Existing approaches ignore this size constraint and simply feed the whole linearized AST into the encoder. We argue that such a simple process makes it difficult to extract the truly useful dependency relations from the overlong input sequence. It also incurs significant computational overhead, since each node must attend to all other nodes in the AST. To encode the AST more effectively and efficiently, we propose AST-Trans, which exploits two types of node relationships in the AST: ancestor-descendant and sibling relationships. It applies tree-structured attention to dynamically allocate weights to relevant nodes and exclude irrelevant nodes based on these two relationships. We further propose an efficient implementation that supports fast parallel computation of the tree-structured attention. Experimental results on two code summarization datasets show that AST-Trans significantly outperforms the state-of-the-art approaches while being several times more efficient than standard transformers.
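The tree-structured attention sketched in the abstract restricts each AST node's attention to its ancestors/descendants and its siblings within a distance threshold. Below is a minimal NumPy sketch (not the authors' implementation) of how such relative-distance matrices and the resulting masked attention could be computed; the parent-array encoding, the function names, and the threshold `delta` are illustrative assumptions.

```python
import numpy as np

def relation_matrices(parent, delta=2):
    """parent[i] is the index of node i's parent (-1 for the root).
    Returns (anc, sib): anc[i, j] is the tree distance when one node is an
    ancestor of the other, sib[i, j] is the horizontal distance between
    nodes sharing a parent; entries are -1 when the relation does not hold
    or the distance exceeds delta.
    """
    n = len(parent)
    anc = -np.ones((n, n), dtype=int)
    sib = -np.ones((n, n), dtype=int)

    # Ancestor-descendant distances: walk at most delta steps up from each node.
    for j in range(n):
        anc[j, j] = 0
        a, d = j, 0
        while parent[a] != -1 and d < delta:
            a, d = parent[a], d + 1
            anc[a, j] = d
            anc[j, a] = d

    # Sibling distances: nodes sharing the same parent, ordered left to right.
    children = {}
    for j in range(n):
        children.setdefault(parent[j], []).append(j)
    for sibs in children.values():
        for x, i in enumerate(sibs):
            for y, j in enumerate(sibs):
                if abs(x - y) <= delta:
                    sib[i, j] = abs(x - y)
    return anc, sib

def masked_attention(scores, rel):
    """Softmax over pairwise scores, restricted to pairs with a valid relation."""
    masked = np.where(rel >= 0, scores, -np.inf)
    masked = masked - masked.max(axis=-1, keepdims=True)
    w = np.exp(masked)
    return w / w.sum(axis=-1, keepdims=True)

if __name__ == "__main__":
    # Tiny AST: node 0 is the root with children 1, 2, 3; node 2 has child 4.
    parent = [-1, 0, 0, 0, 2]
    anc, sib = relation_matrices(parent, delta=2)
    scores = np.random.rand(len(parent), len(parent))
    print(np.round(masked_attention(scores, anc), 2))
    print(np.round(masked_attention(scores, sib), 2))
```

The paper's efficient implementation additionally exploits the sparsity of these relation matrices to avoid full pairwise attention; the sketch above only illustrates the masking logic.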
Pages: 150-162
Page count: 13