A hybrid code representation learning approach for predicting method names

Cited by: 5

Authors:

Zhang, Fengyi [1,2]

Chen, Bihuan [1]

Li, Rongfan [1,2]

Peng, Xin [1,2]

Affiliations:

[1] Fudan Univ, Sch Comp Sci, Shanghai, Peoples R China

[2] Fudan Univ, Shanghai Key Lab Data Sci, Shanghai, Peoples R China

Source:

JOURNAL OF SYSTEMS AND SOFTWARE | 2021 / Vol. 180

Funding:

National Natural Science Foundation of China;

Keywords:

Code representation learning; Method name prediction; Deep learning;

DOI:

10.1016/j.jss.2021.111011

Chinese Library Classification number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Program semantic properties such as class names, method names, and variable names and types play an important role in software development and maintenance. Method names are of particular importance because they provide the cornerstone of abstraction for developers to communicate with each other for various purposes (e.g., code review and program comprehension). Existing method name prediction approaches often represent code as lexical tokens or syntactic AST (abstract syntax tree) paths, which makes it difficult for them to learn code semantics and hinders their effectiveness in predicting method names. Initial attempts have been made to represent code as execution traces to capture code semantics, but they suffer from poor scalability in collecting execution traces. In this paper, we propose a hybrid code representation learning approach, named Meth2Seq, to encode a method as a sequence of distributed vectors. Meth2Seq represents a method as (1) a bag of paths on the program dependence graph, (2) a sequence of typed intermediate representation statements, and (3) a natural language comment sentence, to scalably capture code semantics. The learned sequence of vectors of a method is fed to a decoder model to predict method names. Our evaluation with a dataset of 280.5K methods in 67 Java projects has demonstrated that Meth2Seq outperforms two state-of-the-art code representation learning approaches in F1-score by 92.6% and 36.6%, while also outperforming two state-of-the-art method name prediction approaches in F1-score by 85.6% and 178.1%. © 2021 Elsevier Inc. All rights reserved.
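The abstract reports F1-scores but does not say how they are computed. As an illustration only, the sketch below assumes the subtoken-level convention that is common in method name prediction work: a predicted name and a reference name are split into lowercase subtokens, and precision, recall, and F1 are taken over the overlapping subtokens. The function names `split_subtokens` and `subtoken_f1` are hypothetical and do not come from the paper, and the paper's exact metric may differ.

```python
import re
from collections import Counter


def split_subtokens(name):
    """Split a camelCase or snake_case name into lowercase subtokens.

    Assumption: this splitting rule is illustrative, not the paper's.
    """
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def subtoken_f1(predicted, reference):
    """Return (precision, recall, f1) over the subtokens of two names."""
    pred = Counter(split_subtokens(predicted))
    ref = Counter(split_subtokens(reference))
    # Multiset intersection counts each shared subtoken at most as often
    # as it appears in both names.
    overlap = sum((pred & ref).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Predicted "getName" against reference "getFileName":
    # both predicted subtokens are correct, one reference subtoken is missed.
    print(subtoken_f1("getName", "getFileName"))
```

Under this assumed convention, a prediction that omits a subtoken keeps full precision but loses recall, so partially correct names still receive partial credit.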


Pages: 15
