Statistical Translation of English Texts to API Code Templates

Cited by: 12
Authors
Anh Tuan Nguyen [1]
Rigby, Peter C. [2]
Thanh Nguyen [3]
Palani, Dharani [2]
Karanfil, Mark [2]
Nguyen, Tien N. [4]
Affiliations
[1] Axon US Corp, New York, NY 10036 USA
[2] Concordia Univ, Montreal, PQ, Canada
[3] Iowa State Univ, Ames, IA USA
[4] Univ Texas Dallas, Richardson, TX 75083 USA
Source
PROCEEDINGS 2018 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION (ICSME) | 2018
Funding
US National Science Foundation;
DOI
10.1109/ICSME.2018.00029
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
We develop T2API, a context-sensitive, graph-based statistical translation approach that takes as input an English description of a programming task and synthesizes the corresponding API code template for the task. We train T2API to statistically learn the alignments between English and API elements and determine the relevant API elements. The training is done on StackOverflow, a bilingual corpus on which developers discuss programming problems in two types of language: English and programming language. T2API considers both the context of the words in the input query and the context of API elements that often go together in the corpus. The derived API elements with their relevance scores are assembled into an API usage by GRASYN, a novel graph-based API synthesis algorithm that generates a graph representing an API usage from a large code corpus. Importantly, it is capable of generating new API usages from previously seen sub-usages. We curate a test benchmark of 250 real-world StackOverflow posts. Across the benchmark, T2API's synthesized snippets have the correct API elements with a median top-1 precision and recall of 67% and 100%, respectively. Four professional developers and five graduate students judged that 77% of our top synthesized API code templates are useful to solve the problem presented in the StackOverflow posts.
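The reported median top-1 precision and recall can be illustrated with a small calculation. The following is a minimal sketch, assuming that precision and recall are computed per post over sets of API elements and then summarized by the median; the function name, the example API elements, and the three toy posts are hypothetical and are not taken from the paper's benchmark or implementation.

```python
from statistics import median


def precision_recall(predicted, expected):
    """Set-based precision and recall of predicted API elements.

    Assumption: an API element counts as correct if it also appears
    in the reference snippet for the same post.
    """
    predicted, expected = set(predicted), set(expected)
    hits = len(predicted & expected)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical toy benchmark: (top-1 synthesized elements, reference elements).
toy_benchmark = [
    (["File.new", "FileReader.new", "BufferedReader.readLine"],
     ["FileReader.new", "BufferedReader.readLine"]),
    (["String.split", "Integer.parseInt"],
     ["String.split", "Integer.parseInt"]),
    (["List.add", "Map.put"],
     ["List.add", "List.size"]),
]

scores = [precision_recall(pred, ref) for pred, ref in toy_benchmark]
median_precision = median(p for p, _ in scores)
median_recall = median(r for _, r in scores)

print(f"median precision: {median_precision:.2f}")
print(f"median recall:    {median_recall:.2f}")
```

In this toy data the first post includes one extra API element, which lowers its precision but leaves its recall at 1.0. This is the kind of pattern that can produce a high median recall alongside a lower median precision.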
Pages: 194-205
Number of pages: 12