AutoPandas: Neural-Backed Generators for Program Synthesis

Cited by: 46
Authors
Bavishi, Rohan [1]
Lemieux, Caroline [1]
Fox, Roy [2]
Sen, Koushik [1]
Stoica, Ion [1]
Affiliations
[1] Univ Calif Berkeley, Comp Sci Div, Berkeley, CA 94720 USA
[2] Univ Calif Irvine, Dept Comp Sci, Irvine, CA 92717 USA
Source
PROCEEDINGS OF THE ACM ON PROGRAMMING LANGUAGES-PACMPL | 2019 / Vol. 3 / Issue OOPSLA
Funding
National Science Foundation (USA);
Keywords
pandas; python; program synthesis; programming-by-example; generators; graph neural networks;
DOI
10.1145/3360594
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
Developers nowadays have to contend with a growing number of APIs. While in the long term they are very useful to developers, many modern APIs have an incredibly steep learning curve, due to their hundreds of functions handling many arguments, obscure documentation, and frequently changing semantics. For APIs that perform data transformations, novices can often provide an I/O example demonstrating the desired transformation, but may be stuck on how to translate it to the API. A programming-by-example synthesis engine that takes such I/O examples and directly produces programs in the target API could help such novices. Such an engine presents unique challenges due to the breadth of real-world APIs, and the often-complex constraints over function arguments. We present a generator-based synthesis approach to contend with these problems. This approach uses a program candidate generator, which encodes basic constraints on the space of programs. We introduce neural-backed operators which can be seamlessly integrated into the program generator. To improve the efficiency of the search, we simply use these operators at non-deterministic decision points, instead of relying on domain-specific heuristics. We implement this technique for the Python pandas library in AUTOPANDAS. AUTOPANDAS supports 119 pandas dataframe transformation functions. We evaluate AUTOPANDAS on 26 real-world benchmarks and find it solves 17 of them.
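The generator-based search described in the abstract can be illustrated with a small sketch. The code below is not the AUTOPANDAS implementation and does not use pandas: it is a toy, standard-library-only example over Python lists, assuming a two-operation search space. The names `OPS`, `rank`, `candidates`, and `synthesize` are hypothetical, and `rank` is a hand-written heuristic standing in for a learned model. It shows only the general idea of a candidate generator whose non-deterministic decision points are ordered by a pluggable ranking operator.

```python
# Toy search space (an assumption for illustration): two list operations,
# each taking one boolean argument.
OPS = {
    "sort": lambda xs, reverse: sorted(xs, reverse=reverse),
    "dedupe": lambda xs, reverse: sorted(set(xs), reverse=reverse),
}


def rank(choices, inp, out, point):
    """Hypothetical stand-in for a neural-backed operator.

    Orders the choices at one decision point using the I/O example.
    A real system would query a trained model here instead.
    """
    if point == "op":
        # If the output is shorter than the input, try "dedupe" first.
        prefer = "dedupe" if len(out) < len(inp) else "sort"
        return sorted(choices, key=lambda c: c != prefer)
    return list(choices)


def uniform(choices, inp, out, point):
    """Baseline ranker: keep the choices in their declared order."""
    return list(choices)


def candidates(inp, out, ranker):
    """Yield (op, reverse) programs; the ranker orders each decision point."""
    for op in ranker(list(OPS), inp, out, "op"):
        for reverse in ranker([False, True], inp, out, "reverse"):
            yield op, reverse


def synthesize(inp, out, ranker=rank):
    """Return the first program matching the example and the tries used."""
    tries = 0
    for op, reverse in candidates(inp, out, ranker):
        tries += 1
        if OPS[op](inp, reverse) == out:
            return (op, reverse), tries
    return None, tries


if __name__ == "__main__":
    example_in, example_out = [3, 1, 3, 2], [3, 2, 1]
    print(synthesize(example_in, example_out, rank))
    print(synthesize(example_in, example_out, uniform))
```

On this example both rankers reach the same program, but the example-aware ranker checks fewer candidates first, which is the efficiency effect the abstract attributes to using learned operators at decision points.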
Pages: 27