Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Models

Cited by: 220
Authors
Vaithilingam, Priyan [1]
Zhang, Tianyi [2]
Glassman, Elena L. [1]
Affiliations
[1] Harvard Univ, Cambridge, MA 02138 USA
[2] Purdue Univ, W Lafayette, IN 47907 USA
Source
EXTENDED ABSTRACTS OF THE 2022 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, CHI 2022 | 2022
Keywords
large language model; GitHub Copilot
DOI
10.1145/3491101.3519665
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Recent advances in Large Language Models (LLMs) have made automatic code generation possible for real-world programming tasks in general-purpose programming languages such as Python. However, there are few human studies on the usability of these tools and how they fit the programming workflow. In this work, we conducted a within-subjects user study with 24 participants to understand how programmers use and perceive Copilot, an LLM-based code generation tool. We found that, while Copilot did not necessarily improve the task completion time or success rate, most participants preferred to use Copilot in daily programming tasks, since Copilot often provided a useful starting point and saved the effort of searching online. However, participants did face difficulties in understanding, editing, and debugging code snippets generated by Copilot, which significantly hindered their task-solving effectiveness. Finally, we highlighted several promising directions for improving the design of Copilot based on our observations and participants' feedback.
Pages: 7