Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Models

Cited by: 220

Authors:

Vaithilingam, Priyan [1]

Zhang, Tianyi [2]

Glassman, Elena L. [1]

Affiliations:

[1] Harvard Univ, Cambridge, MA 02138 USA

[2] Purdue Univ, W Lafayette, IN 47907 USA

Source:

EXTENDED ABSTRACTS OF THE 2022 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, CHI 2022 | 2022

Keywords:

large language model; GitHub Copilot

DOI:

10.1145/3491101.3519665

Chinese Library Classification number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Recent advances in Large Language Models (LLMs) have made automatic code generation possible for real-world programming tasks in general-purpose programming languages such as Python. However, there are few human studies on the usability of these tools and how they fit the programming workflow. In this work, we conducted a within-subjects user study with 24 participants to understand how programmers use and perceive Copilot, an LLM-based code generation tool. We found that, while Copilot did not necessarily improve the task completion time or success rate, most participants preferred to use Copilot in daily programming tasks, since Copilot often provided a useful starting point and saved the effort of searching online. However, participants did face difficulties in understanding, editing, and debugging code snippets generated by Copilot, which significantly hindered their task-solving effectiveness. Finally, we highlighted several promising directions for improving the design of Copilot based on our observations and participants' feedback.

Pages: 7
