ConstitutionMaker: Interactively Critiquing Large Language Models by Converting Feedback into Principles

Cited: 0
Authors
Petridis, Savvas [1 ]
Wedin, Ben [2 ]
Wexler, James [2 ]
Donsbach, Aaron [3 ]
Pushkarna, Mahima [2 ]
Goyal, Nitesh [1 ]
Cai, Carrie J. [4 ]
Terry, Michael [2 ]
Affiliations
[1] Google Res, New York, NY 10011 USA
[2] Google Res, Cambridge, MA USA
[3] Google Res, Seattle, WA USA
[4] Google Res, Mountain View, CA USA
Source
PROCEEDINGS OF 2024 29TH ANNUAL CONFERENCE ON INTELLIGENT USER INTERFACES, IUI 2024 | 2024
Keywords
Large Language Models; Generative AI; Conversational AI; Interactive Critique; Feedback; Chatbot
DOI
10.1145/3640543.3645144
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Large language model (LLM) prompting is a promising new approach for users to create and customize their own chatbots. However, current methods for steering a chatbot's outputs, such as prompt engineering and fine-tuning, do not support users in converting their natural feedback on the model's outputs to changes in the prompt or model. In this work, we explore how to enable users to interactively refine model outputs through their feedback, by helping them convert their feedback into a set of principles (i.e. a constitution) that dictate the model's behavior. From a formative study, we (1) found that users needed support converting their feedback into principles for the chatbot and (2) classified the different principle types desired by users. Inspired by these findings, we developed ConstitutionMaker, an interactive tool for converting user feedback into principles, to steer LLM-based chatbots. With ConstitutionMaker, users can provide either positive or negative feedback in natural language, select auto-generated feedback, or rewrite the chatbot's response; each mode of feedback automatically generates a principle that is inserted into the chatbot's prompt. In a user study with 14 participants, we compare ConstitutionMaker to an ablated version, where users write their own principles. With ConstitutionMaker, participants felt that their principles could better guide the chatbot, that they could more easily convert their feedback into principles, and that they could write principles more efficiently, with less mental demand. ConstitutionMaker helped users identify ways to improve the chatbot, formulate their intuitive responses to the model into feedback, and convert this feedback into specific and clear principles. Together, these findings inform future tools that support the interactive critiquing of LLM outputs.
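The abstract describes ConstitutionMaker's core loop: each piece of user feedback is converted into a principle, and the accumulated principles (the "constitution") are inserted into the chatbot's prompt. The following is a minimal illustrative sketch of that idea, not the authors' implementation; `feedback_to_principle` is a hypothetical stand-in for the LLM call that rewrites raw feedback into a principle.

```python
def feedback_to_principle(feedback: str) -> str:
    """Hypothetical rewriter: turn raw natural-language feedback into an
    imperative principle. In ConstitutionMaker this step is LLM-driven."""
    text = feedback.strip().rstrip(".")
    return f"Principle: the chatbot should {text}."

def build_prompt(base_prompt: str, constitution: list[str]) -> str:
    """Insert the accumulated principles into the chatbot's prompt."""
    if not constitution:
        return base_prompt
    rules = "\n".join(f"- {p}" for p in constitution)
    return f"{base_prompt}\n\nFollow these principles:\n{rules}"

# One round of the critique loop: feedback -> principle -> updated prompt.
constitution: list[str] = []
constitution.append(feedback_to_principle("use a friendlier tone"))
prompt = build_prompt("You are a travel-planning chatbot.", constitution)
print(prompt)
```

Subsequent rounds of feedback would simply append further principles to `constitution`, so the prompt steadily accumulates the user's critiques.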
Pages: 853-868
Page count: 16