Reward Design Using Large Language Models for Natural Language Explanation of Reinforcement Learning Agent Actions

Cited by: 0
Authors
Masadome, Shinya [1 ]
Harada, Taku [2 ]
Affiliations
[1] Tokyo Univ Sci, Grad Sch Sci & Technol, Dept Ind & Syst Engn, 2641 Yamazaki, Noda, Chiba 2788510, Japan
[2] Tokyo Univ Sci, Fac Sci & Technol, Dept Ind & Syst Engn, 2641 Yamazaki, Noda, Chiba 2788510, Japan
Keywords
large language model; explainable reinforcement learning; natural language explanation
DOI
10.1002/tee.70005
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic and Communication Technology]
Discipline Codes
0808; 0809
Abstract
Reinforcement learning (RL) has found applications across diverse domains; however, it faces difficulties in formulating reward functions and suffers from low exploration efficiency. Recent studies leveraging large language models (LLMs) have made progress on these issues. However, for RL agents to be practically deployable, their decision-making process must also be made explainable. We introduce a novel RL approach that alleviates the burden of designing reward functions and provides natural language explanations for actions grounded in the agent's decisions. Our method employs two types of agents: a low-level agent responsible for concrete action selection and a high-level agent tasked with setting abstract action goals. The high-level agent is trained with a hybrid reward function that scores its actions by comparing them with those generated by an LLM for each discretized state. The low-level agent, in turn, is trained with a reward function designed using the EUREKA algorithm. We applied the proposed method to the cart-pole problem and demonstrated that it achieves learning convergence while reducing human effort. Moreover, our approach yields coherent natural language explanations of the rationale behind the agent's actions. (c) 2025 The Author(s). IEEJ Transactions on Electrical and Electronic Engineering published by Institute of Electrical Engineers of Japan and Wiley Periodicals LLC.
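Since the abstract describes the hybrid reward only at a high level, the sketch below illustrates one plausible reading of it, not the paper's implementation: the high-level agent proposes an abstract goal, the continuous cart-pole state is binned into a discrete cell, and the reward mixes the environment reward with a bonus for agreeing with the goal an LLM would suggest for that cell. All names (GOALS, BINS, ALPHA, discretize_state, llm_suggest_goal) and the mixing rule are hypothetical.

```python
# Minimal sketch of a hybrid reward for the high-level agent,
# under assumed names and an assumed mixing rule.
import numpy as np

GOALS = ["push_left", "push_right"]      # abstract action goals (assumed)
BINS = np.linspace(-2.4, 2.4, 9)         # shared bin edges per state dimension (assumed)
ALPHA = 0.5                              # env-reward vs. LLM-agreement weight (assumed)

def discretize_state(state):
    """Map a continuous cart-pole state to a coarse discrete cell."""
    return tuple(int(np.digitize(x, BINS)) for x in state)

def llm_suggest_goal(cell):
    """Stand-in for an LLM query returning a goal for this cell.
    A real system would prompt a model once per cell and cache the answer;
    this toy heuristic just lets the sketch run offline."""
    return GOALS[sum(cell) % len(GOALS)]

def hybrid_reward(env_reward, state, chosen_goal):
    """Blend the environment reward with an LLM-agreement bonus."""
    agree = float(chosen_goal == llm_suggest_goal(discretize_state(state)))
    return (1.0 - ALPHA) * env_reward + ALPHA * agree

# Example: score one high-level decision on a cart-pole state.
state = np.array([0.1, -0.3, 0.02, 0.4])  # [x, x_dot, theta, theta_dot]
print(hybrid_reward(1.0, state, "push_right"))
```

Discretizing the state is what makes the LLM comparison tractable: one suggestion can be cached per cell, so the model is queried a bounded number of times rather than at every environment step.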
Pages: 9
Related Papers
14 items in total (10 shown below)
  • [1] Chen V., 2021, Ask your humans: Using human instructions to improve generalization in reinforcement learning. Proceedings of the International Conference on Learning Representations
  • [2] Kwon M., 2023, Reward design with language models. Proceedings of the International Conference on Learning Representations
  • [3] Lu W., Zhao X., Magg S., Gromniak M., Li M., Wermter S., 2023, A closer look at reward decomposition for high-level robotic explanations. Proceedings of the 2023 IEEE International Conference on Development and Learning (ICDL), pp. 429-436
  • [4] Ma Y. J., 2024, EUREKA: Human-level reward design via coding large language models. Proceedings of the International Conference on Learning Representations
  • [5] OpenAI, Models. OpenAI API Documentation
  • [6] Prakash B., 2023, Proceedings of the 37th Conference on Neural Information Processing Systems, p. 1
  • [7] Qing Y. P., 2023, arXiv preprint, arXiv:2211.06665
  • [8] Schulman J., 2017, Proximal policy optimization algorithms. arXiv preprint, arXiv:1707.06347
  • [9] Shota T., 2023, Analysis on task versatility of instruction-based robot learning guided by large language models. Proceedings of the 37th Annual Conference of the Japanese Society for Artificial Intelligence, 2O1GS805
  • [10] Sivamayil K., Rajasekar E., Aljafari B., Nikolovski S., Vairavasundaram S., Vairavasundaram I., 2023, A systematic study on reinforcement learning based applications. Energies, 16(3)