Optimizing Agent Behavior in the MiniGrid Environment Using Reinforcement Learning Based on Large Language Models

Times cited: 0

Authors:

Park, Byeong-Ju [1]

Yong, Sung-Jung [1]

Hwang, Hyun-Seo [1]

Moon, Il-Young [1]

Affiliations:

[1] Korea Univ Technol & Educ, Sch Comp Sci & Engn, Cheonan 31253, South Korea

Source:

APPLIED SCIENCES-BASEL | 2025 / Vol. 15 / Issue 04

Funding:

National Research Foundation, Singapore;

Keywords:

reinforcement learning; artificial intelligence; large language models; natural language processing; agent behavior optimization;

DOI:

10.3390/app15041860

Chinese Library Classification (CLC) number:

O6 [Chemistry];

Discipline classification code:

0703;

Abstract:

Reinforcement learning is one of the most prominent research areas in the field of artificial intelligence, playing a crucial role in developing agents that autonomously make decisions in complex environments. This study proposes a method to optimize agent behavior in the MiniGrid-Empty-5x5-v0 environment using large language models (LLMs). By leveraging the natural language processing capabilities of LLMs to interpret environmental states and select appropriate actions, this research explores an approach that differs from traditional reinforcement learning methods. Experimental results confirm that LLM-based agents can effectively achieve their goals, and it is anticipated that maximizing the synergy between LLMs and reinforcement learning will contribute to the development of more intelligent and adaptable AI systems.
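
The loop the abstract describes (render the environment state as text, let a language model pick an action, apply it) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the grid is a toy stand-in for the 3x3 walkable interior of a 5x5 empty grid rather than the MiniGrid library itself, the prompt wording is invented, and `stub_llm` is a hand-written rule standing in for a real model call. It is not the authors' implementation.

```python
from dataclasses import dataclass

# Toy stand-in for the walkable 3x3 interior of a 5x5 empty grid.
# Names, coordinates, and prompt wording are illustrative assumptions.
DIRECTIONS = ["east", "south", "west", "north"]
DIR_VECTORS = [(1, 0), (0, 1), (-1, 0), (0, -1)]
ACTIONS = ("turn left", "turn right", "move forward")
GOAL = (3, 3)


@dataclass(frozen=True)
class AgentState:
    x: int = 1
    y: int = 1
    direction: int = 0  # index into DIRECTIONS


def describe_state(state):
    """Render the state as the natural-language text a model would read."""
    return (
        f"You are at ({state.x}, {state.y}) facing {DIRECTIONS[state.direction]}. "
        f"The goal is at {GOAL}. Walls surround cells 1..3. "
        f"Reply with one of: {', '.join(ACTIONS)}."
    )


def parse_action(reply):
    """Map a free-text reply onto a known action; default to moving forward."""
    text = reply.lower()
    for action in ACTIONS:
        if action in text:
            return action
    return "move forward"


def step(state, action):
    """Apply one action; bumping into a wall leaves the position unchanged."""
    if action == "turn left":
        return AgentState(state.x, state.y, (state.direction - 1) % 4)
    if action == "turn right":
        return AgentState(state.x, state.y, (state.direction + 1) % 4)
    dx, dy = DIR_VECTORS[state.direction]
    nx, ny = state.x + dx, state.y + dy
    if 1 <= nx <= 3 and 1 <= ny <= 3:
        return AgentState(nx, ny, state.direction)
    return state


def stub_llm(prompt, state):
    """Hypothetical placeholder for a model call (a fixed rule, not an LLM).

    A real model would see only `prompt`; the stub peeks at `state` so the
    example stays self-contained.
    """
    wanted = 0 if state.x < GOAL[0] else 1  # head east first, then south
    if state.direction == wanted:
        return "I will move forward."
    return "I should turn right."


def run_episode(max_steps=20):
    """Run the describe -> choose -> act loop; return (steps taken, goal reached)."""
    state = AgentState()
    for steps in range(max_steps):
        if (state.x, state.y) == GOAL:
            return steps, True
        reply = stub_llm(describe_state(state), state)
        state = step(state, parse_action(reply))
    return max_steps, (state.x, state.y) == GOAL
```

With the rule-based stub, `run_episode()` returns `(5, True)`: two moves east, one right turn, and two moves south. Replacing `stub_llm` with a call to an actual model would leave the rest of the loop unchanged, which is the main reason for keeping the state description and the reply parser as separate functions.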

References

Pages: 14

17 entries in total

[1]

2023, arXiv, arXiv:2303.08774

[2]

Bahrini A, 2023, arXiv, DOI 10.48550/arXiv.2304.09103

[3]

Bommasani R., 2021, arXiv

[4]

Bubeck S, 2023, arXiv, arXiv:2303.12712

[5]

Cho S.K., 2024, J. Archit. Inst. Korea, V40, P81, DOI 10.5659/JAIK.2024.40.9.81

[6]

DATAVERSITY, About us

[7]

Fedus W, 2022, J MACH LEARN RES, V23

[8]

Haarnoja T, 2018, arXiv, DOI 10.48550/arXiv.1801.01290

[9]

Kang S.H., 2024, J. Korea Inst. Inf. Commun. Eng., V28, P916, DOI 10.6109/jkiice.2024.28.8.916

[10]

Kiran, B. Ravi; Sobh, Ibrahim; Talpaert, Victor; Mannion, Patrick; Al Sallab, Ahmad A.; Yogamani, Senthil; Perez, Patrick. Deep Reinforcement Learning for Autonomous Driving: A Survey [J]. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (06): 4909-4926
