LLM-Guided Reinforcement Learning for Interactive Environments

Times cited: 0

Authors:

Yang, Fuxue [1]

Liu, Jiawen [1]

Li, Kan [1]

Affiliations:

[1] Beijing Institute of Technology, School of Computer Science and Technology, Beijing 100081, People's Republic of China

Source:

MATHEMATICS | 2025 / Vol. 13 / No. 12

Funding:

Beijing Natural Science Foundation;

Keywords:

reinforcement learning; large language models; chain of thought; LANGUAGE;

DOI:

10.3390/math13121932

Chinese Library Classification (CLC) number:

O1 [Mathematics];

Discipline classification code:

0701 ; 070101 ;

Abstract:

We propose herein LLM-Guided Reinforcement Learning (LGRL), a novel framework that leverages large language models (LLMs) to decompose high-level objectives into a sequence of manageable subgoals in interactive environments. Our approach decouples high-level planning from low-level action execution by dynamically generating context-aware subgoals that guide the reinforcement learning (RL) agent. During training, intermediate subgoals, each associated with partial rewards, are generated based on the agent's current progress, providing fine-grained feedback that facilitates structured exploration and accelerates convergence. At inference, a chain-of-thought strategy is employed, enabling the LLM to adaptively update subgoals in response to evolving environmental states. Although demonstrated on a representative interactive setting, our method is generalizable to a wide range of complex, goal-oriented tasks. Experimental results show that LGRL achieves higher success rates, improved efficiency, and faster convergence compared to baseline approaches.
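The training-time idea described in the abstract (subgoals that each carry a partial reward, regenerated from the agent's current progress) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the key/door task, the reward values, and the names `Subgoal`, `stub_planner`, and `shaped_reward` are all hypothetical, and the planner is a hard-coded stand-in where the paper would query an LLM.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, bool]


@dataclass
class Subgoal:
    description: str
    is_done: Callable[[State], bool]  # predicate evaluated on the environment state
    partial_reward: float             # reward granted once, when the subgoal is completed


def stub_planner(state: State) -> List[Subgoal]:
    """Hypothetical stand-in for the LLM planner.

    Returns the subgoals that are still unfinished in `state`, so calling it
    again after each step mimics re-planning from the agent's current progress.
    """
    plan = [
        Subgoal("pick up key", lambda s: s["has_key"], 0.2),
        Subgoal("open door", lambda s: s["door_open"], 0.3),
        Subgoal("reach goal", lambda s: s["at_goal"], 0.5),
    ]
    return [g for g in plan if not g.is_done(state)]


def shaped_reward(prev_state: State, next_state: State,
                  planner: Callable[[State], List[Subgoal]]) -> float:
    """Sum the partial rewards of subgoals pending before the step and satisfied after it."""
    pending = planner(prev_state)
    return sum(g.partial_reward for g in pending if g.is_done(next_state))


# Usage: one transition in which the agent picks up the key.
start = {"has_key": False, "door_open": False, "at_goal": False}
after_pickup = dict(start, has_key=True)
print(shaped_reward(start, after_pickup, stub_planner))
print([g.description for g in stub_planner(after_pickup)])
```

In this sketch the shaped reward would be added to whatever signal the RL algorithm already optimizes; how LGRL actually combines the two is not stated in the abstract.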


Pages: 13

References: 26 in total
