A synthesis of automated planning and reinforcement learning for efficient, robust decision-making

Cited by: 63

Authors:

Leonetti, Matteo ^{[1,3]}

Iocchi, Luca ^{[2]}

Stone, Peter ^{[1]}

Affiliations:

[1] Univ Texas Austin, Dept Comp Sci, 2317 Speedway, Stop D9500, Austin, TX 78712 USA

[2] Sapienza Univ Rome, Dept Comp Control & Management Engn, Via Ariosto 25, I-00185 Rome, Italy

[3] Univ Leeds, Sch Comp, Leeds LS2 9JT, W Yorkshire, England

Source:

ARTIFICIAL INTELLIGENCE | 2016 / Vol. 241

Funding:

National Science Foundation (USA);

Keywords:

Automated planning; Reinforcement learning; Autonomous robot; Robot learning; Answer set programming;

DOI:

10.1016/j.artint.2016.07.004

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Automated planning and reinforcement learning are characterized by complementary views on decision-making: the former relies on previous knowledge and computation, while the latter relies on interaction with the world and on experience. Planning allows robots to carry out different tasks in the same domain, without the need to acquire knowledge about each one of them, but relies strongly on the accuracy of the model. Reinforcement learning, on the other hand, does not require previous knowledge, and allows robots to robustly adapt to the environment, but often necessitates an infeasible amount of experience. We present Domain Approximation for Reinforcement LearniNG (DARLING), a method that takes advantage of planning to constrain the behavior of the agent to reasonable choices, and of reinforcement learning to adapt to the environment and to increase the reliability of the decision-making process. We demonstrate the effectiveness of the proposed method on a service robot, carrying out a variety of tasks in an office building. We find that when the robot makes decisions by planning alone on a given model it often fails, and when it makes decisions by reinforcement learning alone it often cannot complete its tasks in a reasonable amount of time. When employing DARLING, however, even when seeded with the same model that was used for planning alone, the robot can quickly learn a behavior to carry out all the tasks, improves over time, and adapts to the environment as it changes. © 2016 Elsevier B.V. All rights reserved.
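
The general idea in the abstract, planning to restrict the agent to reasonable choices and reinforcement learning to pick among them from experience, can be illustrated with a small Python sketch. This is not the authors' implementation and makes no claim about DARLING's actual algorithm: the room graph, the `slack` parameter, the step costs, and all function names are hypothetical, and tabular Q-learning is swapped in here as a generic learner. The planner's model treats every move as equally cheap, while the hypothetical real environment makes the D->G door slow.

```python
import random
from collections import defaultdict, deque

# Hypothetical toy domain (not from the paper): rooms are nodes, doors are directed edges.
GRAPH = {
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D", "E"],
    "D": ["B", "C", "G"], "E": ["C", "F"], "F": ["E", "G"], "G": [],
}
GOAL = "G"

def shortest_len(graph, start, goal):
    """Breadth-first search: number of steps on a shortest path in the model."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError("goal unreachable")

def allowed_actions(graph, start, goal, slack):
    """Planning step: keep only moves that lie on some loop-free plan
    at most `slack` steps longer than the shortest plan (an assumed criterion)."""
    limit = shortest_len(graph, start, goal) + slack
    allowed = defaultdict(set)

    def dfs(node, path):
        if node == goal:
            for state, nxt in zip(path, path[1:]):
                allowed[state].add(nxt)
            return
        if len(path) - 1 >= limit:  # plan already too long
            return
        for nxt in graph[node]:
            if nxt not in path:
                dfs(nxt, path + [nxt])

    dfs(start, [start])
    return allowed

def step_cost(state, nxt):
    """Hypothetical true environment: the D->G door is slow, which the model ignores."""
    return 10.0 if (state, nxt) == ("D", "G") else 1.0

def q_learn(allowed, start, goal, episodes=500, alpha=0.5, epsilon=0.2, seed=0):
    """Learning step: tabular Q-learning over expected cost, restricted to allowed moves."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in sorted(acts)} for s, acts in allowed.items()}
    for _ in range(episodes):
        state = start
        while state != goal:
            actions = list(q[state])
            if rng.random() < epsilon:
                nxt = rng.choice(actions)                        # explore
            else:
                nxt = min(actions, key=lambda a: q[state][a])    # exploit lowest cost
            future = 0.0 if nxt == goal else min(q[nxt].values())
            q[state][nxt] += alpha * (step_cost(state, nxt) + future - q[state][nxt])
            state = nxt
    return q

def greedy_route(q, start, goal):
    """Follow the lowest-cost learned move from each state."""
    route, state = [start], start
    while state != goal and len(route) <= len(q) + 1:
        state = min(q[state], key=q[state].get)
        route.append(state)
    return route

if __name__ == "__main__":
    allowed = allowed_actions(GRAPH, "A", GOAL, slack=1)
    q = q_learn(allowed, "A", GOAL)
    print(greedy_route(q, "A", GOAL))
```

In this toy setting the planner alone would be indifferent between the two shortest plans through D, both of which hit the slow door. Allowing slightly longer plans (`slack=1`) keeps the detour through E and F available, and the learner can then discover from experienced costs that the detour is cheaper, without ever exploring moves the planner ruled out.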


Pages: 103-130

Number of pages: 28
