Efficient Reinforcement Learning of Task Planners for Robotic Palletization Through Iterative Action Masking Learning

Cited by: 0
Authors
Wu, Zheng [1 ]
Li, Yichuan [2 ]
Zhan, Wei [1 ]
Liu, Changliu [3 ]
Liu, Yun-Hui [2 ]
Tomizuka, Masayoshi [1 ]
Affiliations
[1] Univ Calif Berkeley, Dept Mech Engn, Berkeley, CA 94720 USA
[2] Chinese Univ Hong Kong, T Stone Robot Inst, Dept Mech & Automat Engn, Hong Kong, Peoples R China
[3] Carnegie Mellon Univ, Inst Robot, Pittsburgh, PA 15213 USA
Source
IEEE ROBOTICS AND AUTOMATION LETTERS | 2024, Vol. 9, No. 11
Keywords
Three-dimensional displays; Robots; Task analysis; Pallets; Planning; Training; Thermal stability; Action space masking; reinforcement learning; robotic palletization; PACKING;
DOI
10.1109/LRA.2024.3440731
CLC Classification Number
TP24 [Robotics];
Discipline Classification Code
080202; 1405
Abstract
The development of robotic systems for palletization in logistics scenarios is of paramount importance, addressing critical efficiency and precision demands in supply chain management. This paper investigates the application of Reinforcement Learning (RL) to enhance task planning for such robotic systems. Confronted with a vast action space, a significant impediment to the efficient application of off-the-shelf RL methods, our study introduces a novel method that uses supervised learning to iteratively prune and manage the action space. By reducing the complexity of the action space, our approach not only accelerates the learning phase but also ensures the effectiveness and reliability of task planning in robotic palletization. The experimental results underscore the efficacy of this method, highlighting its potential to improve the performance of RL in complex, high-dimensional environments such as logistics palletization.
Pages: 9303-9310 (8 pages)
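
The abstract describes learning, via supervised learning, which actions to mask so that the RL task planner only explores a pruned action space. The following minimal Python sketch illustrates that general idea under stated assumptions; it is not the authors' implementation, and every name in it (NUM_ACTIONS, policy_logits, predict_mask, masked_action) is a hypothetical stand-in.

    # Minimal sketch of action-space masking for a discrete-action RL policy.
    # All functions here are hypothetical stand-ins, not the paper's code.
    import numpy as np

    rng = np.random.default_rng(0)
    NUM_ACTIONS = 100  # e.g., discretized (x, y, orientation) box placements

    def policy_logits(state: np.ndarray) -> np.ndarray:
        """Stand-in for a learned policy network's output logits."""
        return state @ rng.standard_normal((state.size, NUM_ACTIONS))

    def predict_mask(state: np.ndarray) -> np.ndarray:
        """Stand-in for the supervised mask model: True = keep, False = prune.
        In an iterative scheme, this model would be retrained from labels on
        which actions proved infeasible during rollouts."""
        return rng.random(NUM_ACTIONS) > 0.5  # placeholder prediction

    def masked_action(state: np.ndarray) -> int:
        """Sample an action from the policy restricted to unmasked actions."""
        logits = policy_logits(state)
        mask = predict_mask(state)
        logits = np.where(mask, logits, -np.inf)  # prune masked actions
        probs = np.exp(logits - logits.max())     # stable softmax
        probs /= probs.sum()
        return int(rng.choice(NUM_ACTIONS, p=probs))

    state = rng.standard_normal(16)  # toy state vector
    print("sampled action:", masked_action(state))

Setting masked logits to -inf drives the probability of pruned actions to exactly zero while leaving the relative preferences among the remaining feasible actions unchanged; in an iterative scheme like the one the abstract describes, the mask predictor would be periodically retrained on feasibility labels collected during training.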