Goal-Conditioned Reinforcement Learning within a Human-Robot Disassembly Environment

Cited by: 3

Authors:

Elguea-Aguinaco, Inigo [1,2]

Serrano-Munoz, Antonio [2]

Chrysostomou, Dimitrios [3]

Inziarte-Hidalgo, Ibai [1]

Bogh, Simon [3]

Arana-Arexolaleiba, Nestor [2,3]

Affiliations:

[1] Electrotecn Alavesa SL, Res & Dev Dept, Vitoria 1010, Spain

[2] Univ Mondragon, Robot & Automat Elect & Comp Sci Dept, Arrasate Mondragon 20500, Spain

[3] Aalborg Univ, Mat & Prod Dept, DK-9220 Aalborg, Denmark

Source:

APPLIED SCIENCES-BASEL | 2022 / Vol. 12 / Issue 22

Funding:

EU Horizon 2020;

Keywords:

collaborative robots; machine learning; reinforcement learning; contact-rich tasks; disassembly; collision avoidance; TASKS;

DOI:

10.3390/app122211610

Chinese Library Classification (CLC) number:

O6 [Chemistry];

Discipline classification code:

0703;

Abstract:

The introduction of collaborative robots in industrial environments reinforces the need to provide these robots with better cognition to accomplish their tasks while fostering worker safety without entering into safety shutdowns that reduce workflow and production times. This paper presents a novel strategy that combines the execution of contact-rich tasks, namely disassembly, with real-time collision avoidance through machine learning for safe human-robot interaction. Specifically, a goal-conditioned reinforcement learning approach is proposed, in which the removal direction of a peg, of varying friction, tolerance, and orientation, is subject to the location of a human collaborator with respect to a 7-degree-of-freedom manipulator at each time step. For this purpose, the suitability of three state-of-the-art actor-critic algorithms is evaluated, and results from simulation and real-world experiments are presented. In reality, the policy's deployment is achieved through a new scalable multi-control framework that allows a direct transfer of the control policy to the robot and reduces response times. The results show the effectiveness, generalization, and transferability of the proposed approach with two collaborative robots against static and dynamic obstacles, leveraging the set of available solutions in non-monotonic tasks to avoid a potential collision with the human worker.

Pages: 19

Related literature (50 items in total):

[1] Contrastive Learning as Goal-Conditioned Reinforcement Learning
Eysenbach, Benjamin
Zhang, Tianjun
Levine, Sergey
Salakhutdinov, Ruslan
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
[2] Goal-Conditioned Reinforcement Learning with Imagined Subgoals
Chane-Sane, Elliot
Schmid, Cordelia
Laptev, Ivan
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
[3] State Representation Learning for Goal-Conditioned Reinforcement Learning
Steccanella, Lorenzo
Jonsson, Anders
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2022, PT IV, 2023, 13716 : 84 - 99
[4] Bisimulation Makes Analogies in Goal-Conditioned Reinforcement Learning
Hansen-Estruch, Philippe
Zhang, Amy
Nair, Ashvin
Yin, Patrick
Levine, Sergey
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
[5] Curriculum Goal-Conditioned Imitation for Offline Reinforcement Learning
Feng, Xiaoyun
Jiang, Li
Yu, Xudong
Xu, Haoran
Sun, Xiaoyan
Wang, Jie
Zhan, Xianyuan
Chan, Wai Kin
IEEE TRANSACTIONS ON GAMES, 2024, 16 (01) : 102 - 112
[6] Goal-Conditioned Predictive Coding for Offline Reinforcement Learning
Zeng, Zilai
Zhang, Ce
Wang, Shijie
Sun, Chen
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
[7] Goal-Conditioned Reinforcement Learning for Ultrasound Navigation Guidance
Amadou, Abdoul Aziz
Singh, Vivek
Ghesu, Florin C.
Kim, Young-Ho
Stanciulescu, Laura
Sai, Harshitha P.
Sharma, Puneet
Young, Alistair
Rajani, Ronak
Rhode, Kawal
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT XI, 2024, 15011 : 319 - 329
[8] Hindsight Expectation Maximization for Goal-conditioned Reinforcement Learning
Tang, Yunhao
Kucukelbir, Alp
24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
[9] Sample Complexity of Goal-Conditioned Hierarchical Reinforcement Learning
Robert, Arnaud
Pike-Burke, Ciara
Faisal, A. Aldo
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
[10] Contrastive Goal Grouping for Policy Generalization in Goal-Conditioned Reinforcement Learning
Zou, Qiming
Suzuki, Einoshin
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2021, 13108 LNCS : 240 - 253
