Deep reinforcement learning for cooperative robots based on adaptive sentiment feedback

Cited: 6
Authors
Jeon, Haein [1 ]
Kim, Dae-Won [2 ]
Kang, Bo-Yeong [3 ]
Affiliations
[1] Kyungpook Natl Univ, Dept Artificial Intelligence, Daegu 41566, South Korea
[2] Chung Ang Univ, Sch Comp Sci & Engn, Seoul 06974, South Korea
[3] Kyungpook Natl Univ, Dept Robot & Smart Syst Engn, Daegu 41566, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Human-robot interaction; Deep reinforcement learning; Interactive reinforcement learning; Human-in-the-loop; Reward shaping;
DOI
10.1016/j.eswa.2023.121198
CLC number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Human-robot cooperative tasks have gained importance with the emergence of robotics and artificial intelligence technology. In interactive reinforcement learning, robots learn target tasks by receiving feedback from an experienced human trainer. However, most interactive reinforcement learning studies require a separate process to integrate the trainer's feedback into the training dataset, making it challenging for robots to learn new tasks from humans in real time. Furthermore, the types of feedback sentences that trainers can use were limited in previous research. To address these limitations, this paper proposes a robot teaching strategy that uses deep reinforcement learning via human-robot interaction to learn table-balancing tasks interactively. The proposed system employs a Deep Q-Network (DQN) with real-time sentiment feedback delivered through the trainer's speech to learn cooperative tasks. We designed a novel reward function that incorporates sentiment feedback from human speech in real time during the learning process, and we present an improved reward shaping technique based on subdivided feedback levels and shrinking feedback. The resulting reward guides the robot toward natural interaction with humans and enables it to learn the task effectively. Experimental results demonstrate that the proposed interactive deep reinforcement learning model achieved a success rate of up to 99.06%, outperforming the model without sentiment feedback.
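The reward shaping described in the abstract — an environment reward augmented by graded sentiment feedback whose influence shrinks over training — can be sketched as follows. This is a minimal illustration, not the paper's actual formulation: the function name, the feedback levels, and the exponential decay schedule are all assumptions.

```python
def shaped_reward(env_reward: float, sentiment_level: int,
                  episode: int, decay: float = 0.99) -> float:
    """Hypothetical sketch of sentiment-based reward shaping.

    sentiment_level: a subdivided feedback level, e.g. an integer in
        {-2, -1, 0, 1, 2} derived from the trainer's spoken sentiment.
    episode: current training episode; the feedback weight decays
        ("shrinking feedback") so the agent relies less on human
        guidance as its own policy improves.
    """
    feedback_weight = decay ** episode  # shrinks toward 0 over training
    return env_reward + feedback_weight * sentiment_level

# Early in training, strong positive feedback dominates the shaping term;
# late in training, the same feedback barely perturbs the environment reward.
early = shaped_reward(env_reward=1.0, sentiment_level=2, episode=0)
late = shaped_reward(env_reward=1.0, sentiment_level=2, episode=500)
```

Under this sketch, the shaped reward converges to the plain environment reward as training progresses, which is one common way to keep human feedback from biasing the final policy.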
Pages: 11