Deep Reinforcement Learning Enables Joint Trajectory and Communication in Internet of Robotic Things

Cited: 0
Authors
Luo, Ruyu [1 ]
Tian, Hui [1 ]
Ni, Wanli [2 ]
Cheng, Julian [3 ]
Chen, Kwang-Cheng [4 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China
[2] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China
[3] Univ British Columbia, Sch Engn, Kelowna, BC V1V 1V7, Canada
[4] Univ S Florida, Dept Elect Engn, Tampa, FL 33620 USA
Funding
National Natural Science Foundation of China;
Keywords
Ultra-reliable low-latency communication (URLLC); trajectory design; resource management; resource allocation; robots; NOMA; wireless communication; decoding; deep reinforcement learning; Internet of Robotic Things; optimization; capacity
DOI
10.1109/TWC.2024.3462450
Chinese Library Classification (CLC)
TM [Electrical Technology]; TN [Electronic and Communication Technology];
Subject Classification Code
0808; 0809
Abstract
The Internet of Robotic Things (IoRT) integrates robotics, artificial intelligence, and communication technologies, enabling more sophisticated operations and decision-making. Mission-critical IoRT applications, such as industrial manufacturing and emergency services, impose stringent requirements on ultra-reliable low-latency communication (URLLC). This paper addresses URLLC challenges in IoRT, particularly when autonomous mobile robots (AMRs) coexist with static sensors. We prioritize safe and efficient AMR travel through joint trajectory design and communication resource allocation, without requiring any prior knowledge of the environment. To enhance network connectivity and exploit diversity gains, we introduce flexible decoding and free clustering as next-generation multiple access techniques for the spectrum-limited downlink IoRT system. Aiming to minimize both the decoding error probability and the travel time, we formulate a long-term multi-objective optimization problem that jointly designs the AMRs' trajectories and communication resources. To accommodate the inherent dynamics and unpredictability of the IoRT system, we develop a multi-agent actor-critic deep reinforcement learning (DRL) framework with four distinct implementations, each accompanied by a comprehensive complexity analysis. Simulation results reveal the following insights: 1) among the DRL implementations, off-policy algorithms with deterministic policies outperform their on-policy counterparts, achieving approximately a 67% increase in rewards; 2) among the communication schemes, the proposed flexible decoding and free clustering strategies, applied under the designed trajectories, effectively reduce decoding errors; and 3) regarding algorithm optimality, the DRL framework shows superior flexibility and adaptability to changing communication environments compared with traditional A* search and heuristic methods.
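The decoding error probability minimized in the abstract is typically evaluated in the finite-blocklength regime characteristic of URLLC. As a hedged illustration (the paper's exact formulation may differ), the standard normal approximation gives the block error probability of a link with blocklength n, rate R bits per channel use, and SNR gamma as

\varepsilon \approx Q\!\left(\frac{n\,C(\gamma) - n R + \tfrac{1}{2}\log_2 n}{\sqrt{n\,V(\gamma)}}\right), \qquad C(\gamma) = \log_2(1+\gamma), \qquad V(\gamma) = (\log_2 e)^2\!\left(1 - \frac{1}{(1+\gamma)^2}\right),

where Q(.) is the Gaussian Q-function, C(gamma) is the Shannon capacity, and V(gamma) is the channel dispersion.

The abstract's first finding, that off-policy actor-critic methods with deterministic policies learn substantially better, can be illustrated with a minimal DDPG-style update. The sketch below is a generic single-agent example in PyTorch, not the authors' implementation; all dimensions, network sizes, and hyperparameters are illustrative assumptions, and target networks and exploration noise are omitted for brevity.

# Minimal off-policy deterministic actor-critic (DDPG-style) sketch.
# All dimensions and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 8, 2  # hypothetical: local channel/position features; 2-D motion action

actor = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                      nn.Linear(64, ACTION_DIM), nn.Tanh())   # deterministic policy mu(s) in [-1, 1]
critic = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
                       nn.Linear(64, 1))                      # action-value estimate Q(s, a)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def update(s, a, r, s_next, gamma=0.99):
    """One off-policy update from a replay-buffer mini-batch of transitions."""
    with torch.no_grad():                                     # bootstrapped TD target
        q_next = critic(torch.cat([s_next, actor(s_next)], dim=1))
        target = r + gamma * q_next
    critic_loss = nn.functional.mse_loss(critic(torch.cat([s, a], dim=1)), target)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()  # deterministic policy gradient
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

# Usage with random stand-in data (a mini-batch of 32 transitions):
s, s_next = torch.randn(32, STATE_DIM), torch.randn(32, STATE_DIM)
a, r = torch.rand(32, ACTION_DIM) * 2 - 1, torch.randn(32, 1)
update(s, a, r, s_next)

Here the critic is trained on replayed transitions (off-policy data reuse), and the actor follows the deterministic policy gradient, i.e., the gradient of Q(s, mu(s)) with respect to the policy parameters; these two properties are what typically yield the sample-efficiency edge over on-policy methods that the abstract reports.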
Pages: 18154-18168
Number of pages: 15