Joint Trajectory and Radio Resource Optimization for Autonomous Mobile Robots Exploiting Multi-Agent Reinforcement Learning

Cited by: 3
Authors
Luo, Ruyu [1]
Ni, Wanli [1]
Tian, Hui [1]
Cheng, Julian [2]
Chen, Kwang-Cheng [3]
Affiliations
[1] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China
[2] Univ British Columbia, Sch Engn, Kelowna, BC V1V 1V7, Canada
[3] Univ S Florida, Dept Elect Engn, Tampa, FL 33620 USA
Funding
National Natural Science Foundation of China
Keywords
Autonomous mobile robots; industrial Internet of Things; multi-agent reinforcement learning; radio resource optimization; trajectory design; INDUSTRIAL IOT; ARTIFICIAL-INTELLIGENCE; DATA-COLLECTION; NOMA; TRANSMISSION; ALLOCATION; NETWORKS
DOI
10.1109/TCOMM.2023.3285799
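Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract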
Rapid and efficient sensor data acquisition plays a critical role in the decision-making process of each robot in a multi-robot smart factory. This paper investigates the trajectory design of autonomous mobile robots (AMRs) and communication resource allocation problems in the industrial Internet of Things. Specifically, by exploiting both power and spatial domains, we adopt non-orthogonal multiple access to improve network connectivity in a spectrum-efficient manner, while the multi-antenna technique is employed to enhance diversity gain. The average sum rate is maximized by jointly optimizing the transmit power of sensors and the trajectory of AMRs. To deal with prior knowledge and dynamic channel conditions, we reformulate the long-term maximization problem as a Markov decision process, and further develop a provably efficient multi-agent reinforcement learning algorithm with a near-optimal regret bound. Our theoretical analysis reveals that both the decentralized execution and the experience exchange method help accelerate convergence. Simulation results show that our proposed algorithm can reduce convergence time by at least 80% compared to the centralized baseline, and can achieve better rewards than the conventional ε-greedy exploration.
Pages: 5244-5258
Number of pages: 15
Related papers
48 in total
[1]   RIS-Assisted UAV for Timely Data Collection in IoT Networks [J].
Al-Hilo, Ahmed ;
Samir, Moataz ;
Elhattab, Mohamed ;
Assi, Chadi ;
Sharafeddine, Sanaa .
IEEE SYSTEMS JOURNAL, 2023, 17 (01) :431-442
[2]   Downlink Power Allocation for CoMP-NOMA in Multi-Cell Networks [J].
Ali, Md Shipon ;
Hossain, Ekram ;
Al-Dweik, Arafat ;
Kim, Dong In .
IEEE TRANSACTIONS ON COMMUNICATIONS, 2018, 66 (09) :3982-3998
[3]   DATA SCIENCE AND ARTIFICIAL INTELLIGENCE FOR COMMUNICATIONS [J].
Atov, Irena ;
Chen, Kwang-Cheng ;
Kamal, Ahmed ;
Yu, Shui .
IEEE COMMUNICATIONS MAGAZINE, 2019, 57 (11) :82-83
[4]  
Azar MG, 2017, PR MACH LEARN RES, V70
[5]   Distributed Noncoherent Joint Transmission Based on Multi-Agent Reinforcement Learning for Dense Small Cell Networks [J].
Bai, Shaozhuang ;
Gao, Zhenzhen ;
Liao, Xuewen .
IEEE TRANSACTIONS ON COMMUNICATIONS, 2023, 71 (02) :851-863
[6]   Wireless Networked Multirobot Systems in Smart Factories [J].
Chen, Kwang-Cheng ;
Lin, Shih-Chun ;
Hsiao, Jen-Hao ;
Liu, Chun-Hung ;
Molisch, Andreas F. ;
Fettweis, Gerhard P. .
PROCEEDINGS OF THE IEEE, 2021, 109 (04) :468-494
[7]   A Delay-Aware Network Structure for Wireless Sensor Networks With Consecutive Data Collection Processes [J].
Cheng, Chi-Tsun ;
Tse, Chi K. .
IEEE SENSORS JOURNAL, 2013, 13 (06) :2413-2422
[8]   Multi-Agent Reinforcement Learning-Based Resource Allocation for UAV Networks [J].
Cui, Jingjing ;
Liu, Yuanwei ;
Nallanathan, Arumugam .
IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2020, 19 (02) :729-743
[9]  
Deisenroth M. P., 2011, P 28 INT C MACH LEAR, P465, DOI 10.5555/3104482.3104541
[10]   A Real-Time Big Data Gathering Algorithm Based on Indoor Wireless Sensor Networks for Risk Analysis of Industrial Operations [J].
Ding, Xuejun ;
Tian, Yong ;
Yu, Yan .
IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2016, 12 (03) :1232-1242