Trajectory Design for UAV-Based Internet of Things Data Collection: A Deep Reinforcement Learning Approach

Cited by: 75
Authors
Wang, Yang [1 ]
Gao, Zhen [1 ]
Zhang, Jun [1 ]
Cao, Xianbin [2 ]
Zheng, Dezhi [3 ]
Gao, Yue [4 ]
Ng, Derrick Wing Kwan [5 ]
Di Renzo, Marco [6 ]
Affiliations
[1] Beijing Inst Technol, Sch Informat & Elect, Beijing 100081, Peoples R China
[2] Beihang Univ, Sch Elect & Informat Engn, Beijing 100191, Peoples R China
[3] Beihang Univ, Sch Instrumentat & Optoelect Engn, Innovat Inst Frontier Sci & Technol, Beijing 100191, Peoples R China
[4] Univ Surrey, Dept Elect & Elect Engn, Surrey GU2 7XH, England
[5] Univ New South Wales, Sch Elect Engn & Telecommun, Sydney, NSW 2025, Australia
[6] Univ Paris Saclay, Lab Signaux & Syst, Cent Supelec, CNRS, F-91192 Gif Sur Yvette, France
Source
IEEE INTERNET OF THINGS JOURNAL | 2022 / Vol. 9 / Issue 05
Funding
Beijing Natural Science Foundation; Australian Research Council; National Natural Science Foundation of China;
Keywords
Trajectory; Data collection; Sensors; Optimization; Three-dimensional displays; Minimization; Resource management; deep reinforcement learning (DRL); Internet of Things (IoT); trajectory design; unmanned aerial vehicle (UAV) communications; ENERGY-EFFICIENT; RESOURCE-ALLOCATION; COMMUNICATION; OPTIMIZATION;
DOI
10.1109/JIOT.2021.3102185
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
In this article, we investigate an unmanned aerial vehicle (UAV)-assisted Internet of Things (IoT) system in a sophisticated 3-D environment, where the UAV's trajectory is optimized to efficiently collect data from multiple IoT ground nodes. Unlike existing approaches focusing only on a simplified 2-D scenario and the availability of perfect channel state information (CSI), this article considers a practical 3-D urban environment with imperfect CSI, where the UAV's trajectory is designed to minimize data collection completion time subject to practical throughput and flight movement constraints. Specifically, inspired by the state-of-the-art deep reinforcement learning approaches, we leverage the twin-delayed deep deterministic policy gradient (TD3) to design the UAV's trajectory and we present a TD3-based trajectory design for completion time minimization (TD3-TDCTM) algorithm. In particular, we introduce an additional piece of information, i.e., the merged pheromone, to represent the state information of the UAV and the environment as a reference for the reward, which facilitates the algorithm design. By taking the service statuses of the IoT nodes, the UAV's position, and the merged pheromone as input, the proposed algorithm can continuously and adaptively learn how to adjust the UAV's movement strategy. By interacting with the external environment in the corresponding Markov decision process, the proposed algorithm can achieve a near-optimal navigation strategy. Our simulation results show the superiority of the proposed TD3-TDCTM algorithm over three conventional nonlearning-based baseline methods.
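For readers unfamiliar with TD3, the abstract's core building block can be illustrated with a minimal sketch of the generic TD3 critic target (target policy smoothing plus clipped double-Q). This is not the paper's TD3-TDCTM implementation: the function name `td3_target`, the hyperparameter defaults, and the constant stand-in "networks" are assumptions made purely for illustration, and the paper's state (service statuses, UAV position, merged pheromone) and reward are not modeled here.

```python
import numpy as np

def td3_target(reward, done, next_state, actor_target, critic1_target,
               critic2_target, gamma=0.99, noise_std=0.2, noise_clip=0.5,
               max_action=1.0, rng=None):
    """Generic TD3 target: y = r + gamma * (1 - done) * min(Q1', Q2')."""
    rng = np.random.default_rng() if rng is None else rng
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    next_action = actor_target(next_state)
    noise = np.clip(rng.normal(0.0, noise_std, size=next_action.shape),
                    -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, -max_action, max_action)
    # Clipped double-Q: use the smaller of the two target critics' estimates.
    q1 = critic1_target(next_state, next_action)
    q2 = critic2_target(next_state, next_action)
    return reward + gamma * (1.0 - done) * np.minimum(q1, q2)

# Toy usage with constant stand-in networks (hypothetical, illustration only).
actor = lambda s: np.zeros((s.shape[0], 3))    # e.g. a 3-D movement action
q_a = lambda s, a: np.full(s.shape[0], 5.0)    # first target critic
q_b = lambda s, a: np.full(s.shape[0], 3.0)    # second target critic
y = td3_target(np.array([1.0, 1.0]), np.array([0.0, 1.0]),
               np.zeros((2, 4)), actor, q_a, q_b, gamma=0.9)
```

In the toy call, the second transition is terminal (`done = 1`), so its target reduces to the immediate reward, while the first bootstraps from the smaller critic estimate.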
Pages: 3899 - 3912
Number of pages: 14
Related papers
50 in total
  • [41] Trajectory Design for UAV Communications with No-Fly Zones by Deep Reinforcement Learning
    Liu, Zhenrong
    Zeng, Yuan
    Zhang, Wei
    Gong, Yi
    2021 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS WORKSHOPS (ICC WORKSHOPS), 2021,
  • [42] Trajectory Design for Overlay UAV-to-Device Communications by Deep Reinforcement Learning
    Wu, Fanyi
    Zhang, Hongliang
    Wu, Jianjun
    Song, Lingyang
    2019 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2019,
  • [43] AoI optimal UAV trajectory planning: A Deep Recurrent Reinforcement Learning Approach
    Wu, Mengjie
    Chi, Huijia
    Gan, Shuying
    Wang, Xijun
    Xu, Chao
    2021 IEEE 32ND ANNUAL INTERNATIONAL SYMPOSIUM ON PERSONAL, INDOOR AND MOBILE RADIO COMMUNICATIONS (PIMRC), 2021,
  • [44] UAV UV Information Collection Method Based on Deep Reinforcement Learning
    Zhao, Taifei
    Guo, Jiahao
    Xin, Yu
    Wang, Lu
    ACTA PHOTONICA SINICA, 2025, 54 (01)
  • [45] UAV Data Collection With Deep Reinforcement Learning for Grant-Free IoT
    Zhong, Jiale
    Hu, Yingdong
    Li, Ye
    Xu, Yicheng
    Gao, Ruifeng
    Wang, Jue
    2024 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE, WCNC 2024, 2024,
  • [46] Energy-efficient UAV-enabled computation offloading for industrial internet of things: a deep reinforcement learning approach
    Shi, Shuo
    Wang, Meng
    Gu, Shushi
    Zheng, Zhong
    WIRELESS NETWORKS, 2024, 30 (05) : 3921 - 3934
  • [47] 5G Network on Wings: A Deep Reinforcement Learning Approach to the UAV-Based Integrated Access and Backhaul
    Zhang, Hongyi
    Qi, Zhiqiang
    Li, Jingya
    Aronsson, Anders
    Bosch, Jan
    Holmström Olsson, Helena
    IEEE Transactions on Machine Learning in Communications and Networking, 2024, 2 : 1109 - 1126
  • [48] Reinforcement and deep reinforcement learning for wireless Internet of Things: A survey
    Frikha, Mohamed Said
    Gammar, Sonia Mettali
    Lahmadi, Abdelkader
    Andrey, Laurent
    COMPUTER COMMUNICATIONS, 2021, 178 : 98 - 113
  • [49] UAV Trajectory Design Based on Reinforcement Learning for Wireless Power Transfer
    Ku, Sungmo
    Jung, Sangwon
    Lee, Chungyoung
    2019 34TH INTERNATIONAL TECHNICAL CONFERENCE ON CIRCUITS/SYSTEMS, COMPUTERS AND COMMUNICATIONS (ITC-CSCC 2019), 2019, : 553 - 555
  • [50] Age-energy-aware trajectory planning for UAV-assisted data collection in Internet of Things
    Chen, Hao
    Jia, Zekun
    Ma, Nan
    Liu, Yiming
    Yao, Yuanyuan
    Qin, Xiaoqi
    IET COMMUNICATIONS, 2023, 17 (10) : 1177 - 1187