Trajectory Design for UAV-Based Internet of Things Data Collection: A Deep Reinforcement Learning Approach

Cited by: 75
Authors
Wang, Yang [1]
Gao, Zhen [1]
Zhang, Jun [1]
Cao, Xianbin [2]
Zheng, Dezhi [3]
Gao, Yue [4]
Ng, Derrick Wing Kwan [5]
Di Renzo, Marco [6]
Affiliations
[1] Beijing Inst Technol, Sch Informat & Elect, Beijing 100081, Peoples R China
[2] Beihang Univ, Sch Elect & Informat Engn, Beijing 100191, Peoples R China
[3] Beihang Univ, Sch Instrumentat & Optoelect Engn, Innovat Inst Frontier Sci & Technol, Beijing 100191, Peoples R China
[4] Univ Surrey, Dept Elect & Elect Engn, Surrey GU2 7XH, England
[5] Univ New South Wales, Sch Elect Engn & Telecommun, Sydney, NSW 2025, Australia
[6] Univ Paris Saclay, Lab Signaux & Syst, Cent Supelec, CNRS, F-91192 Gif Sur Yvette, France
Source
IEEE INTERNET OF THINGS JOURNAL | 2022 / Vol. 9 / Issue 05
Funding
Beijing Natural Science Foundation; Australian Research Council; National Natural Science Foundation of China;
Keywords
Trajectory; Data collection; Sensors; Optimization; Three-dimensional displays; Minimization; Resource management; deep reinforcement learning (DRL); Internet of Things (IoT); trajectory design; unmanned aerial vehicle (UAV) communications; ENERGY-EFFICIENT; RESOURCE-ALLOCATION; COMMUNICATION; OPTIMIZATION;
DOI
10.1109/JIOT.2021.3102185
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In this article, we investigate an unmanned aerial vehicle (UAV)-assisted Internet of Things (IoT) system in a sophisticated 3-D environment, where the UAV's trajectory is optimized to efficiently collect data from multiple IoT ground nodes. Unlike existing approaches that focus only on a simplified 2-D scenario and assume the availability of perfect channel state information (CSI), this article considers a practical 3-D urban environment with imperfect CSI, where the UAV's trajectory is designed to minimize the data collection completion time subject to practical throughput and flight movement constraints. Specifically, inspired by state-of-the-art deep reinforcement learning approaches, we leverage the twin-delayed deep deterministic policy gradient (TD3) to design the UAV's trajectory, and we present a TD3-based trajectory design for completion time minimization (TD3-TDCTM) algorithm. In particular, we introduce an additional piece of information, the merged pheromone, which represents the state information of the UAV and the environment and serves as a reference for the reward, facilitating the algorithm design. By taking the service statuses of the IoT nodes, the UAV's position, and the merged pheromone as input, the proposed algorithm can continuously and adaptively learn how to adjust the UAV's movement strategy. By interacting with the external environment in the corresponding Markov decision process, the proposed algorithm can achieve a near-optimal navigation strategy. Our simulation results show the superiority of the proposed TD3-TDCTM algorithm over three conventional nonlearning-based baseline methods.
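A minimal sketch of two ingredients the abstract names, assuming a generic TD3 setup rather than the authors' implementation. The function names `build_state` and `td3_target`, the flat state layout, and all default hyperparameters are hypothetical; only the three state inputs (service statuses, UAV position, merged pheromone) come from the abstract, and the target rule is the standard TD3 one (target policy smoothing plus the minimum of two target critics).

```python
import random


def clip(x, lo, hi):
    """Clamp x to the interval [lo, hi]."""
    return max(lo, min(hi, x))


def build_state(service_statuses, uav_position, merged_pheromone):
    """Concatenate the three inputs listed in the abstract into one flat vector.

    The layout (statuses, then x/y/z position, then pheromone) is an assumption.
    """
    return ([float(s) for s in service_statuses]
            + [float(p) for p in uav_position]
            + [float(merged_pheromone)])


def td3_target(reward, next_state, done, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, act_limit=1.0,
               rng=random):
    """Standard TD3 bootstrap target for one transition.

    target_actor(state) -> list of action components (hypothetical callable)
    target_q1/target_q2(state, action) -> scalar Q estimate (hypothetical callables)
    """
    action = target_actor(next_state)
    # Target policy smoothing: add clipped Gaussian noise, then clip to the action range.
    smoothed = [
        clip(a + clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip),
             -act_limit, act_limit)
        for a in action
    ]
    # Clipped double-Q: take the smaller of the two target critic estimates.
    q_min = min(target_q1(next_state, smoothed), target_q2(next_state, smoothed))
    # No bootstrapping past a terminal transition.
    return reward + gamma * (1.0 - float(done)) * q_min


# Toy usage with constant critics, for illustration only.
state = build_state([1, 0, 1], (10.0, 20.0, 50.0), 0.7)
target = td3_target(1.0, state, False,
                    target_actor=lambda s: [0.0, 0.0, 0.0],
                    target_q1=lambda s, a: 2.0,
                    target_q2=lambda s, a: 3.0,
                    gamma=0.5)
```

Taking the minimum of the two critics counteracts overestimated value targets, which is the usual reason for preferring TD3 over plain DDPG in continuous-action problems such as trajectory control.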
Pages: 3899 - 3912
Number of pages: 14
Related papers
50 in total
  • [1] Trajectory Design for UAV-Based Internet of Things Data Collection: A Deep Reinforcement Learning Approach
    Wang, Yang
    Gao, Zhen
    Zhang, Jun
    Cao, Xianbin
    Zheng, Dezhi
    Gao, Yue
    Ng, Derrick Wing Kwan
    Di Renzo, Marco
    IEEE Internet of Things Journal, 2022, 9 (05) : 3899 - 3912
  • [2] Trajectory Design for UAV-Based Inspection System: A Deep Reinforcement Learning Approach
    Zhang, Wei
    Yang, Dingcheng
    Wu, Fahui
    Xiao, Lin
    2023 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS WORKSHOPS, ICC WORKSHOPS, 2023 : 1654 - 1659
  • [3] Deep Reinforcement Learning for UAV-Based SDWSN Data Collection
    Karegar, Pejman A.
    Al-Hamid, Duaa Zuhair
    Chong, Peter Han Joo
    FUTURE INTERNET, 2024, 16 (11)
  • [4] Timely Data Collection for UAV-Based IoT Networks: A Deep Reinforcement Learning Approach
    Hu, Yingmeng
    Liu, Yan
    Kaushik, Aryan
    Masouros, Christos
    Thompson, John S.
    IEEE SENSORS JOURNAL, 2023, 23 (11) : 12295 - 12308
  • [5] Deep Reinforcement Learning for Efficient Data Collection in UAV-Aided Internet of Things
    Tong, Peng
    Liu, Juan
    Wang, Xijun
    Bai, Bo
    Dai, Huaiyu
    2020 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS WORKSHOPS (ICC WORKSHOPS), 2020
  • [6] Joint AoI-Aware UAVs Trajectory Planning and Data Collection in UAV-Based IoT Systems: A Deep Reinforcement Learning Approach
    Xiao, Xiongbing
    Wang, Xiumin
    Lin, Weiwei
    IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2024, 70 (04) : 6484 - 6495
  • [7] UAV-Based Data Collection and Wireless Power Transfer System with Deep Reinforcement Learning
    Lee, Jaewook
    Seo, Sangwon
    Ko, Haneul
    2023 INTERNATIONAL CONFERENCE ON INFORMATION NETWORKING, ICOIN, 2023 : 400 - 403
  • [8] Deep Reinforcement Learning Based Trajectory Design for Customized UAV-Aided NOMA Data Collection
    Zhang, Lei
    Zhang, Yuandi
    Lu, Jiawangnan
    Xiao, Yunfa
    Zhang, Guanglin
    IEEE WIRELESS COMMUNICATIONS LETTERS, 2024, 13 (12) : 3365 - 3369
  • [9] Joint Flight Cruise Control and Data Collection in UAV-Aided Internet of Things: An Onboard Deep Reinforcement Learning Approach
    Li, Kai
    Ni, Wei
    Tovar, Eduardo
    Guizani, Mohsen
    IEEE INTERNET OF THINGS JOURNAL, 2021, 8 (12) : 9787 - 9799
  • [10] Caching Transient Data for Internet of Things: A Deep Reinforcement Learning Approach
    Zhu, Hao
    Cao, Yang
    Wei, Xiao
    Wang, Wei
    Jiang, Tao
    Jin, Shi
    IEEE INTERNET OF THINGS JOURNAL, 2019, 6 (02) : 2074 - 2083