Trajectory Design for UAV-Based Internet of Things Data Collection: A Deep Reinforcement Learning Approach

Cited by: 75
Authors
Wang, Yang [1]
Gao, Zhen [1]
Zhang, Jun [1]
Cao, Xianbin [2]
Zheng, Dezhi [3]
Gao, Yue [4]
Ng, Derrick Wing Kwan [5]
Di Renzo, Marco [6]
Affiliations
[1] Beijing Inst Technol, Sch Informat & Elect, Beijing 100081, Peoples R China
[2] Beihang Univ, Sch Elect & Informat Engn, Beijing 100191, Peoples R China
[3] Beihang Univ, Sch Instrumentat & Optoelect Engn, Innovat Inst Frontier Sci & Technol, Beijing 100191, Peoples R China
[4] Univ Surrey, Dept Elect & Elect Engn, Surrey GU2 7XH, England
[5] Univ New South Wales, Sch Elect Engn & Telecommun, Sydney, NSW 2025, Australia
[6] Univ Paris Saclay, Lab Signaux & Syst, Cent Supelec, CNRS, F-91192 Gif Sur Yvette, France
Source
IEEE INTERNET OF THINGS JOURNAL | 2022 / Vol. 9 / Issue 05
Funding
Beijing Natural Science Foundation; Australian Research Council; National Natural Science Foundation of China;
Keywords
Trajectory; Data collection; Sensors; Optimization; Three-dimensional displays; Minimization; Resource management; deep reinforcement learning (DRL); Internet of Things (IoT); trajectory design; unmanned aerial vehicle (UAV) communications; ENERGY-EFFICIENT; RESOURCE-ALLOCATION; COMMUNICATION; OPTIMIZATION;
DOI
10.1109/JIOT.2021.3102185
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this article, we investigate an unmanned aerial vehicle (UAV)-assisted Internet of Things (IoT) system in a sophisticated 3-D environment, where the UAV's trajectory is optimized to efficiently collect data from multiple IoT ground nodes. Unlike existing approaches focusing only on a simplified 2-D scenario and the availability of perfect channel state information (CSI), this article considers a practical 3-D urban environment with imperfect CSI, where the UAV's trajectory is designed to minimize data collection completion time subject to practical throughput and flight movement constraints. Specifically, inspired by the state-of-the-art deep reinforcement learning approaches, we leverage the twin-delayed deep deterministic policy gradient (TD3) to design the UAV's trajectory and we present a TD3-based trajectory design for completion time minimization (TD3-TDCTM) algorithm. In particular, we introduce an additional piece of information, i.e., the merged pheromone, to represent the state information of the UAV and environment as a reference for the reward, which facilitates the algorithm design. By taking the service statuses of the IoT nodes, the UAV's position, and the merged pheromone as input, the proposed algorithm can continuously and adaptively learn how to adjust the UAV's movement strategy. By interacting with the external environment in the corresponding Markov decision process, the proposed algorithm can achieve a near-optimal navigation strategy. Our simulation results show the superiority of the proposed TD3-TDCTM algorithm over three conventional nonlearning-based baseline methods.
Pages: 3899 - 3912
Number of pages: 14
Related papers
50 in total
  • [21] A precision adjustable trajectory planning scheme for UAV-based data collection in IoTs
    Wang, Zuyan
    Tao, Jun
    Gao, Yang
    Xu, Yifan
    Sun, Weice
    Li, Xiaoyan
    PEER-TO-PEER NETWORKING AND APPLICATIONS, 2021, 14 (02) : 655 - 671
  • [22] A Deep Reinforcement Learning Approach for Federated Learning Optimization with UAV Trajectory Planning
    Zhang, Chunyu
    Liu, Yiming
    Zhang, Zhi
    2023 IEEE 34TH ANNUAL INTERNATIONAL SYMPOSIUM ON PERSONAL, INDOOR AND MOBILE RADIO COMMUNICATIONS, PIMRC, 2023,
  • [23] Deep Reinforcement Learning Approach for Joint Trajectory Design in Multi-UAV IoT Networks
    Xu, Shu
    Zhan, Xiangyu
    Li, Chunguo
    Wang, Dongming
    Yang, Luxi
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2022, 71 (03) : 3389 - 3394
  • [24] Cellular-Connected UAV Trajectory Design With Connectivity Constraint: A Deep Reinforcement Learning Approach
    Gao, Yunfei
    Xiao, Lin
    Wu, Fahui
    Yang, Dingcheng
    Sun, Zhongxiang
    IEEE TRANSACTIONS ON GREEN COMMUNICATIONS AND NETWORKING, 2021, 5 (03) : 1369 - 1380
  • [25] Energy-Efficient UAV Trajectory Design for Backscatter Communication: A Deep Reinforcement Learning Approach
    Yiwen Nie
    Junhui Zhao
    Jun Liu
    Jing Jiang
    Ruijin Ding
    CHINA COMMUNICATIONS, 2020, 17 (10) : 129 - 141
  • [26] Deep Reinforcement Learning Enables Joint Trajectory and Communication in Internet of Robotic Things
    Luo, Ruyu
    Tian, Hui
    Ni, Wanli
    Cheng, Julian
    Chen, Kwang-Cheng
    IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2024, 23 (12) : 18154 - 18168
  • [27] Energy-Efficient UAV Trajectory Design for Backscatter Communication: A Deep Reinforcement Learning Approach
    Nie, Yiwen
    Zhao, Junhui
    Liu, Jun
    Jiang, Jing
    Ding, Ruijin
    CHINA COMMUNICATIONS, 2020, 17 (10) : 129 - 141
  • [28] UAVFog: A UAV-Based Fog Computing for Internet of Things
    Mohamed, Nader
    Al-Jaroodi, Jameela
    Jawhar, Imad
    Noura, Hassan
    Mahmoud, Sara
    2017 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTED, SCALABLE COMPUTING & COMMUNICATIONS, CLOUD & BIG DATA COMPUTING, INTERNET OF PEOPLE AND SMART CITY INNOVATION (SMARTWORLD/SCALCOM/UIC/ATC/CBDCOM/IOP/SCI), 2017,
  • [29] UAV-based Localization for Layered Framework of the Internet of Things
    Pandey, Saurabh K.
    Zaveri, Mukesh A.
    Choksi, Meghavi
    Kumar, J. Sathish
    8TH INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING & COMMUNICATIONS (ICACC-2018), 2018, 143 : 728 - 735
  • [30] Deep Reinforcement Learning for Trajectory Design and Power Allocation in UAV Networks
    Zhao, Nan
    Cheng, Yiqiang
    Pei, Yiyang
    Liang, Ying-Chang
    Niyato, Dusit
    ICC 2020 - 2020 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2020,