Deep RL-based Trajectory Planning for AoI Minimization in UAV-assisted IoT

Cited: 0
Authors
Zhou, Conghao [1 ]
He, Hongli [2 ]
Yang, Peng [1 ]
Lyu, Feng [1 ]
Wu, Wen [1 ]
Cheng, Nan [3 ]
Shen, Xuemin [1 ]
Affiliations
[1] Univ Waterloo, Dept Elect & Comp Engn, Waterloo, ON, Canada
[2] Zhejiang Univ, Coll Informat Sci & Elect Engn, Hangzhou, Peoples R China
[3] Xidian Univ, Sch Telecommun, Xian, Peoples R China
Source
2019 11th International Conference on Wireless Communications and Signal Processing (WCSP) | 2019
Funding
National Natural Science Foundation of China; Natural Sciences and Engineering Research Council of Canada;
Keywords
Internet of Things; age-of-information; unmanned aerial vehicle; trajectory planning; deep reinforcement learning; COMMUNICATION;
DOI
10.1109/wcsp.2019.8928091
CLC Classification
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
Due to their flexibility and low deployment cost, unmanned aerial vehicles (UAVs) have been widely used to assist cellular networks in providing extended coverage for Internet of Things (IoT) networks. Existing throughput- or delay-based UAV trajectory planning methods cannot meet the requirement of collecting fresh information from IoT devices. In this paper, taking age-of-information (AoI) as a measure of information freshness, we investigate AoI-based UAV trajectory planning for fresh data collection. To model the complicated association and interaction pattern between the UAV and IoT devices, the UAV trajectory planning problem is formulated as a Markov decision process (MDP) that captures the dynamics of UAV locations. Since the network topology and traffic generation pattern are unknown in advance, we propose an AoI-based trajectory planning (A-TP) algorithm using a deep reinforcement learning (RL) technique. To accelerate the learning process during online decision making, off-line pre-training of the deep neural networks is performed. Extensive simulation results demonstrate that the proposed algorithm significantly reduces the AoI of collected IoT data compared to benchmark approaches.
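The MDP formulation in the abstract rests on the standard AoI dynamics: each device's age grows by one every time slot and resets when the UAV collects that device's data. A minimal illustrative sketch of those dynamics (not the authors' implementation; the reset-to-1 convention, function name, and three-device setup are assumptions for illustration):

```python
def step_aoi(aoi, collected):
    """Advance one time slot of AoI dynamics.

    Every device's age-of-information grows by 1, except devices the
    UAV just collected from, whose AoI resets to 1 (one slot old).
    """
    return [1 if i in collected else a + 1 for i, a in enumerate(aoi)]

# Three IoT devices, all fresh at t = 0.
aoi = [0, 0, 0]
aoi = step_aoi(aoi, set())   # slot 1, no collection -> [1, 1, 1]
aoi = step_aoi(aoi, {1})     # slot 2, UAV serves device 1 -> [2, 1, 2]
avg_aoi = sum(aoi) / len(aoi)
```

An RL agent in this setting would choose which device to serve each slot so as to minimize the long-run average of `avg_aoi`, which is the quantity the A-TP algorithm targets.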
Pages: 6