Energy-Efficient Trajectory Optimization With Wireless Charging in UAV-Assisted MEC Based on Multi-Objective Reinforcement Learning

Cited: 14
Authors
Song, Fuhong [1 ]
Deng, Mingsen [1 ]
Xing, Huanlai [2 ]
Liu, Yanping [3 ]
Ye, Fei [4 ]
Xiao, Zhiwen [2 ]
Affiliations
[1] Guizhou Univ Finance & Econ, Sch Informat, Guiyang 550025, Peoples R China
[2] Southwest Jiaotong Univ, Sch Comp & Artificial Intelligence, Chengdu 611756, Peoples R China
[3] Guizhou Univ Finance & Econ, Coll Big Data Stat, Guiyang 550025, Peoples R China
[4] Univ York, Dept Comp Sci, York YO10 5GH, England
Funding
National Natural Science Foundation of China;
Keywords
Autonomous aerial vehicles; Task analysis; Energy efficiency; Laser beams; Heuristic algorithms; Reinforcement learning; Inductive charging; Mobile edge computing; multi-objective reinforcement learning; trajectory optimization; unmanned aerial vehicle; wireless charging; ALLOCATION; TASK; CONSUMPTION; ALGORITHM;
DOI
10.1109/TMC.2024.3384405
Chinese Library Classification
TP [Automation technology; computer technology];
Discipline code
0812;
Abstract
This paper investigates the problem of energy-efficient trajectory optimization with wireless charging (ETWC) in an unmanned aerial vehicle (UAV)-assisted mobile edge computing (MEC) system. A UAV is dispatched to collect computation tasks from specific ground smart devices (GSDs) within its coverage while transmitting energy to the other GSDs. In addition, a high-altitude platform with a laser beam is deployed in the stratosphere to charge the UAV so that it can sustain its flight mission. The ETWC problem is formulated as a multi-objective optimization problem that aims to maximize both the energy efficiency of the UAV and the number of tasks collected by optimizing the UAV's flight trajectory. The conflict between the two objectives makes the problem quite challenging. Recently, single-objective reinforcement learning (SORL) algorithms have been applied to this problem. However, these SORLs adopt linear scalarization to define the user utility and thus ignore the conflict between objectives. Furthermore, in dynamic MEC scenarios, the relative importance assigned to each objective may vary over time, which poses significant challenges for conventional SORLs. To address these challenges, we first build a multi-objective Markov decision process with a vectorial reward, in which each component of the reward corresponds to one of the two objectives. We then propose a new trace-based experience replay scheme that improves sample efficiency and reduces replay-buffer bias, yielding a modified multi-objective reinforcement learning algorithm. Experimental results validate that, compared with several baseline algorithms, the proposed algorithm adapts better to dynamic preferences and achieves a more favorable balance between objectives.
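To illustrate the limitation of linear scalarization that the abstract describes, the sketch below (not the authors' implementation; reward values and weights are hypothetical) shows a two-component vectorial reward, one component per objective, collapsed into a scalar by preference weights. A policy trained against one fixed weight vector can rank the same vector reward very differently once preferences drift, which is why the paper keeps the reward vectorial in its multi-objective MDP.

```python
import numpy as np

def linear_scalarization(vec_reward, weights):
    """Collapse a vector reward into a scalar via preference weights.

    SORL approaches fix `weights` up front; the scalar hides the
    conflict between objectives when preferences change over time.
    """
    return float(np.dot(vec_reward, weights))

# One step's vector reward (hypothetical values):
# (energy-efficiency component, tasks-collected component)
r = np.array([0.8, 0.3])

# Under a preference favoring energy efficiency...
print(linear_scalarization(r, np.array([0.9, 0.1])))  # ≈ 0.75
# ...versus one favoring task collection: the scalar utility of the
# identical outcome changes, so a policy optimal for the first set of
# weights need not be optimal for the second.
print(linear_scalarization(r, np.array([0.2, 0.8])))  # ≈ 0.40
```

A vectorial reward sidesteps this by letting the learner condition on the current preference instead of baking one trade-off into the training signal.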
Pages: 10867-10884 (18 pages)