Joint Optimization of Trajectory and User Association via Reinforcement Learning for UAV-Aided Data Collection in Wireless Networks
Cited by: 22
Authors:
Chen, Gong [1,2,3]
Zhai, Xiangping Bryce [1,2]
Li, Congduan [3]
Affiliations:
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing 211106, Peoples R China
[2] Collaborat Innovat Ctr Novel Software Technol & In, Nanjing 210023, Jiangsu, Peoples R China
[3] Sun Yat Sen Univ, Sch Elect & Commun Engn, Shenzhen 518107, Peoples R China
Funding:
U.S. National Science Foundation;
Keywords:
Trajectory;
Optimization;
Games;
Throughput;
Wireless networks;
Resource management;
Interference;
UAV trajectory design;
fair throughputs;
energy-efficiency;
coalition formation games;
multi-agent deep reinforcement learning;
ENERGY-EFFICIENT;
COMMUNICATION;
ALLOCATION;
DESIGN;
SPECTRUM;
MEC;
DOI:
10.1109/TWC.2022.3216049
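Chinese Library Classification (CLC) codes:
TM [Electrical engineering];
TN [Electronics and communication technology];
Discipline classification codes: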
0808 ;
0809 ;
Abstract:
Unmanned aerial vehicles (UAVs) can serve as aerial base stations for data collection in next-generation wireless networks owing to their high adaptability and maneuverability. This paper investigates a scenario in which multiple UAVs cooperatively fly over heterogeneous ground users (GUs) and collect data without a central controller. Taking into account the signal-to-interference-and-noise ratio (SINR) and fairness among users, we jointly optimize the UAV trajectories and the GU associations to maximize total throughput and energy efficiency. We formulate the long-term optimization problem as a decentralized partially observable Markov decision process (DEC-POMDP) and derive an approach that combines a coalition formation game (CFG) with multi-agent deep reinforcement learning (MADRL). We first formulate the discrete association scheduling problem as a non-cooperative game and use the CFG algorithm to obtain a decentralized scheme that converges to a Nash equilibrium (NE). Then, a MADRL-based technique is developed to optimize the trajectories and energy consumption continuously in a centralized-training, decentralized-execution manner. Simulation results demonstrate that the proposed algorithm, operating in a distributed manner, outperforms commonly used schemes from the literature in terms of fair throughput and energy consumption.
Pages: 3128-3143
Page count: 16