Joint Relay Selection and Power Allocation for Time-Varying Energy Harvesting-Driven UASNs: A Stratified Reinforcement Learning Approach

Cited by: 13

Authors:

Han, Song [1]

Li, Luo [1]

Li, Xinbin [1]

Liu, Zhixin [1]

Yan, Lei [1]

Zhang, TongWei [2]

Affiliations:

[1] Yanshan Univ, Key Lab Ind Comp Control Engn Hebei Prov, Qinhuangdao 066004, Hebei, Peoples R China

[2] Natl Deep Sea Ctr, Dept Technol, Qingdao 266237, Peoples R China

Source:

IEEE SENSORS JOURNAL | 2022 / Vol. 22 / Issue 20

Funding:

National Natural Science Foundation of China;

Keywords:

Relays; Resource management; Heuristic algorithms; Optimization; Uplink; Batteries; Underwater acoustics; Deep reinforcement learning (DRL); energy harvesting (EH); resource allocation; underwater acoustic sensor networks (UASNs); NETWORKS;

DOI:

10.1109/JSEN.2022.3203028

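Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code: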

0808 ; 0809 ;

Abstract:

In this article, the joint relay selection and power allocation problem is studied to maximize the cumulative uplink performance of time-varying energy harvesting-driven underwater acoustic sensor networks (EH-UASNs). We propose a stratification-based, model-free deep reinforcement learning framework, which combines the deep deterministic policy gradient (DDPG) and deep Q-network (DQN) algorithms, to solve the complex joint optimization problem. More specifically, the DQN optimizes the discrete relay selection strategies, and the DDPG optimizes the continuous power allocation strategies. The stratification-based framework tracks the complex state in a divide-and-conquer manner; as a result, the proposed algorithm can explore a larger solution space with high learning efficiency. Within this framework, we reconstruct the state by introducing the available outdated channel information and the battery capacity to enrich the effective learning information. Furthermore, to balance the instantaneous demand and long-term quality of service (QoS), we propose a reward mechanism that induces the agent to adaptively adjust the power allocation strategies to match the dynamic environment. Simulation results validate the effectiveness of our algorithm.
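The split between a discrete relay choice and a continuous power choice can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the networks, state layout, or reward, so the Q-values and the actor are passed in as plain Python values, and every name here (`select_relay`, `allocate_power`, `stratified_action`, `slot_len`, the `state` keys) is a hypothetical choice made for illustration. The only idea taken from the abstract is that a DQN-style layer picks the relay and a DDPG-style layer picks the power.

```python
import random


def select_relay(q_values, epsilon, rng):
    """Discrete layer (DQN-style): epsilon-greedy choice over per-relay Q-value estimates."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: uniformly random relay
    return max(range(len(q_values)), key=lambda i: q_values[i])  # exploit: best relay


def allocate_power(actor_output, noise_std, p_max, rng):
    """Continuous layer (DDPG-style): deterministic output plus Gaussian noise, clipped to [0, p_max]."""
    noisy = actor_output + rng.gauss(0.0, noise_std)
    return min(max(noisy, 0.0), p_max)


def stratified_action(q_values, actor, state, battery_energy, slot_len,
                      epsilon, noise_std, rng):
    """Pick the relay first, then the transmit power conditioned on that relay."""
    relay = select_relay(q_values, epsilon, rng)
    # Assumption: the stored energy caps the power usable over one time slot.
    p_max = battery_energy / slot_len
    power = allocate_power(actor(state, relay), noise_std, p_max, rng)
    return relay, power


rng = random.Random(0)
relay, power = stratified_action(
    q_values=[0.2, 0.9, 0.5],                           # hypothetical Q estimates for 3 relays
    actor=lambda state, relay: 0.6 * state["battery"],  # stand-in for a trained actor network
    state={"battery": 2.0, "outdated_gain": [0.1, 0.4, 0.3]},
    battery_energy=2.0, slot_len=1.0,
    epsilon=0.0, noise_std=0.0, rng=rng,
)
print(relay, power)
```

With exploration switched off (`epsilon=0.0`, `noise_std=0.0`), the sketch selects the relay with the largest Q-value and returns the actor's output unchanged, provided it fits within the battery limit.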

Citation:

Pages: 20063-20072

Page count: 10

23 references in total

[1] A Survey on MAC Protocol Approaches for Underwater Wireless Sensor Networks [J]. Al Guqhaiman, Ahmed; Akanbi, Oluwatobi; Aljaedi, Amer; Chow, Chinghua Edward. IEEE SENSORS JOURNAL, 2021, 21 (03): 3916-3932

[2] [Anonymous], 2006, P ACM INT WORKSH UND, DOI 10.1145/1347364.1347373

[3] Adaptive Relay Selection and Power Allocation for OFDM Cooperative Underwater Acoustic Systems [J]. Doosti-Aref, Abdollah; Ebrahimzadeh, Ataollah. IEEE TRANSACTIONS ON MOBILE COMPUTING, 2018, 17 (01): 1-15

[4] An Energy-Balanced Trust Cloud Migration Scheme for Underwater Acoustic Sensor Networks [J]. Han, Guangjie; Du, Jiaxin; Lin, Chuan; Wu, Hongyi; Guizani, Mohsen. IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2020, 19 (03): 1636-1649

[5] Deep Q-Network-Based Cooperative Transmission Joint Strategy Optimization Algorithm for Energy Harvesting-Powered Underwater Acoustic Sensor Networks [J]. Han, Song; Li, Luo; Li, Xinbin. SENSORS, 2020, 20 (22): 1-27

[6] Joint Buffer-Aided Hybrid-Duplex Relay Selection and Power Allocation for Secure Cognitive Networks With Double Deep Q-Network [J]. Huang, Chong; Chen, Gaojie; Gong, Yu; Han, Zhu. IEEE TRANSACTIONS ON COGNITIVE COMMUNICATIONS AND NETWORKING, 2021, 7 (03): 834-844

[7] Buffer-Aided Relay Selection for Cooperative Hybrid NOMA/OMA Networks With Asynchronous Deep Reinforcement Learning [J]. Huang, Chong; Chen, Gaojie; Gong, Yu; Xu, Peng; Han, Zhu; Chambers, Jonathon A. IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2021, 39 (08): 2514-2525

[8] Energy Management and Power Allocation for Underwater Acoustic Sensor Network [J]. Jing, Lianyou; He, Chengbing; Huang, Jianguo; Ding, Zhi. IEEE SENSORS JOURNAL, 2017, 17 (19): 6451-6462

[9] Kingma DP, 2014, ADV NEUR IN, V27

[10] Lillicrap T.P., 2019, arXiv
