HRL-Based Access Control for Wireless Communications With Energy Harvesting

Times Cited: 3
Authors
Wang, Yingkai [1 ]
Wang, Qingshan [1 ,2 ]
Wang, Qi [1 ,3 ]
Zheng, Zhiwen [4 ]
Affiliations
[1] Hefei Univ Technol, Sch Math, Hefei 230001, Peoples R China
[2] Univ Sci & Technol China USTC, Comp Sci, Hefei, Peoples R China
[3] Hefei Univ Technol, Comp Sci, Hefei, Peoples R China
[4] Hefei Univ Technol, Informat & Comp Sci, Hefei, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Access control; Wireless communication; Energy harvesting; Batteries; Throughput; Task analysis; Fading channels; Neural network applications; decision-making; knowledge-based system; access control; energy harvesting; MULTIPLE-ACCESS; POWER-CONTROL; RESOURCE-ALLOCATION; TRANSMISSION; PREDICTION; NETWORKS; SYSTEMS;
DOI
10.1109/TASE.2023.3235316
CLC Number
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
This paper studies the access control problem of long-term throughput maximization in wireless communication systems with Energy Harvesting (EH). Many existing access schemes rely on accurate environmental information, such as channel state information and the EH process. However, acquiring environmental information is costly, and traditional access control frameworks are expensive to explore in high-dimensional spaces. Thus, an access control framework based on hierarchical reinforcement learning (HRL) is proposed in this paper. In HRL, the control problem in Markov decision process (MDP) form is decomposed into a multilevel sequential control problem comprising high-level channel number selection, mid-level channel selection, and low-level channel matching subproblems. The overall scheme is obtained by combining the solutions of the subproblems at different levels, which are solved in sequence. In addition, to improve learning efficiency, a deterministic action (DA) module and a prior knowledge (PK) module are put forward. The DA module solves the channel matching problem under the additional guidance given by the preceding subproblem, selecting definitively good low-level actions. The PK module provides the framework with common knowledge of the system structure learned from a hypothetical environment, so as to obtain better initial performance. Experimental results show that our framework achieves better performance and learning efficiency than several recent transmission schemes.
Note to Practitioners: Access control is an important issue in wireless communication systems, where users must be scheduled under tight resource constraints such as energy, which is usually supplied by batteries. In recent years, energy harvesting devices have been developed and applied to wireless communication systems to overcome this energy limitation. However, a system's energy harvesting capability is strongly influenced by the environment, so most traditional control schemes, which rely on prior knowledge of the environment, perform poorly. This paper therefore proposes a novel model-free access control framework based on hierarchical reinforcement learning (HRL) that maximizes system throughput without any prior environmental knowledge. The scheme decomposes the original control problem into three sub-control problems according to tasks and solves them sequentially, thereby simplifying the original problem. It not only learns autonomously but also does not depend on prior knowledge of the environment. Moreover, the method suits large-scale environments for which conventional end-to-end reinforcement learning is unsuitable. Compared with traditional algorithms, our method achieves better performance and higher learning efficiency.
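The three-level decomposition described above can be sketched in a few lines. The toy below is a hypothetical illustration, not the paper's implementation: the channel gains, battery levels, and reward model are invented, the high level uses a one-step tabular value update swept round-robin over candidate actions instead of the paper's deep HRL agents with exploration, and the mid and low levels use simple greedy/deterministic rules of the kind the DA module motivates.

```python
N_CHANNELS = 4
N_USERS = 3
GAINS = [0.9, 0.5, 0.7, 0.2]   # hypothetical channel power gains
BATTERY = [1.0, 0.5, 0.8]      # hypothetical harvested-energy levels
ALPHA = 0.5                    # learning rate for the value update

# High level: a value estimate for each candidate "number of channels".
q_high = [0.0] * (N_CHANNELS + 1)

def select_channels(k):
    """Mid level: pick the k channels with the largest gains."""
    return sorted(range(N_CHANNELS), key=lambda c: -GAINS[c])[:k]

def match_users(channels):
    """Low level: deterministically pair the fullest batteries with the
    best channels (the kind of rule a DA-style module would select)."""
    users = sorted(range(N_USERS), key=lambda u: -BATTERY[u])
    return list(zip(users, channels))  # surplus channels go unused

def throughput(pairs):
    """Toy reward: gain x available energy, summed over matched pairs."""
    return sum(GAINS[c] * BATTERY[u] for u, c in pairs)

# Training: evaluate every channel count in turn and move its value
# estimate toward the observed reward (one-step tabular update).
for _ in range(10):
    for k in range(N_CHANNELS + 1):
        reward = throughput(match_users(select_channels(k)))
        q_high[k] += ALPHA * (reward - q_high[k])

best_k = max(range(N_CHANNELS + 1), key=lambda k: q_high[k])
print(best_k, round(q_high[best_k], 3))
```

With these toy numbers the high level settles on activating three channels: the fourth (lowest-gain) channel has no user left to serve, so adding it brings no extra reward. In the paper, each level is instead a learned neural policy, and the PK module initializes the framework with structural knowledge learned from a hypothetical environment.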
Pages: 1000-1011
Page count: 12
Cited References
44 records
[1] Abel D. 2018, Proceedings of Machine Learning Research, Vol. 80.
[2] Ahmed, Imtiaz; Yan, Su; Rawat, Danda B.; Pu, Cong. Dynamic Resource Allocation for IRS Assisted Energy Harvesting Systems With Statistical Delay Constraint [J]. IEEE Transactions on Vehicular Technology, 2022, 71(2): 2158-2163.
[3] Ahmed, Imtiaz; Khoa Tran Phan; Tho Le-Ngoc. Optimal Stochastic Power Control for Energy Harvesting Systems With Delay Constraints [J]. IEEE Journal on Selected Areas in Communications, 2016, 34(12): 3512-3527.
[4] Al-Tous, Hanan; Barhumi, Imad. Reinforcement Learning Framework for Delay Sensitive Energy Harvesting Wireless Sensor Networks [J]. IEEE Sensors Journal, 2021, 21(5): 7103-7113.
[5] Amirnavaei, Fatemeh; Dong, Min. Online Power Control Optimization for Wireless Transmission With Energy Harvesting and Storage [J]. IEEE Transactions on Wireless Communications, 2016, 15(7): 4888-4901.
[6] Aoudia, Faycal Ait; Gautier, Matthieu; Berder, Olivier. RLMan: An Energy Manager Based on Reinforcement Learning for Energy Harvesting Wireless Sensor Networks [J]. IEEE Transactions on Green Communications and Networking, 2018, 2(2): 408-417.
[7] Aprem, Anup; Murthy, Chandra R.; Mehta, Neelesh B. Transmit Power Control Policies for Energy Harvesting Sensors With Retransmissions [J]. IEEE Journal of Selected Topics in Signal Processing, 2013, 7(5): 895-906.
[8] Arulkumaran K. 2019, arXiv.
[9] Berner C. 2019, Dota 2 with Large Scale Deep Reinforcement Learning.
[10] Blasco, Pol; Guenduez, Deniz; Dohler, Mischa. A Learning Theoretic Approach to Energy Harvesting Communication System Optimization [J]. IEEE Transactions on Wireless Communications, 2013, 12(4): 1872-1882.