Q-learning-based algorithms for dynamic transmission control in IoT equipment

Cited by: 3

Authors:

Malekijou, Hanieh [1]

Hakami, Vesal [1]

Javan, Nastooh Taheri [2]

Malekijoo, Amirhossein [3]

Affiliations:

[1] Iran Univ Sci & Technol, Sch Comp Engn, Tehran, Iran

[2] Imam Khomeini Int Univ, Comp Engn Dept, Qazvin, Iran

[3] Semnan Univ, Dept Elect & Comp Engn, Semnan, Iran

Source:

JOURNAL OF SUPERCOMPUTING | 2023 / Vol. 79 / Issue 01

Keywords:

Delay; Energy harvesting; Jitter; Transmission control; Markov decision process; Reinforcement learning; POWER ALLOCATION; ENERGY; COMPRESSION; COMMUNICATION; POLICY;

DOI:

10.1007/s11227-022-04643-9

Chinese Library Classification (CLC) number:

TP3 [Computing Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

We investigate an energy-harvesting IoT device that transmits delay- and jitter-sensitive data over a wireless fading channel. The sensory module on the device injects captured event packets into its transmission buffer and relies on the random supply of energy harvested from the environment to transmit them. Given the limited harvested energy, our goal is to compute optimal transmission control policies that decide how many packets should be transmitted from the buffer's head-of-line at each discrete timeslot such that a long-run criterion involving the average delay/jitter is either minimized or never exceeds a pre-specified threshold. We realistically assume that no advance knowledge is available regarding the random processes underlying the variations in the channel, captured events, or harvested energy dynamics. Instead, we utilize a suite of Q-learning-based techniques (from reinforcement learning theory) to optimize the transmission policy in a model-free fashion. In particular, we propose three Q-learning algorithms: a constrained Markov decision process (CMDP)-based algorithm for optimizing energy consumption under a delay constraint; an MDP-based algorithm for minimizing the average delay under the limitations imposed by the energy harvesting process; and a variance-penalized MDP-based algorithm for minimizing a linearly combined cost function consisting of both delay and delay variation. Extensive numerical results are presented for performance evaluation.
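
To make the model-free idea in the abstract concrete, the following is a minimal tabular Q-learning sketch, not the authors' algorithms. Everything in it is an illustrative assumption: the state is a (buffer length, battery level) pair, one energy unit sends one packet, arrivals and harvested energy are Bernoulli draws, the per-slot cost is the number of packets left waiting (a rough proxy for delay), and a discounted cost stands in for the long-run average criterion. The fading channel, the delay constraint, and the variance penalty are all omitted. The function and parameter names are hypothetical.

```python
import random


def q_learning_transmission(steps=20000, buf_max=5, bat_max=5,
                            alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Toy tabular Q-learning for per-slot transmission control (assumed model)."""
    rng = random.Random(seed)
    # Q[(buffer, battery)][a]: estimated discounted cost of sending a packets.
    # Feasible actions are 0..min(buffer, battery): the device cannot send
    # more packets than it holds or than its stored energy allows.
    Q = {(b, e): [0.0] * (min(b, e) + 1)
         for b in range(buf_max + 1) for e in range(bat_max + 1)}
    b, e = 0, 0
    for _ in range(steps):
        q_row = Q[(b, e)]
        # Epsilon-greedy choice; costs are minimized, so greedy means argmin.
        if rng.random() < epsilon:
            a = rng.randrange(len(q_row))
        else:
            a = min(range(len(q_row)), key=q_row.__getitem__)
        # Assumed dynamics. The learner only observes the sampled outcome,
        # never the arrival or harvest probabilities (model-free).
        arrival = 1 if rng.random() < 0.5 else 0
        harvest = 1 if rng.random() < 0.5 else 0
        next_b = min(b - a + arrival, buf_max)
        next_e = min(e - a + harvest, bat_max)
        cost = b - a  # packets still waiting after this slot
        # Standard Q-learning update, written for cost minimization.
        target = cost + gamma * min(Q[(next_b, next_e)])
        q_row[a] += alpha * (target - q_row[a])
        b, e = next_b, next_e
    return Q


def greedy_policy(Q):
    """Map each state to the action with the lowest learned cost."""
    return {s: min(range(len(q)), key=q.__getitem__) for s, q in Q.items()}
```

Calling `greedy_policy(q_learning_transmission())` returns a dictionary from (buffer, battery) states to a number of packets to send. Restricting each state's action list to feasible actions keeps the energy limitation out of the cost function, which is one simple way to respect the harvesting constraint in a sketch like this.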

Pages: 75-108

Page count: 34

50 items in total

[41] FQ-SAT: A fuzzy Q-learning-based MPQUIC scheduler for data transmission optimization
Nguyen, Thanh Trung
Vu, Minh Hai
Dinh, Thi Ha Ly
Nguyen, Thanh Hung
Nguyen, Phi Le
Nguyen, Kien
COMPUTER COMMUNICATIONS, 2024, 226
[42] Q-learning-based Model-free Swing Up Control of an Inverted Pendulum
Ghio, Alessio
Ramos, Oscar E.
PROCEEDINGS OF THE 2019 IEEE XXVI INTERNATIONAL CONFERENCE ON ELECTRONICS, ELECTRICAL ENGINEERING AND COMPUTING (INTERCON), 2019
[43] A Q-learning-based network content caching method
Chen, Haijun
Tan, Guanzheng
EURASIP JOURNAL ON WIRELESS COMMUNICATIONS AND NETWORKING, 2018
[45] A Q-learning-based algorithm for the block relocation problem
Liu, Liqun
Feng, Yuanjun
Zeng, Qingcheng
Chen, Zhijun
Li, Yaqiu
JOURNAL OF HEURISTICS, 2025, 31 (01)
[46] DTWN: Q-learning-based Transmit Power Control for Digital Twin WiFi Networks
Cakir L.V.
Huseynov K.
Ak E.
Canberk B.
EAI Endorsed Trans. Ind. Netw. Intell. Syst., 2022, 31
[47] Q-Learning-Based Model Predictive Control for Nonlinear Continuous-Time Systems
Zhang, Hao
Li, Shaoyuan
Zheng, Yi
INDUSTRIAL & ENGINEERING CHEMISTRY RESEARCH, 2020, 59 (40) : 17987 - 17999
[48] A Q-learning-based Downlink Power Control Algorithm for Energy Efficiency in LTE Femtocells
Huang, Lianfen
Wen, Bin
Gao, Zhibin
Cai, Hongxiang
Li, Yujie
MECHATRONICS ENGINEERING, COMPUTING AND INFORMATION TECHNOLOGY, 2014, 556-562 : 1766 - +
[49] Scheduling Multiobjective Dynamic Surgery Problems via Q-Learning-Based Meta-Heuristics
Yu, Hui
Gao, Kaizhou
Wu, Naiqi
Zhou, MengChu
Suganthan, Ponnuthurai N.
Wang, Shouguang
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2024, 54 (06) : 3321 - 3333
[50] A Q-learning-based memetic algorithm for multi-objective dynamic software project scheduling
Shen, Xiao-Ning
Minku, Leandro L.
Marturi, Naresh
Guo, Yi-Nan
Han, Ying
INFORMATION SCIENCES, 2018, 428 : 1 - 29