Q-learning-based algorithms for dynamic transmission control in IoT equipment

Cited by: 3
Authors
Malekijou, Hanieh [1]
Hakami, Vesal [1]
Javan, Nastooh Taheri [2]
Malekijoo, Amirhossein [3]
Affiliations
[1] Iran Univ Sci & Technol, Sch Comp Engn, Tehran, Iran
[2] Imam Khomeini Int Univ, Comp Engn Dept, Qazvin, Iran
[3] Semnan Univ, Dept Elect & Comp Engn, Semnan, Iran
Source
JOURNAL OF SUPERCOMPUTING | 2023, Vol. 79, Issue 01
Keywords
Delay; Energy harvesting; Jitter; Transmission control; Markov decision process; Reinforcement learning; POWER ALLOCATION; ENERGY; COMPRESSION; COMMUNICATION; POLICY;
DOI
10.1007/s11227-022-04643-9
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
We investigate an energy-harvesting IoT device transmitting (delay/jitter)-sensitive data over a wireless fading channel. The sensory module on the device injects captured event packets into its transmission buffer and relies on the random supply of the energy harvested from the environment to transmit them. Given the limited harvested energy, our goal is to compute optimal transmission control policies that decide on how many packets of data should be transmitted from the buffer's head-of-line at each discrete timeslot such that a long-run criterion involving the average delay/jitter is either minimized or never exceeds a pre-specified threshold. We realistically assume that no advance knowledge is available regarding the random processes underlying the variations in the channel, captured events, or harvested energy dynamics. Instead, we utilize a suite of Q-learning-based techniques (from the reinforcement learning theory) to optimize the transmission policy in a model-free fashion. In particular, we come up with three Q-learning algorithms: a constrained Markov decision process (CMDP)-based algorithm for optimizing energy consumption under a delay constraint, an MDP-based algorithm for minimizing the average delay under the limitations imposed by the energy harvesting process, and finally, a variance-penalized MDP-based algorithm to minimize a linearly combined cost function consisting of both delay and delay variation. Extensive numerical results are presented for performance evaluation.
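As a rough illustration of the model-free idea in the abstract, the sketch below runs tabular Q-learning on a toy version of the delay-minimization setting. It is not the authors' algorithm. Everything concrete in it is an assumption made for illustration: the state layout (buffer level, battery level, channel state), the capacities, the arrival and harvesting probabilities, the per-packet energy costs, the use of buffer occupancy as a stand-in for delay, and the discounted-cost update (the abstract speaks of a long-run average criterion).

```python
import random

# Toy sizes and rates below are assumptions for illustration, not values from the paper.
B, E = 5, 5                       # buffer capacity (packets), battery capacity (energy units)
ENERGY_PER_PACKET = {0: 2, 1: 1}  # channel state (0 = bad, 1 = good) -> energy per packet

def feasible_actions(state):
    """Number of packets that can be sent given buffer, battery, and channel state."""
    b, e, h = state
    return range(min(b, e // ENERGY_PER_PACKET[h]) + 1)

def step(state, a, rng):
    """Simulated environment; the learner never reads these probabilities directly."""
    b, e, h = state
    b_after = b - a
    e_after = e - a * ENERGY_PER_PACKET[h]
    arrival = 1 if rng.random() < 0.5 else 0   # one new event packet with prob. 0.5
    harvest = 1 if rng.random() < 0.6 else 0   # one harvested energy unit with prob. 0.6
    dropped = max(0, b_after + arrival - B)    # overflow when the buffer is full
    next_state = (min(B, b_after + arrival), min(E, e_after + harvest), rng.randrange(2))
    cost = b_after + 10 * dropped              # backlog as a delay proxy, plus drop penalty
    return next_state, cost

def q_learning(steps=20000, alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    """Tabular, epsilon-greedy Q-learning that minimizes discounted cost."""
    rng = random.Random(seed)
    q = {}                                     # (state, action) -> estimated cost-to-go
    state = (0, E, 1)
    for _ in range(steps):
        actions = list(feasible_actions(state))
        if rng.random() < eps:
            a = rng.choice(actions)            # explore
        else:
            a = min(actions, key=lambda x: q.get((state, x), 0.0))  # exploit
        nxt, cost = step(state, a, rng)
        best_next = min(q.get((nxt, x), 0.0) for x in feasible_actions(nxt))
        old = q.get((state, a), 0.0)
        q[(state, a)] = old + alpha * (cost + gamma * best_next - old)
        state = nxt
    return q

def greedy_action(q, state):
    """Packets to send in `state` under the learned Q-table."""
    return min(feasible_actions(state), key=lambda x: q.get((state, x), 0.0))
```

Calling `q = q_learning()` and then `greedy_action(q, (3, 4, 1))` returns how many packets the learned policy would send with three buffered packets, four energy units, and a good channel. The constrained (CMDP) and variance-penalized variants named in the abstract would need a different cost structure and are not covered by this sketch.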
Pages: 75-108
Page count: 34