Q-learning-based algorithms for dynamic transmission control in IoT equipment

Cited by: 3
|
Authors
Malekijou, Hanieh [1 ]
Hakami, Vesal [1 ]
Javan, Nastooh Taheri [2 ]
Malekijoo, Amirhossein [3 ]
Affiliations
[1] Iran Univ Sci & Technol, Sch Comp Engn, Tehran, Iran
[2] Imam Khomeini Int Univ, Comp Engn Dept, Qazvin, Iran
[3] Semnan Univ, Dept Elect & Comp Engn, Semnan, Iran
Source
JOURNAL OF SUPERCOMPUTING | 2023, Vol. 79, Issue 1
Keywords
Delay; Energy harvesting; Jitter; Transmission control; Markov decision process; Reinforcement learning; POWER ALLOCATION; ENERGY; COMPRESSION; COMMUNICATION; POLICY;
DOI
10.1007/s11227-022-04643-9
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology]
Discipline code
0812
Abstract
We investigate an energy-harvesting IoT device transmitting (delay/jitter)-sensitive data over a wireless fading channel. The sensory module on the device injects captured event packets into its transmission buffer and relies on the random supply of energy harvested from the environment to transmit them. Given the limited harvested energy, our goal is to compute optimal transmission control policies that decide how many packets should be transmitted from the buffer's head-of-line at each discrete timeslot such that a long-run criterion involving the average delay/jitter is either minimized or never exceeds a pre-specified threshold. We realistically assume that no advance knowledge is available regarding the random processes underlying the variations in the channel, captured events, or harvested energy dynamics. Instead, we utilize a suite of Q-learning-based techniques (from reinforcement learning theory) to optimize the transmission policy in a model-free fashion. In particular, we develop three Q-learning algorithms: a constrained Markov decision process (CMDP)-based algorithm for optimizing energy consumption under a delay constraint, an MDP-based algorithm for minimizing the average delay under the limitations imposed by the energy harvesting process, and finally, a variance-penalized MDP-based algorithm to minimize a linearly combined cost function consisting of both delay and delay variation. Extensive numerical results are presented for performance evaluation.
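As a rough illustration of the model-free setting the abstract describes (not the paper's actual algorithms), a tabular Q-learning loop for a toy transmission-control problem might look like the sketch below. The state is a hypothetical (queue length, battery level) pair, the action is the number of head-of-line packets to transmit per slot, and all dynamics (arrival/harvest probabilities, the backlog-as-delay cost) are invented for illustration.

```python
import random

# Illustrative sketch only: tabular Q-learning for a simplified
# transmission-control problem. State = (queue length, battery level);
# action = number of packets to transmit this slot, bounded by both.
Q_MAX, B_MAX, A_MAX = 5, 5, 2        # queue cap, battery cap, max tx per slot
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1   # learning rate, discount, exploration rate

# Q-table over all (queue, battery) states and all actions.
Q = {(q, b): [0.0] * (A_MAX + 1)
     for q in range(Q_MAX + 1) for b in range(B_MAX + 1)}

def feasible_actions(q, b):
    # Cannot send more packets than are queued, or than the battery can
    # power (assuming 1 energy unit per packet).
    return range(min(q, b, A_MAX) + 1)

def step(q, b, a, rng):
    """Toy environment: transmit a packets, then random arrival/harvest."""
    q -= a                                      # transmit a packets
    b -= a                                      # spend 1 energy unit each
    q = min(Q_MAX, q + (rng.random() < 0.4))    # random packet arrival
    b = min(B_MAX, b + (rng.random() < 0.5))    # random energy harvest
    return (q, b), q                            # cost = backlog (delay proxy)

def choose(state, rng):
    # Epsilon-greedy over feasible actions; costs, so pick the smallest Q.
    acts = list(feasible_actions(*state))
    if rng.random() < EPS:
        return rng.choice(acts)
    return min(acts, key=lambda a: Q[state][a])

rng = random.Random(0)
state = (0, B_MAX)
for _ in range(50_000):
    a = choose(state, rng)
    nxt, cost = step(*state, a, rng)
    best_next = min(Q[nxt][a2] for a2 in feasible_actions(*nxt))
    # Standard Q-learning update, cost-minimization form.
    Q[state][a] += ALPHA * (cost + GAMMA * best_next - Q[state][a])
    state = nxt

# With a full queue and full battery, the learned policy typically transmits.
greedy = min(feasible_actions(Q_MAX, B_MAX), key=lambda a: Q[(Q_MAX, B_MAX)][a])
print(greedy)
```

The paper's CMDP and variance-penalized variants would replace the scalar backlog cost with a constrained or delay-variance-augmented objective; this sketch only shows the shared model-free update.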
Pages: 75-108
Page count: 34
Related papers
50 records
  • [2] Q-learning-based dynamic joint control of interference and transmission opportunities for cognitive radio
    Jang, Sung-Jeen
    Yoo, Sang-Jo
    EURASIP JOURNAL ON WIRELESS COMMUNICATIONS AND NETWORKING, 2018,
  • [4] Q-learning-based multirate transmission control scheme for RRM in multimedia WCDMA systems
    Chen, YS
    Chang, CJ
    Ren, FC
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2004, 53 (01) : 38 - 48
  • [5] Deep Federated Q-Learning-Based Network Slicing for Industrial IoT
    Messaoud, Seifeddine
    Bradai, Abbas
    Ben Ahmed, Olfa
    Pham Tran Anh Quang
    Atri, Mohamed
    Hossain, M. Shamim
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2021, 17 (08) : 5572 - 5582
  • [6] Q-learning-based H∞ control for LPV systems
    Wang, Hongye
    Wen, Jiwei
    Wan, Haiying
    Xue, Huiwen
    ASIAN JOURNAL OF CONTROL, 2024,
  • [7] Deep Q-Learning-Based Dynamic Management of a Robotic Cluster
    Gautier, Paul
    Laurent, Johann
    Diguet, Jean-Philippe
    IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2023, 20 (04) : 2503 - 2515
  • [8] A Q-learning-based multi-rate transmission control scheme for RRC in WCDMA systems
    Ren, FC
    Chang, CJ
    Chen, YS
    13TH IEEE INTERNATIONAL SYMPOSIUM ON PERSONAL, INDOOR AND MOBILE RADIO COMMUNICATIONS, VOL 1-5, PROCEEDINGS: SAILING THE WAVES OF THE WIRELESS OCEANS, 2002, : 1422 - 1426
  • [9] Deep Q-Learning-Based Transmission Power Control of a High Altitude Platform Station with Spectrum Sharing
    Jo, Seongjun
    Yang, Wooyeol
    Choi, Haing Kun
    Noh, Eonsu
    Jo, Han-Shin
    Park, Jaedon
    SENSORS, 2022, 22 (04)
  • [10] Iterative Q-Learning-Based Nonlinear Optimal Tracking Control
    Wei, Qinglai
    Song, Ruizhuo
    Xu, Yancai
    Liu, Derong
    PROCEEDINGS OF 2016 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2016,