Reinforcement Learning Exploration Algorithms for Energy Harvesting Communications Systems

Times cited: 0
Authors
Masadeh, Ala'eddin [1]
Wang, Zhengdao [1]
Kamal, Ahmed E. [1]
Affiliations
[1] ISU, Ames, IA 50011 USA
Source
2018 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC) | 2018
Funding
National Science Foundation (USA);
Keywords
Energy harvesting communications; Markov decision process; Reinforcement learning; Exploration; Exploitation;
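DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification codes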
0808 ; 0809 ;
Abstract
Prolonging lifetime and maximizing throughput are important goals in designing an efficient communications system, especially an energy harvesting-based one. This work investigates the problem of maximizing the throughput of a point-to-point energy harvesting communications system while prolonging its lifetime. It considers a more realistic setting in which the system has no a priori knowledge of the environment. The system consists of a transmitter and a receiver. The transmitter is equipped with an infinite buffer to store data and with an energy harvesting capability to harvest renewable energy and store it in a finite battery. The problem of finding an efficient power allocation policy is formulated as a reinforcement learning problem. Two exploration algorithms are used: the convergence-based algorithm and the epsilon-greedy algorithm. The first uses the action-value function convergence error and an exploration time threshold to balance exploration and exploitation. The second balances them through the exploration probability (i.e., epsilon). Simulation results show that the convergence-based algorithm outperforms the epsilon-greedy algorithm. The effects of each algorithm's parameters are then investigated.
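For readers unfamiliar with the epsilon-greedy rule named in the abstract, the following is a minimal generic sketch of it, not the authors' implementation. The function name `epsilon_greedy`, the list `q_values` (one estimated action value per candidate transmit power level), and the sample numbers are assumptions made for illustration only.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index (hypothetical helper, not from the paper).

    With probability epsilon the action is drawn uniformly at random
    (exploration); otherwise the action with the highest estimated
    value is returned (exploitation).
    """
    if rng.random() < epsilon:
        # Explore: pick any action with equal probability.
        return rng.randrange(len(q_values))
    # Exploit: pick the best-known action (first one on ties).
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Illustrative values only: three candidate power levels.
q = [0.1, 0.9, 0.3]
action = epsilon_greedy(q, epsilon=0.1)
```

A larger epsilon spends more time trying power levels that currently look worse, while a smaller epsilon commits earlier to the current estimate; this is the trade-off the abstract refers to as the exploration probability.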
Pages: 6
Related papers
17 in total
[1]  
[Anonymous], IEEE T NEURAL NETWOR
[2]  
[Anonymous], 2016 IEEE S SER COMP
[3]  
[Anonymous], 2015, Reinforcement Learning: An Introduction
[4]  
[Anonymous], 2010, Algorithms for Reinforcement Learning
[5]   A Learning Theoretic Approach to Energy Harvesting Communication System Optimization [J].
Blasco, Pol ;
Guenduez, Deniz ;
Dohler, Mischa .
IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2013, 12 (04) :1872-1882
[6]  
Emre M, 2015, IEEE INT CONF COMM, P2799, DOI 10.1109/ICCW.2015.7247603
[7]  
Heidrich-Meisner V., 2007, ESANN, P277
[8]  
Liu H, 2016, Adv Inform Managemen, P1, DOI 10.1109/IMCEC.2016.7867101
[9]   Average reward reinforcement learning: Foundations, algorithms, and empirical results [J].
Mahadevan, S .
MACHINE LEARNING, 1996, 22 (1-3) :159-195
[10]   Reinforcement Learning for Energy Harvesting Point-to-Point Communications [J].
Ortiz, Andrea ;
Al-Shatri, Hussein ;
Li, Xiang ;
Weber, Tobias ;
Klein, Anja .
2016 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2016,