Stochastic Power Adaptation with Multiagent Reinforcement Learning for Cognitive Wireless Mesh Networks

Cited by: 57
Authors
Chen, Xianfu [1]
Zhao, Zhifeng [2]
Zhang, Honggang [2]
Affiliations
[1] VTT Tech Res Ctr Finland, FI-90571 Oulu, Finland
[2] Zhejiang Univ, Dept Informat Sci & Elect Engn, Hangzhou 310027, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Cognitive radio; resource allocation; algorithm/protocol design and analysis; reinforcement learning; RADIO; GAME
DOI
10.1109/TMC.2012.178
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
As the scarce spectrum resource becomes overcrowded, cognitive radio offers great flexibility to improve spectrum efficiency by opportunistically accessing the authorized frequency bands. One of the critical challenges in operating such radios in a network is how to efficiently allocate transmission power and frequency resources among the secondary users (SUs) while satisfying the quality-of-service constraints of the primary users. In this paper, we focus on the noncooperative power allocation problem in cognitive wireless mesh networks formed by a number of clusters, taking energy efficiency into account. Because of the SUs' dynamic and spontaneous properties, the problem is modeled as a stochastic learning process. We first extend single-agent Q-learning to a multiuser context, and then propose a conjecture-based multiagent Q-learning algorithm that achieves the optimal transmission strategies with only private and incomplete information. An intelligent SU performs Q-function updates based on its conjecture about the other SUs' stochastic behaviors. This learning algorithm provably converges under certain restrictions that arise during the learning procedure. Simulation experiments verify the performance of the algorithm and demonstrate its effectiveness in improving energy efficiency.
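The multiuser Q-learning setting mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's conjecture-based algorithm, whose update rule is not given in this record. It is plain independent Q-learning, in which each SU learns from its own reward only. The power levels, channel gains, noise power, reward definition (Shannon rate per unit of transmit power), and all function names are assumptions chosen for illustration.

```python
import math
import random

# All constants below are hypothetical values, not taken from the paper.
POWER_LEVELS = [0.1, 0.5, 1.0]       # candidate transmit powers
NOISE = 0.1                          # noise power
GAIN_DIRECT, GAIN_CROSS = 1.0, 0.3   # direct and interfering channel gains


def energy_efficiency(own_power, other_powers):
    """Illustrative reward: Shannon rate divided by transmit power."""
    interference = GAIN_CROSS * sum(other_powers)
    sinr = GAIN_DIRECT * own_power / (NOISE + interference)
    return math.log2(1.0 + sinr) / own_power


def choose(q_row, epsilon, rng):
    """Epsilon-greedy choice of a power-level index."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def q_update(q_row, action, reward, alpha=0.1, gamma=0.9):
    """Standard Q-learning update for a single-state problem."""
    target = reward + gamma * max(q_row)
    q_row[action] += alpha * (target - q_row[action])


def run(num_users=3, steps=2000, epsilon=0.1, seed=0):
    """Each SU keeps a private Q-table and sees only its own reward."""
    rng = random.Random(seed)
    q = [[0.0] * len(POWER_LEVELS) for _ in range(num_users)]
    for _ in range(steps):
        actions = [choose(q[i], epsilon, rng) for i in range(num_users)]
        powers = [POWER_LEVELS[a] for a in actions]
        for i in range(num_users):
            others = powers[:i] + powers[i + 1:]
            q_update(q[i], actions[i], energy_efficiency(powers[i], others))
    return q


if __name__ == "__main__":
    for user, row in enumerate(run()):
        print(user, [round(v, 3) for v in row])
```

In this sketch each SU treats the other users as part of the environment. A conjecture-based variant would instead maintain an explicit belief about the other SUs' strategies and use that belief in the update.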
Pages: 2155-2166
Page count: 12