Cell selection for open-access femtocell networks: Learning in changing environment

Cited: 5
Authors
Dhahri, Chaima [1 ]
Ohtsuki, Tomoaki [2 ]
Affiliations
[1] Keio Univ, Dept Comp & Informat Sci, Yokohama, Kanagawa 2238522, Japan
[2] Keio Univ, Fac Sci & Technol, Yokohama, Kanagawa 2238522, Japan
Keywords
Femtocell networks; Cell selection; Learning in dynamic environment; Multi-armed bandit (MAB); Meta-bandit;
DOI
10.1016/j.phycom.2014.04.008
CLC Classification
TM [Electrical engineering]; TN [Electronic and communication technology];
Discipline Codes
0808; 0809;
Abstract
This paper addresses the problem of cell selection in dynamic open-access femtocell networks. We model this problem as a decentralized restless multi-armed bandit (MAB) problem with unknown dynamics and multiple players. Each channel is modeled as an arbitrary finite-state Markov chain with its own state space and statistics. Each user tries to learn the best channel, i.e., the one that maximizes its capacity while reducing its number of handovers. This is a classic exploration/exploitation problem in which the reward of each channel is Markovian. In addition, the reward process is restless because the state of each Markov chain evolves independently of the user's actions, which leads to a decentralized restless bandit problem. To solve it, we refer to the decentralized restless upper confidence bound (RUCB) algorithm, which achieves logarithmic regret over time for the MAB problem (proposal 1). We then extend this algorithm to cope with a dynamic environment by applying a change-point detection test based on the Page-Hinkley test (PHT) (proposal 2). However, this test wastes time whenever a detected change point is actually a false alarm. To address this problem, we extend our previous proposal with a meta-bandit algorithm (proposal 3) that resolves the dilemma between exploration and exploitation after a change point is detected. Simulation results show that our proposals come close to the performance of the opportunistic method in terms of capacity while requiring a lower average number of handovers. The use of a change-point test and a meta-bandit algorithm yields better capacity than RUCB, particularly in a changing environment. (C) 2014 Elsevier B.V. All rights reserved.
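The two learning ingredients the abstract describes, a UCB-style index for ranking channels and Page-Hinkley change-point detection on the reward stream, can be sketched as below. This is a minimal illustration, not the authors' exact RUCB or PHT formulation; the parameter names and default values (`c`, `delta`, `threshold`) are assumptions chosen for the sketch.

```python
import math

def ucb_index(mean_reward, n_pulls, t, c=2.0):
    """UCB-style index: empirical mean plus an exploration bonus that
    shrinks as a channel is sampled more often."""
    return mean_reward + math.sqrt(c * math.log(t) / n_pulls)

class PageHinkley:
    """Page-Hinkley test: flags an upward shift in the mean of a reward
    stream when the cumulative deviation rises `threshold` above its
    running minimum."""

    def __init__(self, delta=0.05, threshold=1.0):
        self.delta = delta          # magnitude of tolerated drift
        self.threshold = threshold  # detection threshold (lambda)
        self.mean = 0.0             # running empirical mean
        self.n = 0                  # samples seen so far
        self.cum = 0.0              # cumulative deviation m_t
        self.cum_min = 0.0          # running minimum M_t

    def update(self, x):
        """Feed one reward sample; return True once a change is detected."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold
```

On a reward stream whose mean jumps (e.g., from 0 to 1), `update` starts returning `True` within a few samples of the jump; in the paper's setting such a detection would trigger the restart or meta-bandit arbitration between the old and new policies.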
Pages: 42-52
Page count: 11