Optimality of Myopic Sensing in Multichannel Opportunistic Access

Cited by: 223

Authors:

Ahmad, Sahand Haji Ali [1]

Liu, Mingyan [1]

Javidi, Tara [2]

Zhao, Qing [3]

Krishnamachari, Bhaskar [4]

Affiliations:

[1] Univ Michigan, Dept Elect Engn & Comp Sci, Ann Arbor, MI 48105 USA

[2] Univ Calif San Diego, Dept Elect & Comp Engn, La Jolla, CA 92093 USA

[3] Univ Calif Davis, Dept Elect & Comp Engn, Davis, CA 95616 USA

[4] Univ So Calif, Ming Hsieh Dept Elect Engn, Los Angeles, CA 90089 USA

Source:

IEEE TRANSACTIONS ON INFORMATION THEORY | 2009 / Vol. 55 / No. 09

Funding:

U.S. National Science Foundation;

Keywords:

Cognitive radio; Gittins index; myopic policy; opportunistic access; partially observed Markov decision process (POMDP); restless bandit; Whittle's index; RESTLESS BANDITS; SPECTRUM ACCESS; NETWORKS;

DOI:

10.1109/TIT.2009.2025561

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

This paper considers opportunistic communication over multiple channels where the state ("good" or "bad") of each channel evolves as independent and identically distributed (i.i.d.) Markov processes. A user, with limited channel sensing capability, chooses one channel to sense and decides whether to use the channel (based on the sensing result) in each time slot. A reward is obtained whenever the user senses and accesses a "good" channel. The objective is to design a channel selection policy that maximizes the expected total (discounted or average) reward accrued over a finite or infinite horizon. This problem can be cast as a partially observed Markov decision process (POMDP) or a restless multiarmed bandit process, to which optimal solutions are often intractable. This paper shows that a myopic policy that maximizes the immediate one-step reward is optimal when the state transitions are positively correlated over time. When the state transitions are negatively correlated, we show that the same policy is optimal when the number of channels is limited to two or three, while presenting a counterexample for the case of four channels. This result finds applications in opportunistic transmission scheduling in a fading environment, cognitive radio networks for spectrum overlay, and resource-constrained jamming and antijamming.
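As an illustration of the myopic policy described in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes a two-state ("good"/"bad") Markov channel model with hypothetical parameter names `p11` (probability a good channel stays good) and `p01` (probability a bad channel becomes good), a unit reward per good channel sensed, and perfect sensing. None of these names or values come from the record itself. The user keeps a belief (probability of being good) per channel and always senses the channel with the highest belief.

```python
import random


def propagate(belief, p11, p01):
    """One-step prediction of P(good) for a channel that was not observed."""
    return belief * p11 + (1.0 - belief) * p01


def myopic_choice(beliefs):
    """Index of the channel with the highest belief (first index on ties)."""
    return max(range(len(beliefs)), key=lambda i: beliefs[i])


def update_beliefs(beliefs, sensed, observed_good, p11, p01):
    """Beliefs for the next slot after sensing one channel."""
    new = [propagate(b, p11, p01) for b in beliefs]
    # The sensed channel's state is known, so its next belief is p11 or p01.
    new[sensed] = p11 if observed_good else p01
    return new


def simulate(n_channels, horizon, p11, p01, seed=0):
    """Total (undiscounted) reward of the myopic policy over `horizon` slots.

    Assumes 1 - p11 + p01 > 0 so the stationary distribution is well defined.
    """
    rng = random.Random(seed)
    stationary = p01 / (1.0 - p11 + p01)  # stationary P(good)
    states = [rng.random() < stationary for _ in range(n_channels)]
    beliefs = [stationary] * n_channels
    total = 0
    for _ in range(horizon):
        k = myopic_choice(beliefs)
        good = states[k]
        total += 1 if good else 0  # unit reward (an assumption)
        beliefs = update_beliefs(beliefs, k, good, p11, p01)
        # All channels evolve independently with the same transition law.
        states = [rng.random() < (p11 if s else p01) for s in states]
    return total
```

Calling `simulate(3, 100, 0.8, 0.2)` runs the policy on three channels in the positively correlated regime (`p11 >= p01`), which is the case in which the abstract states the myopic policy is optimal. The sketch only simulates the policy; it does not verify optimality.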


Pages: 4040-4050

Page count: 11

29 references in total

[1] [Anonymous], 2008, Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data

[2] [Anonymous], OPER RES

[3] DISCRETE-TIME CONTROLLED MARKOV-PROCESSES WITH AVERAGE COST CRITERION - A SURVEY [J]. ARAPOSTATHIS, A; BORKAR, VS; FERNANDEZGAUCHERAND, E; GHOSH, MK; MARCUS, SI. SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 1993, 31 (02): 282-344

[4] Restless bandits, linear programming relaxations, and a primal-dual index heuristic [J]. Bertsimas, D; Niño-Mora, J. OPERATIONS RESEARCH, 2000, 48 (01): 80-90

[5] Chang NB, 2007, MOBICOM'07: PROCEEDINGS OF THE THIRTEENTH ACM INTERNATIONAL CONFERENCE ON MOBILE COMPUTING AND NETWORKING, P27

[6] EHSAN N, IEEE T WIRE IN PRESS

[7] Fernandez-Gaucherand E., 1991, Annals of Operations Research, V29, P439, DOI 10.1007/BF02283610

[8] GANTI A, 2003, THESIS MIT CAMBRIDGE

[9] Optimal transmission scheduling in symmetric communication models with intermittent connectivity [J]. Ganti, Anand; Modiano, Eytan; Tsitsiklis, John N. IEEE TRANSACTIONS ON INFORMATION THEORY, 2007, 53 (03): 998-1008

[10] GITTINS JC, 1972, J ROY STAT SOC, V14, P148
