An RL-based scheduling algorithm for video traffic in high-rate wireless personal area networks

Cited by: 4

Authors:

Moradi, Shahab [1]

Mohsenian-Rad, A. Hamed [1]

Wong, Vincent W. S. [1]

Affiliations:

[1] University of British Columbia, Department of Electrical and Computer Engineering, Vancouver, BC V5Z 1M9, Canada

Source:

COMPUTER NETWORKS | 2009 / Vol. 53 / No. 18

Funding:

Natural Sciences and Engineering Research Council of Canada

Keywords:

Wireless personal area networks; Scheduling; QoS; Ultra-wide band (UWB); Markov decision process (MDP); Reinforcement learning; Reinforcement learning approach

DOI:

10.1016/j.comnet.2009.07.012

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Subject classification code:

0812

Abstract:

The emerging high-rate wireless personal area network (WPAN) technology is capable of supporting high-speed and high-quality real-time multimedia applications. In particular, video streams are deemed to be a dominant traffic type, and require quality of service (QoS) support. However, in the current IEEE 802.15.3 standard for MAC (media access control) of high-rate WPANs, the implementation details of some key issues such as scheduling and QoS provisioning have not been addressed. In this paper, we first propose a Markov decision process (MDP) model for optimal scheduling for video flows in high-rate WPANs. Using this model, we also propose a scheduler that incorporates compact state space representation, function approximation, and reinforcement learning (RL). Simulation results show that our proposed RL scheduler achieves nearly optimal performance and has a lower decoding failure rate than the F-SRPT, EDD + SRPT, and PAP scheduling algorithms. (C) 2009 Elsevier B.V. All rights reserved.


Pages: 2997-3010

Page count: 14

