Deep Reinforcement Learning With Spatio-Temporal Traffic Forecasting for Data-Driven Base Station Sleep Control

Times Cited: 70
Authors
Wu, Qiong [1 ]
Chen, Xu [1 ]
Zhou, Zhi [1 ]
Chen, Liang [2 ]
Zhang, Junshan [3 ]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou 510006, Peoples R China
[2] Tencent Inc, Dept Financial Technol, Shenzhen, Peoples R China
[3] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85287 USA
Keywords
Base stations; Forecasting; Correlation; Cellular networks; Quality of service; Energy consumption; Switches; Base station sleep control; spatio-temporal traffic forecasting; deep reinforcement learning; energy-delay tradeoffs
DOI
10.1109/TNET.2021.3053771
CLC Number
TP3 [Computing Technology; Computer Technology]
Discipline Code
0812
Abstract
To meet the ever-increasing mobile traffic demand of the 5G era, base stations (BSs) have been densely deployed in radio access networks (RANs) to increase network coverage and capacity. However, because this high BS density is provisioned for peak traffic, keeping all BSs active during off-peak periods consumes an unnecessarily large amount of energy. An effective way to reduce the energy consumption of cellular networks is to deactivate idle BSs that serve no traffic demand. In this paper, we develop a traffic-aware dynamic BS sleep control framework, named DeepBSC, which presents a novel data-driven learning approach to determine BS active/sleep modes that achieve low energy consumption while satisfying Quality of Service (QoS) requirements. Specifically, traffic demands are predicted by the proposed GS-STN model, which leverages the geographical and semantic spatio-temporal correlations of mobile traffic. With accurate mobile traffic forecasting, the BS sleep control problem is cast as a Markov Decision Process that is solved by actor-critic reinforcement learning. To reduce the variance of cost estimation in the dynamic environment, we propose a benchmark transformation method that provides a robust performance indicator for policy updates. To expedite the training process, we adopt a Deep Deterministic Policy Gradient (DDPG) approach together with an explorer network, which further strengthens exploration. Extensive experiments with a real-world dataset corroborate that our proposed framework significantly outperforms existing methods.
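For readers who want a concrete picture of the actor-critic training step that underlies such a framework, the sketch below shows a generic DDPG update in PyTorch. It is an illustrative assumption, not the authors' DeepBSC implementation: the state and action dimensions, network sizes, and the ddpg_update helper are invented for the example, and the paper's GS-STN forecaster, benchmark transformation, and explorer network are not reproduced here.

# Minimal DDPG sketch (illustrative only; not the DeepBSC code from the paper).
import torch
import torch.nn as nn

STATE_DIM = 8    # e.g., forecast traffic load per region (assumed dimension)
ACTION_DIM = 4   # e.g., per-BS activation levels in [0, 1] (assumed dimension)
GAMMA, TAU = 0.99, 0.005

def mlp(in_dim, out_dim, out_act=None):
    layers = [nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim)]
    if out_act is not None:
        layers.append(out_act)
    return nn.Sequential(*layers)

# Actor maps a state to a deterministic action; critic scores state-action pairs.
actor = mlp(STATE_DIM, ACTION_DIM, nn.Sigmoid())
critic = mlp(STATE_DIM + ACTION_DIM, 1)
actor_tgt = mlp(STATE_DIM, ACTION_DIM, nn.Sigmoid())
critic_tgt = mlp(STATE_DIM + ACTION_DIM, 1)
actor_tgt.load_state_dict(actor.state_dict())
critic_tgt.load_state_dict(critic.state_dict())
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def ddpg_update(s, a, r, s2):
    """One DDPG step on a batch of transitions (state, action, reward, next state).
    In a sleep-control setting the reward would be the negative operation cost."""
    # Critic: regress Q(s, a) toward the bootstrapped target from the target networks.
    with torch.no_grad():
        target = r + GAMMA * critic_tgt(torch.cat([s2, actor_tgt(s2)], dim=-1))
    critic_loss = nn.functional.mse_loss(critic(torch.cat([s, a], dim=-1)), target)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Actor: maximize the critic's estimate of the actor's own actions.
    actor_loss = -critic(torch.cat([s, actor(s)], dim=-1)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

    # Slowly track the online networks with the target networks (soft update).
    for net, tgt in ((actor, actor_tgt), (critic, critic_tgt)):
        for p, p_tgt in zip(net.parameters(), tgt.parameters()):
            p_tgt.data.mul_(1 - TAU).add_(TAU * p.data)

# Toy batch of random transitions, only to show the call signature.
s = torch.randn(32, STATE_DIM)
a = actor(s).detach()
r = torch.randn(32, 1)
s2 = torch.randn(32, STATE_DIM)
ddpg_update(s, a, r, s2)

In the paper's setting, DDPG is chosen over stochastic policy-gradient methods because the control action is continuous and deterministic per decision epoch; the explorer network described in the abstract augments this baseline with additional exploration, which the sketch above does not model.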
Pages: 935-948
Number of Pages: 14
相关论文
共 42 条
[1]  
Marsan MA, 2009, 2009 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATION WORKSHOPS, VOLS 1 AND 2, P438
[2]   A multi-source dataset of urban life in the city of Milan and the Province of Trentino [J].
Barlacchi, Gianni ;
De Nadai, Marco ;
Larcher, Roberto ;
Casella, Antonio ;
Chitic, Cristiana ;
Torrisi, Giovanni ;
Antonelli, Fabrizio ;
Vespignani, Alessandro ;
Pentland, Alex ;
Lepri, Bruno .
SCIENTIFIC DATA, 2015, 2
[3]  
Bergstra J, 2012, J MACH LEARN RES, V13, P281
[4]   Power consumption model for macrocell and microcell base stations [J].
Deruyck, Margot ;
Joseph, Wout ;
Martens, Luc .
TRANSACTIONS ON EMERGING TELECOMMUNICATIONS TECHNOLOGIES, 2014, 25 (03) :320-333
[5]  
Furno A, 2017, IEEE INFOCOM SER
[6]  
Graves A, 2012, STUD COMPUT INTELL, V385, P1, DOI [10.1162/neco.1997.9.8.1735, 10.1162/neco.1997.9.1.1, 10.1007/978-3-642-24797-2]
[7]   OPTIMAL OPERATING POLICIES FOR M/G/1 QUEUING SYSTEMS [J].
HEYMAN, DP .
OPERATIONS RESEARCH, 1968, 16 (02) :362-&
[8]  
Hochreiter S., FIELD GUIDE DYNAMICA
[9]   MULTILAYER FEEDFORWARD NETWORKS ARE UNIVERSAL APPROXIMATORS [J].
HORNIK, K ;
STINCHCOMBE, M ;
WHITE, H .
NEURAL NETWORKS, 1989, 2 (05) :359-366
[10]   A Study of Deep Learning Networks on Mobile Traffic Forecasting [J].
Huang, Chih-Wei ;
Chiang, Chiu-Ti ;
Li, Qiuhui .
2017 IEEE 28TH ANNUAL INTERNATIONAL SYMPOSIUM ON PERSONAL, INDOOR, AND MOBILE RADIO COMMUNICATIONS (PIMRC), 2017,