Transforming Cooling Optimization for Green Data Center via Deep Reinforcement Learning

Cited by: 119
Authors
Li, Yuanlong [1 ]
Wen, Yonggang [1 ]
Tao, Dacheng [2 ,3 ]
Guan, Kyle [4 ]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
[2] Univ Sydney, Fac Engn, UBTECH Sydney Artificial Intelligence Ctr, Darlington, NSW 2008, Australia
[3] Univ Sydney, Fac Engn, Sch Comp Sci, Darlington, NSW 2008, Australia
[4] Nokia, Bell Labs, Holmdel, NJ 07733 USA
Keywords
Cooling; Optimization; Mathematical model; Computational modeling; Software algorithms; Data models; Atmospheric modeling; Data center (DC) cooling optimization; deep learning; reinforcement learning (RL); SYSTEMS; ALGORITHM
DOI
10.1109/TCYB.2019.2927410
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Data centers (DCs) play an important role in supporting services such as e-commerce and cloud computing. The energy consumption of this growing market has drawn significant attention, and notably almost half of the energy cost goes to cooling the DC to a particular temperature. It is thus a critical operational challenge to curb the cooling energy cost without sacrificing the thermal safety of a DC. Existing solutions typically follow a two-step approach, in which the system is first modeled based on expert knowledge and the operational actions are then determined with heuristics and/or best practices. These approaches are often hard to generalize and might result in suboptimal performance due to intrinsic model errors for large-scale systems. In this paper, we propose optimizing the DC cooling control via the emerging deep reinforcement learning (DRL) framework. Compared to the existing approaches, our solution lends itself to an end-to-end cooling control algorithm (CCA) via an off-policy offline version of the deep deterministic policy gradient (DDPG) algorithm, in which an evaluation network is trained to predict the DC energy cost along with the resulting cooling effects, and a policy network is trained to gauge optimized control settings. Moreover, we introduce a de-underestimation (DUE) validation mechanism for the critic network to reduce the potential underestimation of the risk caused by neural approximation. Our proposed algorithm is evaluated on an EnergyPlus simulation platform and on a real data trace collected from the National Super Computing Centre (NSCC) of Singapore. The numerical results show that the proposed CCA can achieve up to 11% cooling cost reduction on the simulation platform compared with a manually configured baseline control algorithm. In the trace-based study, which is conservative in nature, the proposed algorithm can achieve about 15% cooling energy savings on the NSCC data trace.
Our pioneering approach can shed new light on the application of DRL to optimize and automate DC operations and management, potentially revolutionizing digital infrastructure management with intelligence.
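
The evaluation-network / policy-network split described in the abstract can be illustrated with a deliberately small sketch. This is not the paper's CCA: it assumes a one-step (no bootstrapping) setting, a quadratic least-squares critic in place of a neural network, a linear policy, and a synthetic logged trace with a made-up cost function. It omits the DUE mechanism entirely. All names (`features`, `critic_grad_a`, `w`, `b`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical logged trace (assumption): s = normalized IT load,
# a = cooling setpoint chosen by some unknown logging policy.
n = 2000
s = rng.uniform(0.0, 1.0, size=n)
a = rng.uniform(0.0, 1.0, size=n)

# Toy cost (assumption, not the paper's model): lowest when a = 0.5 * s + 0.2.
cost = (a - (0.5 * s + 0.2)) ** 2 + 0.01 * rng.normal(size=n)


def features(s, a):
    """Quadratic features of (state, action) used by the critic."""
    return np.stack([np.ones_like(s), s, a, s * a, s ** 2, a ** 2], axis=-1)


# Critic ("evaluation network" stand-in): fit predicted cost offline
# from the logged data only, by ordinary least squares.
theta, *_ = np.linalg.lstsq(features(s, a), cost, rcond=None)


def critic_grad_a(s, a):
    """Derivative of the critic's predicted cost with respect to the action."""
    return theta[2] + theta[3] * s + 2.0 * theta[5] * a


# Actor ("policy network" stand-in): deterministic policy a = w * s + b,
# improved by following the critic's action-gradient (chain rule).
w, b = 0.0, 0.5
lr = 0.5
for _ in range(2000):
    act = w * s + b
    g = critic_grad_a(s, act)   # d(cost)/d(action) at the policy's action
    w -= lr * np.mean(g * s)    # d(action)/dw = s
    b -= lr * np.mean(g)        # d(action)/db = 1

print(f"learned policy: a = {w:.3f} * s + {b:.3f}")
```

On this toy trace the learned `w` and `b` should approach the cost-minimizing values 0.5 and 0.2. The point of the sketch is only the division of labor: the critic is fitted to logged costs without interacting with the system, and the policy is then adjusted using the critic's gradient with respect to the action.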
Pages: 2002-2013
Page count: 12