An Alternative Reinforcement Learning (ARL) control strategy for data center air-cooled HVAC systems

Cited by: 8
Authors
Lu, Ruyuan [1]
Li, Xin [1]
Chen, Ronghao [1]
Lei, Aimin [2]
Ma, Xiaoming [1]
Affiliations
[1] Peking Univ, Sch Environm & Energy, Shenzhen 518055, Peoples R China
[2] Vertiv Tech Co Ltd, Shenzhen 518055, Peoples R China
Keywords
Reinforcement learning (RL); Deep deterministic policy gradient (DDPG); Proximal policy optimization (PPO); Air-cooled HVAC systems; Data center; ENERGY; INTERNET; COMFORT;
DOI
10.1016/j.energy.2024.132977
Chinese Library Classification (CLC) number
O414.1 [Thermodynamics]
Subject classification code
Abstract
The energy efficiency of data centers is of great concern globally because of their large energy consumption and the foreseeable growth in demand for digital services. Advanced control strategies are needed to reduce energy consumption. However, optimizing data center HVAC control is a challenging task because of the complexity of building thermal dynamic models and the uncertainties associated with both server loads and outdoor temperature. In this paper, we propose a new control strategy called Alternately-Reinforcement Learning-control (ARL), which alternates between reinforcement learning (RL) and proportional-integral-derivative (PID) control to optimize the control of air-cooled HVAC systems in data centers. The ARL controls the speed settings of the compressor and the condensing fan, and the control goal is to minimize energy consumption while maintaining temperature stability. The RL algorithms applied are deep deterministic policy gradient (DDPG) and proximal policy optimization (PPO). We first pre-train the ARL strategy in an offline environment and then deploy it in a real data center environment for online testing. The test results show that the ARL has significant advantages in maintaining temperature stability while reducing energy consumption, and that the PPO algorithm performs better than the DDPG algorithm. Compared with the PID algorithm, the PPO algorithm saves 5.27% of energy and improves the temperature control effect by 3.27%, which indicates the feasibility of implementing the ARL in real data centers.
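The abstract does not say how the ARL decides when RL or PID is in charge, so the following Python sketch only illustrates one possible reading of "alternating control". Everything in it is an assumption made for illustration: the switching rule (use the RL action while the temperature stays within a band around the setpoint, otherwise fall back to PID), the names `PID`, `arl_step`, and `reward`, the gains, the band width, and the reward weights. None of these are taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller; gains are illustrative only."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        # Accumulate the integral term and estimate the derivative term.
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def clamp(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))


def arl_step(temp, setpoint, rl_action, pid, dt=1.0, band=1.0):
    """Return (compressor_speed, fan_speed, controller), speeds in [0, 1].

    Hypothetical switching rule (not from the paper): apply the RL action
    while |temp - setpoint| <= band, otherwise hand control to the PID.
    """
    error = temp - setpoint  # positive when too warm, so more cooling is needed
    # The PID is stepped on every call so its internal state stays current.
    pid_out = clamp(pid.step(error, dt), 0.0, 1.0)
    if abs(error) <= band:
        comp, fan = rl_action
        return clamp(comp, 0.0, 1.0), clamp(fan, 0.0, 1.0), "rl"
    return pid_out, pid_out, "pid"


def reward(energy_kw, temp, setpoint, w_energy=1.0, w_temp=1.0):
    """Assumed reward shape: penalize energy use and temperature deviation."""
    return -(w_energy * energy_kw + w_temp * abs(temp - setpoint))
```

In this sketch `rl_action` stands in for the output of a trained DDPG or PPO policy. A real deployment would need the paper's actual switching logic, state definition, and reward design.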
Pages: 15