Reinforcement Learning with Dual Safety Policies for Energy Savings in Building Energy Systems

Cited by: 4
Authors
Lin, Xingbin [1]
Yuan, Deyu [1]
Li, Xifei [1]
Affiliation
[1] Gridsum Inc, 229 North 4th Ring Rd, Beijing 100083, People's Republic of China
Keywords
reinforcement learning; safety policy; HVAC system control; energy saving; HVAC CONTROL-SYSTEMS; CLASSIFICATION
DOI
10.3390/buildings13030580
Chinese Library Classification number
TU [Building Science]
Discipline classification code
0813
Abstract
Reinforcement learning (RL) is increasingly applied to the control of heating, ventilation and air-conditioning (HVAC) systems to learn optimal control sequences for energy savings. However, because RL learns by trial and error, its output sequences may cause operational safety issues when it is applied in real systems. To address this problem, an RL algorithm with dual safety policies for energy savings in HVAC systems is proposed. The implicit safety policy is part of the RL model: it integrates safety into the optimization target of RL by adding penalties to the reward for actions that exceed the safety constraints. In the explicit safety policy, an online safety classifier filters the actions output by RL, so that only actions that are classified as safe and have the highest benefits are finally selected. In this way, the safety of HVAC systems controlled by the proposed RL algorithm can be effectively maintained while energy consumption is reduced. To verify the proposed algorithm, we implemented it in an existing commercial building. After a period of self-learning, the energy consumption of the HVAC system was reduced by more than 15.02% compared to proportional-integral-derivative (PID) control. Meanwhile, compared to applying the RL algorithm alone without a safety policy, the proportion of time in which the indoor temperature did not meet the demand was reduced by 25.06%.
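The two safety policies described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the comfort band, the penalty weight, the function names `shaped_reward` and `select_safe_action`, and the `is_safe` callable (a stand-in for the online safety classifier) are all assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence


def shaped_reward(energy_kwh: float, indoor_temp_c: float,
                  temp_low_c: float = 22.0, temp_high_c: float = 26.0,
                  penalty_weight: float = 10.0) -> float:
    """Implicit safety (assumed form): reward is negative energy use
    minus a penalty proportional to the comfort-band violation."""
    violation = (max(temp_low_c - indoor_temp_c, 0.0)
                 + max(indoor_temp_c - temp_high_c, 0.0))
    return -energy_kwh - penalty_weight * violation


def select_safe_action(actions: Sequence[float],
                       values: Sequence[float],
                       is_safe: Callable[[float], bool],
                       fallback: Optional[float] = None) -> Optional[float]:
    """Explicit safety (assumed form): drop actions the classifier rejects,
    then return the remaining action with the highest estimated value."""
    safe = [(v, a) for a, v in zip(actions, values) if is_safe(a)]
    if not safe:
        return fallback  # e.g. hand control back to a conventional controller
    return max(safe, key=lambda pair: pair[0])[1]


# Hypothetical candidate setpoints (deg C) and their estimated values.
candidates = [16.0, 20.0, 24.0]
estimated_values = [3.0, 2.0, 1.0]
chosen = select_safe_action(candidates, estimated_values,
                            is_safe=lambda setpoint: setpoint >= 18.0)
```

Here the highest-valued candidate (16.0) is rejected by the stand-in classifier, so the best remaining safe candidate is chosen. In a real deployment the classifier would be learned and updated online rather than a fixed threshold.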
Pages: 20