Application of two promising Reinforcement Learning algorithms for load shifting in a cooling supply system

Cited by: 55

Authors:

Schreiber, Thomas ^{[1]}

Eschweiler, Soeren ^{[1]}

Baranski, Marc ^{[1]}

Mueller, Dirk ^{[1]}

Affiliations:

[1] Rhein Westfal TH Aachen, EON Energy Res Ctr, Inst Energy Efficient Bldg & Indoor Climate, Aachen, Germany

Source:

ENERGY AND BUILDINGS | 2020 / Vol. 229

Keywords:

Reinforcement Learning; Load shifting; Optimal control; Thermal systems; Building automation and control; Simulation; MODEL-PREDICTIVE CONTROL; ENERGY MANAGEMENT; HVAC;

DOI:

10.1016/j.enbuild.2020.110490

Chinese Library Classification (CLC) number:

TU [Building Science];

Discipline classification code:

0813 ;

Abstract:

With the increasing use of volatile renewable energies, the requirements for building automation and control systems (BACS) are increasing. Load shifting within local energy systems stabilizes fluctuations in the grid and can be triggered by price signals. The energy purchase can thus be considered and solved as an optimal control problem. Classical approaches, often based on the optimization of mathematical models, are uneconomical in many cases, due to the high effort involved in the model creation. Algorithms from the field of Reinforcement Learning (RL), on the other hand, have a high potential for the automation of energy system optimization, due to their model-free and data-driven characteristics. However, there is still a lack of studies that examine algorithms for BACS-related applications in a structured way. Therefore, we present a study investigating the potential of two different RL algorithms for load shifting in a cooling supply system. We combine the benefits of Modelica, a powerful modeling language, with RL algorithms and demonstrate how generalized relationships and control decisions can be learned. The case study is modeled according to a cooling supply system in Berlin, Germany. The two different algorithms (DQN and DDPG) are used to control the operation parameters of a central compression chiller, with respect to a price signal. While real monitoring data are used as exogenous influences, the thermal dynamics of the cooling network are simulated. With the learned policies, flexibility in the network is used, which leads on average to weekly cost savings of 14%, compared to direct load coverage. Our results suggest that, under certain conditions, RL is a suitable alternative to established methods. However, we also acknowledge that there are still research questions to address before RL can be applied in real BACS. © 2020 Elsevier B.V. All rights reserved.


Pages: 11
