A Multi-Agent Reinforcement Learning Approach to Price and Comfort Optimization in HVAC-Systems

Cited by: 14
Authors
Blad, Christian [1, 2, 4]
Bogh, Simon [1]
Kallesoe, Carsten [2, 3]
Affiliations
[1] Aalborg Univ, Dept Mat & Prod, Robot & Automat Grp, DK-9220 Aalborg, Denmark
[2] Grundfos, Technol & Innovat, Control Dept, DK-8850 Bjerringbro, Denmark
[3] Aalborg Univ, Dept Elect Syst, DK-9220 Aalborg, Denmark
[4] Fibigerstr 16, DK-9220 Aalborg, Denmark
Keywords
deep reinforcement learning; artificial intelligence; HVAC-systems; underfloor heating; energy in buildings; predictive analytics; demand response; buildings; energy
DOI
10.3390/en14227491
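Chinese Library Classification
TE [Petroleum and natural gas industry]; TK [Energy and power engineering]
Discipline classification codes
0807; 0820
Abstract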
This paper addresses the challenge of minimizing training time for the control of Heating, Ventilation, and Air-Conditioning (HVAC) systems with online Reinforcement Learning (RL). This is done by developing a novel Multi-Agent Reinforcement Learning (MARL) approach for HVAC systems. The environment formed by the HVAC system is formulated as a Markov Game (MG) in a general-sum setting. The MARL algorithm has a decentralized structure in which only relevant states are shared between agents, and actions are shared in a sequence that is sensible from a system point of view. The simulation environment is a domestic house located in Denmark and designed to resemble an average house. The heat source in the house is an air-to-water heat pump, and the HVAC system is an Underfloor Heating (UFH) system. The house is subjected to weather changes from a data set collected in Copenhagen in 2006, spanning the entire year except for June, July, and August, when heating is not required. It is shown that: (1) compared with Single-Agent Reinforcement Learning (SARL), MARL reduces training time by 70% for a UFH system with four temperature zones; (2) the agent can learn and generalize over seasons; (3) the cost of heating can be reduced by 19%, equivalent to 750 kWh of electric energy per year for an average Danish domestic house, compared with a traditional control method; and (4) oscillations in the room temperature can be reduced by 40% when the RL control methods are compared with a traditional control method.
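The decentralized structure described in the abstract, where each agent sees only relevant states and actions are passed along a sequence, could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the agent names, state keys, agent ordering, and the simple threshold rules standing in for learned policies are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical types; none of these names come from the paper.
Observation = Dict[str, float]
Policy = Callable[[Observation, Sequence[int]], int]


@dataclass
class Agent:
    name: str
    state_keys: List[str]  # subset of the global state this agent observes
    policy: Policy         # maps (local observation, upstream actions) -> action


def sequential_step(agents: List[Agent], global_state: Observation) -> Dict[str, int]:
    """Query agents in a fixed order, passing earlier actions downstream."""
    actions: Dict[str, int] = {}
    upstream: List[int] = []
    for agent in agents:
        # Each agent receives only the states listed as relevant to it.
        local_obs = {k: global_state[k] for k in agent.state_keys}
        action = agent.policy(local_obs, tuple(upstream))
        actions[agent.name] = action
        upstream.append(action)
    return actions


def zone_policy(setpoint: float, key: str) -> Policy:
    # Placeholder thermostat rule standing in for a learned RL policy.
    def act(obs: Observation, upstream: Sequence[int]) -> int:
        return 1 if obs[key] < setpoint else 0
    return act


def supply_policy(obs: Observation, upstream: Sequence[int]) -> int:
    # Placeholder rule: run the heat source if any upstream zone valve is open.
    return 1 if any(upstream) else 0


# Assumed ordering: zone agents act first, then the heat-source agent.
agents = [
    Agent("zone_1", ["t_zone_1"], zone_policy(21.0, "t_zone_1")),
    Agent("zone_2", ["t_zone_2"], zone_policy(21.0, "t_zone_2")),
    Agent("heat_pump", ["t_outdoor"], supply_policy),
]
state = {"t_zone_1": 20.2, "t_zone_2": 21.8, "t_outdoor": 3.0}
result = sequential_step(agents, state)
print(result)
```

In this toy setup the downstream agent conditions on the upstream actions rather than on the full global state, which is one way to read "actions are shared in a sequence"; the paper itself may order and couple the agents differently.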
Pages: 20