Multi-Zone HVAC Control With Model-Based Deep Reinforcement Learning

Cited by: 5

Authors:

Ding, Xianzhong [1]

Cerpa, Alberto [1]

Du, Wan [1]

Affiliations:

[1] Univ Calif Merced, Dept Comp Sci & Engn, Merced, CA 95343 USA

Source:

IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING | 2024

Keywords:

HVAC; Buildings; Heuristic algorithms; Adaptation models; Neural networks; Predictive models; Optimization; HVAC control; model-based deep reinforcement learning; model predictive control; energy efficiency; optimal control; BUILDINGS; DYNAMICS; IOT;

DOI:

10.1109/TASE.2024.3410951

Chinese Library Classification (CLC):

TP [Automation Technology, Computer Technology];

Subject classification code:

0812;

Abstract:

The application of reinforcement learning to the control of Heating, Ventilation, and Air Conditioning (HVAC) systems has been extensively researched. Existing studies primarily focus on Model-Free Reinforcement Learning (MFRL), which trains the agent through trial-and-error interactions with real buildings. However, MFRL faces a significant challenge: it requires a large amount of training data to achieve satisfactory performance. Simulation models have been used to generate training data and expedite training, but they require high-fidelity building models that are difficult to calibrate. As a result, Model-Based Reinforcement Learning (MBRL) has been employed for HVAC control. Although MBRL demonstrates remarkable sample efficiency, it often falls short in asymptotic control performance, particularly in achieving substantial energy savings while ensuring occupants' thermal comfort. In this study, we conduct experiments to analyze the limitations of current MBRL-based HVAC control methods, focusing on model uncertainty and controller effectiveness. Leveraging the insights gained from these experiments, we develop MB2C, an innovative MBRL-based HVAC control system that combines high control performance with exceptional sample efficiency. MB2C learns the dynamics of the building with an ensemble of environment-conditioned neural networks and uses a novel control method called Model Predictive Path Integral (MPPI) for HVAC control. MPPI generates candidate action sequences using an importance-sampling-weighted algorithm, which is well suited to multi-zone buildings with high state and action dimensions. We evaluate MB2C using EnergyPlus simulations of a five-zone office building, and the results demonstrate that MB2C achieves 8.23% higher energy savings than the state-of-the-art MBRL solution while maintaining comparable thermal comfort. Moreover, MB2C reduces the required training data set by an order of magnitude (10.52x) while delivering performance on par with MFRL approaches.
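The importance-sampling-weighted planning step that the abstract attributes to MPPI can be illustrated with a minimal sketch. This is not the authors' implementation: the dynamics function `toy_zone_model`, the cost `step_cost`, the helper `rollout_cost`, the planner `mppi_plan`, and every numeric constant are hypothetical stand-ins chosen for illustration. In particular, the toy linear model takes the place of the ensemble of environment-conditioned neural networks described above.

```python
import math
import random


def toy_zone_model(temps, actions):
    """Hypothetical stand-in for a learned dynamics model: each zone's
    temperature moves 30% of the way toward its setpoint action."""
    return [t + 0.3 * (a - t) for t, a in zip(temps, actions)]


def step_cost(temps, actions, comfort_target=22.0, energy_weight=0.1):
    """Assumed cost: squared comfort deviation plus a crude energy proxy."""
    comfort = sum((t - comfort_target) ** 2 for t in temps)
    energy = sum(abs(a - t) for t, a in zip(temps, actions))
    return comfort + energy_weight * energy


def rollout_cost(temps, plan):
    """Total cost of applying a plan (horizon x zones) through the toy model."""
    state, total = list(temps), 0.0
    for actions in plan:
        total += step_cost(state, actions)
        state = toy_zone_model(state, actions)
    return total


def mppi_plan(temps, mean_plan, n_samples=200, noise_std=1.0,
              temperature=1.0, rng=None):
    """One MPPI update: perturb the mean plan, score each perturbed plan,
    and return the mean plan shifted by the weighted average perturbation."""
    if rng is None:
        rng = random.Random(0)
    horizon, n_zones = len(mean_plan), len(temps)
    noises, costs = [], []
    for _ in range(n_samples):
        noise = [[rng.gauss(0.0, noise_std) for _ in range(n_zones)]
                 for _ in range(horizon)]
        plan = [[m + e for m, e in zip(mean_row, noise_row)]
                for mean_row, noise_row in zip(mean_plan, noise)]
        noises.append(noise)
        costs.append(rollout_cost(temps, plan))
    # Softmin weights; subtracting the best cost pins the largest weight
    # at 1, so the normalizer can never vanish.
    best = min(costs)
    weights = [math.exp(-(c - best) / temperature) for c in costs]
    norm = sum(weights)
    return [[mean_plan[t][z]
             + sum(w * n[t][z] for w, n in zip(weights, noises)) / norm
             for z in range(n_zones)]
            for t in range(horizon)]


if __name__ == "__main__":
    temps = [26.0] * 5                      # five zones, all too warm
    plan = [[26.0] * 5 for _ in range(4)]   # initial setpoint plan, horizon 4
    new_plan = mppi_plan(temps, plan, rng=random.Random(0))
    print(rollout_cost(temps, plan), rollout_cost(temps, new_plan))
```

In a receding-horizon loop, only the first row of the returned plan would typically be applied before replanning from the next observed state. The update costs one model rollout per sample and needs no gradient of the model, which is one reason sampling-based planners of this kind are attractive when the action space spans many zones.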

Pages: 4408-4426

Page count: 19
