Transfer Learning Applied to Reinforcement Learning-Based HVAC Control

Cited by: 0
Authors
Lissa P. [1]
Schukat M. [1]
Barrett E. [1]
Affiliations
[1] College of Science and Engineering, National University of Ireland, Galway
Keywords
Autonomous HVAC; Q-learning; Reinforcement learning; Transfer learning
DOI
10.1007/s42979-020-00146-7
Abstract
Modern control solutions for HVAC have demonstrated excellent cost and energy savings through the use of machine learning techniques. However, a challenging problem faced by most machine learning tasks is the amount of time and data required to train effective policies in the absence of prior knowledge. Considering that buildings in a specific geographical location share common environmental and structural features, this paper investigates the impact of spatial changes on performance accuracy through the use of transfer learning applied to reinforcement learning-based HVAC control. We propose an adapted RL (Q-learning) algorithm that can transfer HVAC control policies, which then adjust themselves according to spatial changes. We examine the performance of our approach across multiple locations. Moreover, we analyse the user's time out of comfort, comparing models with and without transfer learning. The results from different cases show that, with transfer learning, the time needed to train optimal or near-optimal control policies was reduced by more than a factor of 6 compared with experiments without it. Furthermore, the test case where the spatial variation was lower than 50% achieved similar performance for dynamic and static HVAC control, with an average time-out-of-comfort error of 2.55% and 3.83%, respectively. From the user's perspective, this means no additional discomfort, as the number of minutes spent outside the comfort zone with the static version is approximately the same over a 1-day interval. Finally, when examining the effect of transfer learning on geographical changes, the proposed method demonstrated higher performance in countries where the temperature variation is lower, reducing time out of comfort by one-third. If an agent receives a policy from a place where the environmental conditions are very different, the proposed method will still work and find the best policy, but not as fast as when the policy comes from a similar place. © 2020, Springer Nature Singapore Pte Ltd.
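The transfer idea described in the abstract can be illustrated with a minimal sketch: a tabular Q-learning agent is trained in a source environment, and its Q-table is then used to initialise the agent in a target environment instead of starting from zeros. This is not the authors' algorithm or building model. The toy room dynamics, the `drift` parameter, the action set, the comfort band, the reward shape and all hyperparameters below are assumptions chosen purely for illustration.

```python
import numpy as np

N_TEMPS = 10          # discretised indoor temperature bins (hypothetical)
ACTIONS = (-1, 0, 1)  # cool, idle, heat (hypothetical action set)
COMFORT = (4, 6)      # inclusive comfort band in bin indices (hypothetical)


def step(state, action_idx, drift, rng):
    """Toy room model: the action shifts the bin, the climate adds random drift."""
    noise = rng.choice([-1, 0, 1], p=[drift, 1 - 2 * drift, drift])
    nxt = int(np.clip(state + ACTIONS[action_idx] + noise, 0, N_TEMPS - 1))
    in_comfort = COMFORT[0] <= nxt <= COMFORT[1]
    # Reward comfort and charge a small energy cost for heating or cooling.
    reward = (1.0 if in_comfort else -1.0) - 0.1 * abs(ACTIONS[action_idx])
    return nxt, reward


def q_learning(drift, episodes, q_init=None, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning; q_init lets a source Q-table warm-start the agent."""
    rng = np.random.default_rng(seed)
    q = np.zeros((N_TEMPS, len(ACTIONS))) if q_init is None else q_init.copy()
    for _ in range(episodes):
        state = int(rng.integers(N_TEMPS))
        for _ in range(50):  # fixed episode length (assumption)
            if rng.random() < eps:
                a = int(rng.integers(len(ACTIONS)))  # explore
            else:
                a = int(np.argmax(q[state]))         # exploit
            nxt, r = step(state, a, drift, rng)
            # Standard Q-learning update.
            q[state, a] += alpha * (r + gamma * q[nxt].max() - q[state, a])
            state = nxt
    return q


# Train in a "source" climate, then transfer the table to a more variable "target".
source_q = q_learning(drift=0.1, episodes=200)
target_q = q_learning(drift=0.2, episodes=20, q_init=source_q, seed=1)
```

Here the target agent trains for far fewer episodes because it starts from the source table; how much this helps in practice depends on how similar the two environments are, which is the question the paper studies.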