Reinforcement learning for control of flexibility providers in a residential microgrid

Cited: 31
Authors
Mbuwir, Brida V. [1 ,2 ,3 ]
Geysen, Davy [1 ,2 ]
Spiessens, Fred [1 ,2 ]
Deconinck, Geert [2 ,3 ]
Affiliations
[1] VITO, Boeretang 200, B-2400 Mol, Belgium
[2] EnergyVille, Thor Pk, B-3600 Genk, Belgium
[3] Katholieke Univ Leuven, ESAT Electa, Kasteelpk Arenberg 10 Bus 2445, B-3001 Leuven, Belgium
Keywords
learning (artificial intelligence); multi-agent systems; power consumption; distributed power generation; photovoltaic power systems; power engineering computing; iterative methods; power generation scheduling; heat pumps; stochastic processes; control engineering computing; power generation control; residential microgrid; smart grid paradigm; smart meters; machine learning; model-free reinforcement learning techniques; single-agent stochastic microgrid settings; rule-based controller; model-based optimal controller; electricity consumption patterns; power system planning; RL techniques; policy iteration; PI; fitted Q-iteration; FQI; heat pump; multiagent collaborative microgrid settings; photovoltaic production; MODEL-PREDICTIVE CONTROL; ENERGY MANAGEMENT;
DOI
10.1049/iet-stg.2019.0196
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Code
0808; 0809;
Abstract
The smart grid paradigm and the development of smart meters have led to the availability of large volumes of data. These data are expected to assist in power system planning and operation, and in the transition from passive to active electricity users. With recent advances in machine learning, these data can be used to learn system dynamics. This study explores two model-free reinforcement learning (RL) techniques - policy iteration (PI) and fitted Q-iteration (FQI) - for scheduling the operation of flexibility providers - a battery and a heat pump - in a residential microgrid. The proposed algorithms are data-driven and can easily be generalised to control any flexibility provider, without requiring expert knowledge to build a detailed model of the flexibility provider and/or the microgrid. The algorithms are tested in multi-agent collaborative and single-agent stochastic microgrid settings, where the uncertainty stems from a lack of knowledge of future electricity consumption patterns and photovoltaic production. Simulation results show that PI outperforms FQI, with a 7.2% increase in photovoltaic self-consumption in the multi-agent setting and a 3.7% increase in the single-agent setting. Both RL algorithms perform better than a rule-based controller and compete with a model-based optimal controller, and are thus a valuable alternative to model- and rule-based controllers.
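The fitted Q-iteration idea referenced in the abstract can be illustrated with a minimal sketch: collect one-step transitions under an exploratory policy, then repeatedly re-fit a Q-function against bootstrapped targets r + γ·max Q(s′, ·). The battery dynamics, reward (penalising grid exchange to encourage PV self-consumption), and the discretised-grid regressor below are illustrative assumptions, not the paper's actual microgrid model or function approximator.

```python
import numpy as np

rng = np.random.default_rng(0)
GAMMA = 0.9
ACTIONS = [-1.0, 0.0, 1.0]   # discharge / idle / charge (normalised power)
BINS = 5                     # bins per state dimension

def step(soc, pv, load, a):
    """Toy transition: SoC follows the (clipped) action; the reward is the
    negative absolute grid exchange, rewarding PV self-consumption."""
    a = float(np.clip(a, -soc, 1.0 - soc))   # respect SoC limits [0, 1]
    grid = load - pv + a                     # >0: import, <0: export
    return soc + a, -abs(grid)

def idx(soc, pv, load):
    b = lambda x: min(int(x * BINS), BINS - 1)
    return b(soc), b(pv), b(load)

# 1) Batch of one-step transitions gathered with a random policy.
batch = []
for _ in range(5000):
    soc, pv, load = rng.uniform(0.0, 1.0, 3)
    ai = int(rng.integers(len(ACTIONS)))
    soc2, r = step(soc, pv, load, ACTIONS[ai])
    batch.append((idx(soc, pv, load), ai, r, idx(soc2, pv, load)))

# 2) FQI loop: re-fit Q (here: per-cell averaging, a stable regressor)
#    against targets r + gamma * max_a' Q_k(s', a').
Q = np.zeros((BINS, BINS, BINS, len(ACTIONS)))
for _ in range(50):
    sums = np.zeros_like(Q)
    counts = np.zeros_like(Q)
    for s, ai, r, s2 in batch:
        sums[s + (ai,)] += r + GAMMA * Q[s2].max()
        counts[s + (ai,)] += 1
    Q = np.where(counts > 0, sums / np.maximum(counts, 1), Q)

def greedy_action(soc, pv, load):
    """Policy induced by the fitted Q-function."""
    return ACTIONS[int(np.argmax(Q[idx(soc, pv, load)]))]
```

With these toy dynamics, the learned policy charges the battery under a PV surplus and discharges it under a consumption deficit; averaging regressors of this kind are a common choice in FQI because the fitting step remains a non-expansion.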
Pages: 98-107
Page count: 10