A Cooperative Multi-Agent Deep Reinforcement Learning Framework for Real-Time Residential Load Scheduling

Cited by: 22
Authors
Zhang, Chi [1 ]
Kuppannagari, Sanmukh R. [2 ]
Xiong, Chuanxiu [1 ]
Kannan, Rajgopal [3 ]
Prasanna, Viktor K. [2 ]
Affiliations
[1] Univ Southern Calif, Dept Comp Sci, Los Angeles, CA 90089 USA
[2] Univ Southern Calif, Ming Hsieh Dept Elect & Comp Engn, Los Angeles, CA USA
[3] US Army, Res Lab West, Playa Vista, CA USA
Source
PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON INTERNET OF THINGS DESIGN AND IMPLEMENTATION (IOTDI '19) | 2019
Funding
US National Science Foundation;
Keywords
multi-agent; deep reinforcement learning; smart home; real-time load scheduling; internet-of-things;
DOI
10.1145/3302505.3310069
CLC Classification Number
TP301 [Theory and Methods];
Discipline Classification Code
081202;
Abstract
Internet-of-Things (IoT) enabled monitoring and control capabilities are enabling increasing numbers of household users with controllable loads to actively participate in smart grid energy management. Realizing an efficient real-time energy management system that takes advantage of these developments requires novel techniques for managing the increased complexity of the control action space while resolving multiple challenges, such as the uncertainty in energy prices and renewable energy output, along with the need to satisfy physical grid constraints such as transformer capacity. Addressing these challenges, we develop a multi-household energy management framework for residential units connected to the same transformer and containing DERs such as PV, ESS, and controllable loads. The goal of our framework is to schedule controllable household appliances and ESS such that the cost of procuring electricity from the utility over a horizon is minimized while physical grid constraints are satisfied at each scheduling step. Traditional energy management frameworks either perform global optimization to satisfy grid constraints but suffer from high computational complexity (for example, Integer Programming and Mixed Integer Programming frameworks and centralized reinforcement learning) or perform decentralized real-time energy management without satisfying global grid constraints (for example, multi-agent reinforcement learning with no cooperation). In contrast, we propose a cooperative multi-agent reinforcement learning (MARL) framework that i) operates in real-time, and ii) performs explicit collaboration to satisfy global grid constraints. The novelty of our framework is twofold. First, our framework trains multiple independent learners (ILs), one per household, in parallel using historical data and performs real-time inference of control actions using the most recent system state. Second, our framework contains a low-complexity knapsack-based cooperation agent which combines the outputs of the ILs to minimize cost while satisfying grid constraints. Simulation results show that our cooperative MARL approach achieves significant cost improvement over centralized reinforcement learning and day-ahead planning baselines. Moreover, our approach strictly satisfies physical constraints with no a priori knowledge of system dynamics, whereas the baseline approaches have occasional violations. We also measure training and inference time as the number of households ranges from 1 to 25. Results show that our cooperative MARL approach scales best among the compared approaches.
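The knapsack-based cooperation step described in the abstract can be illustrated with a small sketch. The Python code below is a hypothetical reconstruction under stated assumptions, not the authors' implementation: names such as Proposal, cooperate, power_kw, and the 0.1 kW discretization step are invented for illustration. It assumes each household's independent learner emits one candidate control action with an estimated benefit and a power draw, and the cooperation agent solves a 0/1 knapsack so the combined draw stays within the shared transformer capacity.

from dataclasses import dataclass
from typing import List

@dataclass
class Proposal:
    household: int    # index of the household whose IL produced this proposal
    power_kw: float   # power drawn if the proposed action is executed
    value: float      # IL's estimated benefit of executing the action

def cooperate(proposals: List[Proposal], capacity_kw: float, step_kw: float = 0.1) -> List[Proposal]:
    # 0/1 knapsack over discretized power: choose the subset of proposals that
    # maximizes total estimated benefit while keeping the combined draw within
    # the transformer capacity. This is an illustrative sketch, not the paper's code.
    cap = int(capacity_kw / step_kw)
    weights = [int(round(p.power_kw / step_kw)) for p in proposals]
    best = [0.0] * (cap + 1)
    keep = [[False] * (cap + 1) for _ in proposals]
    for i, (p, w) in enumerate(zip(proposals, weights)):
        for c in range(cap, w - 1, -1):
            candidate = best[c - w] + p.value
            if candidate > best[c]:
                best[c] = candidate
                keep[i][c] = True
    # Back-track through the keep table to recover the selected proposals.
    chosen, c = [], cap
    for i in range(len(proposals) - 1, -1, -1):
        if keep[i][c]:
            chosen.append(proposals[i])
            c -= weights[i]
    return chosen

if __name__ == "__main__":
    # Three hypothetical households proposing, e.g., an EV charger, a dryer, and a pump.
    props = [Proposal(0, 3.3, 1.2), Proposal(1, 7.2, 2.0), Proposal(2, 1.5, 0.4)]
    picked = cooperate(props, capacity_kw=9.0)
    print(sorted(p.household for p in picked))

Running the example selects the proposals whose summed power stays under 9 kW while maximizing the ILs' estimated benefit; deferred households would retry at the next scheduling step. Actions with power draws and values other than those assumed here would simply change the knapsack weights and profits.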
Pages: 59-69
Number of pages: 11