Reward shaping-based deep reinforcement learning for look-ahead dispatch with rolling-horizon

Cited by: 0
Authors
Xu, Hongsheng [1]
Xu, Yungui [1]
Wang, Ke [1]
Li, Yaping [2]
Al Ahad, Abdullah [1]
Affiliations
[1] Hohai Univ, Sch Elect & Power Engn, Nanjing 211100, Peoples R China
[2] China Elect Power Res Inst, Dept Power Automat, Nanjing 210000, Peoples R China
Keywords
Look-ahead dispatch; Rolling-horizon; Deep reinforcement learning; Reward shaping; Soft actor-critic
DOI
10.1016/j.ijepes.2025.110673
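Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Subject classification code
0808; 0809
Abstract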
The increasing penetration of renewable energy makes it harder to design an effective and adaptable model-driven look-ahead dispatch (LAD) method. Deep reinforcement learning (DRL) methods have recently shown considerable potential for developing a self-learning dispatching agent, owing to their generalization, adaptability, and computational efficiency. However, existing DRL-based LAD methods overlook the discounting effect when calculating the immediate total reward for LAD, and they pay little attention to trial-and-error reward design or to expected discounted returns that reflect the true performance metrics of LAD. This paper therefore proposes novel reward shaping (RS)-based DRL algorithms for the rolling-horizon LAD problem. A method is proposed for accurately estimating the look-ahead discount factor that best matches each look-ahead horizon (LAH). Shaped reward functions are designed, and an RS-based regularization is proposed that employs a potential function. Case studies on the SG 126-bus and IEEE 118-bus systems demonstrate the effectiveness of the proposed improvements, as well as the superiority and adaptability of the improved DRL algorithms in training and testing. (c) 2017 Elsevier Inc. All rights reserved.
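Illustrative note: the abstract does not give the paper's shaped reward functions, so the following is only a minimal sketch of generic potential-based reward shaping, r'_t = r_t + gamma * phi(s_{t+1}) - phi(s_t). The potential function `phi`, the scalar states, the reward values, and the discount factor are all hypothetical placeholders and are not taken from the paper. The sketch shows that shaping changes the discounted return of a finite trajectory only by the boundary term gamma^T * phi(s_T) - phi(s_0).

```python
from typing import Callable, List, Sequence


def shaped_rewards(
    rewards: Sequence[float],
    states: Sequence[float],
    potential: Callable[[float], float],
    gamma: float,
) -> List[float]:
    """Add the shaping term gamma * phi(s_{t+1}) - phi(s_t) to each reward.

    `states` must hold one more entry than `rewards` (it includes the final state).
    """
    return [
        r + gamma * potential(states[t + 1]) - potential(states[t])
        for t, r in enumerate(rewards)
    ]


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Compute sum_t gamma^t * r_t by backward accumulation."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Hypothetical toy trajectory (assumption: scalar states, target state 3.0).
gamma = 0.9
states = [0.0, 1.0, 2.0, 3.0]
rewards = [1.0, -2.0, 0.5]


def phi(s: float) -> float:
    """Hypothetical potential: negative distance to the target state."""
    return -abs(s - 3.0)


g_plain = discounted_return(rewards, gamma)
g_shaped = discounted_return(shaped_rewards(rewards, states, phi, gamma), gamma)

# The shaping terms telescope, leaving only a boundary term.
boundary = gamma ** len(rewards) * phi(states[-1]) - phi(states[0])
print(g_shaped - g_plain, boundary)
```

Because the difference between the two returns depends only on the first and last states, a shaping term of this form can change the learning signal without changing which policies are optimal, which is the usual motivation for using a potential function.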
Pages: 21
Related papers
45 in total
  • [1] [Anonymous], 2021, Intelligent arrangement of grid operation organization
  • [2] Multiple Time Resolution Stochastic Scheduling for Systems With High Renewable Penetration
    Bakirtzis, Emmanouil A.
    Biskas, Pandelis N.
    [J]. IEEE TRANSACTIONS ON POWER SYSTEMS, 2017, 32 (02) : 1030 - 1040
  • [3] Booth S, 2023, AAAI CONF ARTIF INTE, P5920
  • [4] A scalable graph reinforcement learning algorithm based stochastic dynamic dispatch of power system under high penetration of renewable energy
    Chen, Junbin
    Yu, Tao
    Pan, Zhenning
    Zhang, Mengyue
    Deng, Bairong
    [J]. INTERNATIONAL JOURNAL OF ELECTRICAL POWER & ENERGY SYSTEMS, 2023, 152
  • [5] Improved Proximal Policy Optimization Algorithm for Sequential Security-Constrained Optimal Power Flow Based on Expert Knowledge and Safety Layer
    Chen, Yanbo
    Du, Qintao
    Liu, Honghai
    Cheng, Liangcheng
    Younis, Muhammad Shahzad
    [J]. JOURNAL OF MODERN POWER SYSTEMS AND CLEAN ENERGY, 2024, 12 (03) : 742 - 753
  • [6] Cheng L, 2023, CSEE Journal of Power and Energy Systems
  • [7] [成梁成 Cheng Liangcheng], 2024, [电网技术, Power System Technology], V48, P3133
  • [8] [冯斌 Feng Bin], 2023, [电力系统自动化, Automation of Electric Power Systems], V47, P187
  • [9] Fujimoto S, 2018, PR MACH LEARN RES, V80
  • [10] Haarnoja T, 2019, Arxiv, DOI arXiv:1812.05905