Learning Stochastic Optimal Policies via Gradient Descent

Cited by: 3
Authors
Massaroli, Stefano [1 ]
Poli, Michael [2 ]
Peluchetti, Stefano [3 ]
Park, Jinkyoo [2 ]
Yamashita, Atsushi [1 ]
Asama, Hajime [1 ]
Affiliations
[1] Univ Tokyo, Dept Precis Engn, Tokyo 1138656, Japan
[2] Korea Adv Inst Sci & Technol, Dept Ind & Syst, Daejeon 305335, South Korea
[3] Cogent Labs, Daejeon 305701, South Korea
Source
IEEE CONTROL SYSTEMS LETTERS | 2022, Vol. 6
Funding
National Research Foundation of Singapore
Keywords
Optimal control; Itô processes; Stochastic processes; Process control; Optimization; Neural networks; Noise measurement; machine learning; PORTFOLIO SELECTION; CONVERGENCE; ITO
DOI
10.1109/LCSYS.2021.3086672
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
We systematically develop a learning-based treatment of stochastic optimal control (SOC) that relies on direct optimization of parametric control policies. We derive adjoint sensitivity results for stochastic differential equations through a direct application of variational calculus. Then, given an objective function for a predetermined task that specifies the desiderata for the controller, we optimize the policy parameters via iterative gradient descent. In doing so, we extend the range of applicability of classical SOC techniques, which often require strict assumptions on the functional form of the system and the control. We verify the performance of the proposed approach on a continuous-time, finite-horizon portfolio optimization problem with proportional transaction costs.
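The abstract describes simulating a controlled stochastic differential equation and training a parametric policy by gradient descent on a task objective. The sketch below illustrates that pipeline in PyTorch under simplifying assumptions: a hypothetical scalar dynamics with linear drift, an Euler-Maruyama discretization, and reverse-mode autodiff through the simulated paths standing in for the paper's continuous adjoint sensitivity derivation. The dynamics, cost terms, and hyperparameters are illustrative, not taken from the paper.

    # Hypothetical sketch: pathwise gradient-descent training of a parametric control
    # policy for a controlled SDE; dynamics and cost are illustrative assumptions.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # Illustrative controlled dynamics (not the paper's model):
    #   dX_t = (a*X_t + b*u_t) dt + sigma*X_t dW_t,   u_t = pi_theta(t, X_t)
    a, b, sigma = 0.05, 1.0, 0.2
    T, steps, batch = 1.0, 100, 256
    dt = T / steps

    # Parametric policy pi_theta: (t, x) -> u
    policy = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))
    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

    def rollout():
        """Simulate the controlled SDE (Euler-Maruyama) and accumulate a cost."""
        x = torch.ones(batch, 1)                # initial state X_0 = 1
        running_cost = torch.zeros(batch, 1)
        for k in range(steps):
            t = torch.full((batch, 1), k * dt)
            u = policy(torch.cat([t, x], dim=-1))
            dw = torch.randn(batch, 1) * dt ** 0.5        # Brownian increment
            x = x + (a * x + b * u) * dt + sigma * x * dw  # Euler-Maruyama step
            running_cost = running_cost + (u ** 2) * dt    # control-effort penalty
        terminal_cost = (x - 2.0) ** 2           # illustrative terminal target
        return (running_cost + terminal_cost).mean()

    for it in range(500):
        opt.zero_grad()
        loss = rollout()
        loss.backward()   # pathwise gradients through the discretized dynamics
        opt.step()

In this discretized stand-in, backpropagation through the simulated trajectories plays the role of the adjoint sensitivity equations; the paper itself works in continuous time and with a portfolio objective including proportional transaction costs.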
Pages: 1094-1099
Number of pages: 6
Related Articles
50 items in total
  • [21] Learning Rates for Stochastic Gradient Descent With Nonconvex Objectives
    Lei, Yunwen
    Tang, Ke
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (12) : 4505 - 4511
  • [22] Stochastic Gradient Descent in Continuous Time
    Sirignano, Justin
    Spiliopoulos, Konstantinos
    SIAM JOURNAL ON FINANCIAL MATHEMATICS, 2017, 8 (01) : 933 - 961
  • [23] Efficiency Ordering of Stochastic Gradient Descent
    Hu, Jie
    Doshi, Vishwaraj
    Eun, Do Young
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [24] SA-MARL: Novel Self-Attention-Based Multi-Agent Reinforcement Learning With Stochastic Gradient Descent
    Younas, Rabbiya
    Rehman, Hafiz Muhammad Raza Ur
    Lee, Ingyu
    On, Byung-Won
    Yi, Sungwon
    Choi, Gyu Sang
    IEEE ACCESS, 2025, 13 : 35674 - 35687
  • [25] Optimal Gradient Descent Learning for Bidirectional Associative Memories
    Perfetti, R.
    ELECTRONICS LETTERS, 1993, 29 (17) : 1556 - 1557
  • [26] Adadb: Adaptive Diff-Batch Optimization Technique for Gradient Descent
    Khan, Muhammad U. S.
    Jawad, Muhammad
    Khan, Samee U.
    IEEE ACCESS, 2021, 9 : 99581 - 99588
  • [27] Energy-entropy competition and the effectiveness of stochastic gradient descent in machine learning
    Zhang, Yao
    Saxe, Andrew M.
    Advani, Madhu S.
    Lee, Alpha A.
    MOLECULAR PHYSICS, 2018, 116 (21-22) : 3214 - 3223
  • [28] Overlap Removal by Stochastic Gradient Descent With(out) Shape Awareness
    Giovannangeli, Loann
    Lalanne, Frederic
    Giot, Romain
    Bourqui, Romain
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2024, 30 (12) : 7500 - 7517
  • [29] Convergence in High Probability of Distributed Stochastic Gradient Descent Algorithms
    Lu, Kaihong
    Wang, Hongxia
    Zhang, Huanshui
    Wang, Long
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2024, 69 (04) : 2189 - 2204
  • [30] A Limitation of Gradient Descent Learning
    Sum, John
    Leung, Chi-Sing
    Ho, Kevin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (06) : 2227 - 2232