Learning Stochastic Optimal Policies via Gradient Descent

Cited by: 3
Authors
Massaroli, Stefano [1 ]
Poli, Michael [2 ]
Peluchetti, Stefano [3 ]
Park, Jinkyoo [2 ]
Yamashita, Atsushi [1 ]
Asama, Hajime [1 ]
Affiliations
[1] Univ Tokyo, Dept Precis Engn, Tokyo 1138656, Japan
[2] Korea Adv Inst Sci & Technol, Dept Ind & Syst, Daejeon 305335, South Korea
[3] Cogent Labs, Daejeon 305701, South Korea
Source
IEEE CONTROL SYSTEMS LETTERS | 2022, Vol. 6
Funding
National Research Foundation of Singapore
Keywords
Optimal control; Ito processes; Stochastic processes; Process control; Optimization; Neural networks; Noise measurement; stochastic processes; machine learning; PORTFOLIO SELECTION; CONVERGENCE; ITO;
DOI
10.1109/LCSYS.2021.3086672
CLC number
TP [Automation Technology, Computer Technology]
Discipline code
0812
Abstract
We systematically develop a learning-based treatment of stochastic optimal control (SOC) that relies on direct optimization of parametric control policies. We derive adjoint sensitivity results for stochastic differential equations through a direct application of variational calculus. Then, given an objective function that specifies the desiderata for the controller on a predetermined task, we optimize the policy parameters via iterative gradient descent. In doing so, we extend the range of applicability of classical SOC techniques, which often require strict assumptions on the functional form of the system and the control. We verify the performance of the proposed approach on a continuous-time, finite-horizon portfolio optimization problem with proportional transaction costs.
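To make the abstract's recipe concrete, the sketch below optimizes a parametric feedback policy for a scalar controlled SDE by gradient descent on a Monte Carlo cost estimate. Everything here is an illustrative assumption rather than the paper's actual setup: the dynamics `dX = (X + u) dt + 0.5 dW`, the quadratic cost weights, the linear policy `u = -k*x`, and the finite-difference gradient with common random numbers, which stands in for the paper's adjoint-based sensitivities.

```python
import numpy as np

def simulate_cost(k, rng_seed=0, n_paths=256, n_steps=100, dt=0.01):
    """Monte Carlo estimate of the finite-horizon cost of the linear
    policy u = -k*x on the illustrative SDE dX = (X + u) dt + 0.5 dW."""
    rng = np.random.default_rng(rng_seed)       # common random numbers
    x = np.ones(n_paths)                        # all paths start at x0 = 1
    cost = np.zeros(n_paths)
    for _ in range(n_steps):
        u = -k * x
        cost += (x**2 + 0.1 * u**2) * dt        # running quadratic cost
        dw = rng.normal(0.0, np.sqrt(dt), n_paths)
        x = x + (x + u) * dt + 0.5 * dw         # Euler-Maruyama step
    return cost.mean()

# Iterative gradient descent on the policy parameter k.  A central finite
# difference over a fixed noise seed replaces the adjoint sensitivities
# derived in the paper; the uncontrolled drift (k = 0) is unstable, so a
# stabilizing positive gain should be learned.
k, lr, eps = 0.0, 0.1, 1e-3
for _ in range(100):
    g = (simulate_cost(k + eps) - simulate_cost(k - eps)) / (2 * eps)
    k -= lr * g

print(f"learned feedback gain k = {k:.2f}")
```

Because the same seed is reused on both sides of the finite difference, the gradient estimate is smooth in `k`; swapping in pathwise (reparameterization) or adjoint gradients would remove the finite-difference bias at extra implementation cost.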
Pages: 1094-1099
Page count: 6
Related Articles
50 records in total
  • [31] Parallel Adaptive Stochastic Gradient Descent Algorithms for Latent Factor Analysis of High-Dimensional and Incomplete Industrial Data
    Qin, Wen
    Luo, Xin
    Li, Shuai
    Zhou, MengChu
    [J]. IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2024, 21 (03) : 2716 - 2729
  • [32] Convergent Stochastic Almost Natural Gradient Descent
    Sanchez-Lopez, Borja
    Cerquides, Jesus
    [J]. ARTIFICIAL INTELLIGENCE RESEARCH AND DEVELOPMENT, 2019, 319 : 54 - 63
  • [33] Unforgeability in Stochastic Gradient Descent
    Baluta, Teodora
    Nikolic, Ivica
    Jain, Racchit
    Aggarwal, Divesh
    Saxena, Prateek
    [J]. PROCEEDINGS OF THE 2023 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, CCS 2023, 2023, : 1138 - 1152
  • [34] Machine learning for inverse lithography: using stochastic gradient descent for robust photomask synthesis
    Jia, Ningning
    Lam, Edmund Y.
    [J]. JOURNAL OF OPTICS, 2010, 12 (04)
  • [35] Communication-Efficient Distributed Learning via Sparse and Adaptive Stochastic Gradient
    Deng, Xiaoge
    Li, Dongsheng
    Sun, Tao
    Lu, Xicheng
    [J]. IEEE TRANSACTIONS ON BIG DATA, 2025, 11 (01) : 234 - 246
  • [36] Incremental PID Controller-Based Learning Rate Scheduler for Stochastic Gradient Descent
    Wang, Zenghui
    Zhang, Jun
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (05) : 7060 - 7071
  • [37] The Powerball Method With Biased Stochastic Gradient Estimation for Large-Scale Learning Systems
    Yang, Zhuang
    [J]. IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2024, : 7435 - 7447
  • [38] Gradient Descent Using Stochastic Circuits for Efficient Training of Learning Machines
    Liu, Siting
    Jiang, Honglan
    Liu, Leibo
    Han, Jie
    [J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2018, 37 (11) : 2530 - 2541
  • [39] Adaptive stochastic gradient descent for optimal control of parabolic equations with random parameters
    Cao, Yanzhao
    Das, Somak
    Wyk, Hans-Werner
    [J]. NUMERICAL METHODS FOR PARTIAL DIFFERENTIAL EQUATIONS, 2022, 38 (06) : 2104 - 2122
  • [40] OPTIMAL SURVEY SCHEMES FOR STOCHASTIC GRADIENT DESCENT WITH APPLICATIONS TO M-ESTIMATION
    Clemencon, Stephan
    Bertail, Patrice
    Chautru, Emilie
    Papa, Guillaume
    [J]. ESAIM-PROBABILITY AND STATISTICS, 2019, 23 : 310 - 337