Learning Stochastic Optimal Policies via Gradient Descent

Cited by: 3
Authors
Massaroli, Stefano [1 ]
Poli, Michael [2 ]
Peluchetti, Stefano [3 ]
Park, Jinkyoo [2 ]
Yamashita, Atsushi [1 ]
Asama, Hajime [1 ]
Affiliations
[1] Univ Tokyo, Dept Precis Engn, Tokyo 1138656, Japan
[2] Korea Adv Inst Sci & Technol, Dept Ind & Syst, Daejeon 305335, South Korea
[3] Cogent Labs, Daejeon 305701, South Korea
Source
IEEE CONTROL SYSTEMS LETTERS | 2022, Vol. 6
Funding
National Research Foundation of Singapore;
Keywords
Optimal control; Stochastic processes; Process control; Optimization; Neural networks; Noise measurement; stochastic processes; machine learning; PORTFOLIO SELECTION; CONVERGENCE; ITO;
DOI
10.1109/LCSYS.2021.3086672
CLC number
TP [automation technology, computer technology];
Discipline code
0812;
Abstract
We systematically develop a learning-based treatment of stochastic optimal control (SOC), relying on direct optimization of parametric control policies. We propose a derivation of adjoint sensitivity results for stochastic differential equations through direct application of variational calculus. Then, given an objective function for a predetermined task specifying the desiderata for the controller, we optimize the policy parameters via iterative gradient descent methods. In doing so, we extend the range of applicability of classical SOC techniques, which often require strict assumptions on the functional form of system and control. We verify the performance of the proposed approach on a continuous-time, finite-horizon portfolio optimization problem with proportional transaction costs.
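The recipe the abstract describes, namely simulating a controlled stochastic differential equation, differentiating an expected cost with respect to policy parameters along sampled paths, and updating those parameters by gradient descent, can be illustrated on a toy scalar linear-quadratic problem. This is a hedged sketch, not the authors' implementation: it uses a single feedback gain `k` as the policy parameter, an Euler-Maruyama rollout, and forward sensitivity propagation in place of the paper's adjoint method; all names and constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy SOC problem: dx = u dt + sigma dW with linear policy u = -k x,
# minimizing J(k) = E[ sum_t (x_t^2 + u_t^2) dt ] over a finite horizon.
dt, T, sigma, n_paths = 0.01, 1.0, 0.3, 256
n_steps = int(T / dt)

def cost_and_grad(k, noise):
    """Euler-Maruyama rollout carrying the forward sensitivity s = dx/dk."""
    x = np.ones(n_paths)    # fixed initial state x0 = 1 for every path
    s = np.zeros(n_paths)   # sensitivity of the state w.r.t. the gain k
    J = np.zeros(n_paths)
    dJdk = np.zeros(n_paths)
    for t in range(n_steps):
        u = -k * x
        du = -x - k * s                       # du/dk along the sampled path
        J += (x**2 + u**2) * dt               # running quadratic cost
        dJdk += (2 * x * s + 2 * u * du) * dt  # pathwise cost derivative
        x = x + u * dt + sigma * noise[t]
        s = s + du * dt                       # diffusion term is k-independent
    return J.mean(), dJdk.mean()

# Gradient descent on k with common random numbers across iterations.
noise = rng.normal(0.0, np.sqrt(dt), size=(n_steps, n_paths))
k, lr = 0.0, 0.5
J0, _ = cost_and_grad(k, noise)
for _ in range(100):
    _, g = cost_and_grad(k, noise)
    k -= lr * g
J1, _ = cost_and_grad(k, noise)
```

Freezing the noise draws (common random numbers) makes the Monte Carlo cost a smooth deterministic function of `k`, so plain gradient descent applies; re-sampling the noise each iteration would instead give a stochastic gradient estimate.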
Pages: 1094-1099
Number of pages: 6
Related papers
50 items in total
[41] Ilboudo, Wendyam Eric Lionel; Kobayashi, Taisuke; Sugimoto, Kenji. Robust Stochastic Gradient Descent With Student-t Distribution Based First-Order Momentum [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (03): 1324-1337.
[42] Berestizshevsky, Konstantin; Even, Guy. Sign Based Derivative Filtering for Stochastic Gradient Descent [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2019: DEEP LEARNING, PT II, 2019, 11728: 208-219.
[43] Al-Momani, Sajedah; Mir, Hasan; Al-Nashash, Hasan; Al-Kaylani, Muhammad. Brain Source Localization Using Stochastic Gradient Descent [J]. IEEE SENSORS JOURNAL, 2021, 21 (06): 8375-8383.
[44] Zamora, Erik; Sossa, Humberto. Dendrite morphological neurons trained by stochastic gradient descent [J]. NEUROCOMPUTING, 2017, 260: 420-431.
[45] Sharma, Anuraganand. Guided Stochastic Gradient Descent Algorithm for inconsistent datasets [J]. APPLIED SOFT COMPUTING, 2018, 73: 1068-1080.
[46] Yang, Yang; Mo, Lipo; Hu, Yusen; Long, Fei. The Improved Stochastic Fractional Order Gradient Descent Algorithm [J]. FRACTAL AND FRACTIONAL, 2023, 7 (08).
[47] Li, Qing; Xiong, Diwen; Shang, Mingsheng. Adjusted stochastic gradient descent for latent factor analysis [J]. INFORMATION SCIENCES, 2022, 588: 196-213.
[48] Wood, Killian; Bianchin, Gianluca; Dall'Anese, Emiliano. Online Projected Gradient Descent for Stochastic Optimization With Decision-Dependent Distributions [J]. IEEE CONTROL SYSTEMS LETTERS, 2022, 6: 1646-1651.
[49] Sebbouh, Othmane; Cuturi, Marco; Peyre, Gabriel. Randomized Stochastic Gradient Descent Ascent [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151.
[50] Pang, Zhen; Wang, Hai; Cheng, Jun; Tang, Shengda; Park, Ju H. Stability and Fuzzy Optimal Control for Nonlinear Ito Stochastic Markov Jump Systems via Hybrid Reinforcement Learning [J]. IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2024, 32 (11): 6472-6485.