Reinforcement learning with dynamic convex risk measures

Cited by: 7

Authors:

Coache, Anthony [1]

Jaimungal, Sebastian [1,2]

Affiliations:

[1] Univ Toronto, Dept Stat Sci, Toronto, ON, Canada

[2] Univ Oxford, Oxford Man Inst, Oxford, England

Source:

MATHEMATICAL FINANCE | 2024 / Vol. 34 / No. 02

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC);

Keywords:

actor-critic algorithm; dynamic risk measures; financial hedging; policy gradient; reinforcement learning; robot control; time-consistency; trading strategies; APPROXIMATE; NETWORKS;

DOI:

10.1111/mafi.12388

Chinese Library Classification (CLC):

F8 [Public Finance, Finance];

Discipline classification code:

0202;

Abstract:

We develop an approach for solving time-consistent risk-sensitive stochastic optimization problems using model-free reinforcement learning (RL). Specifically, we assume agents assess the risk of a sequence of random variables using dynamic convex risk measures. We employ a time-consistent dynamic programming principle to determine the value of a particular policy, and develop policy gradient update rules that aid in obtaining optimal policies. We further develop an actor-critic style algorithm using neural networks to optimize over policies. Finally, we demonstrate the performance and flexibility of our approach by applying it to three optimization problems: statistical arbitrage trading strategies, financial hedging, and obstacle avoidance robot control.
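
The time-consistent recursion mentioned in the abstract can be illustrated with a small toy computation. The sketch below is not the authors' algorithm (which is model-free and uses neural networks); it assumes a known cost distribution and picks the entropic risk measure as one standard example of a convex risk measure. The function names `entropic_risk` and `dynamic_risk`, the cost values, and the probabilities are all hypothetical and chosen only for illustration.

```python
import math

def entropic_risk(costs, probs, gamma=1.0):
    """One-step entropic risk (1/gamma) * log E[exp(gamma * cost)].

    Assumption: used here only as an example of a convex risk measure.
    The max is subtracted before exponentiating for numerical stability.
    """
    scaled = [gamma * c for c in costs]
    m = max(scaled)
    log_mean = m + math.log(sum(p * math.exp(s - m) for s, p in zip(scaled, probs)))
    return log_mean / gamma

def dynamic_risk(costs, probs, horizon, gamma=1.0):
    """Backward recursion V_t = rho(c_t + V_{t+1}) with V_T = 0.

    Toy setup (assumption): every period draws its cost from the same
    distribution, independently of the past, so the value is a single number.
    """
    value = 0.0
    for _ in range(horizon):
        value = entropic_risk([c + value for c in costs], probs, gamma)
    return value

# Hypothetical per-period cost distribution.
costs = [0.0, 1.0, 4.0]
probs = [0.5, 0.3, 0.2]

one_step = entropic_risk(costs, probs)
three_step = dynamic_risk(costs, probs, horizon=3)
expected_cost = sum(p * c for p, c in zip(probs, costs))
```

In this toy setting the nested value over three periods equals three times the one-step risk, because the entropic risk measure is translation invariant, and the one-step risk exceeds the expected cost, which reflects risk aversion.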

Pages: 557-587

Page count: 31
