Deep Neural Networks Algorithms for Stochastic Control Problems on Finite Horizon: Numerical Applications

Cited by: 35
Authors
Bachouch, Achref [1 ]
Hure, Come [2 ]
Langrene, Nicolas [3 ]
Huyen Pham [2 ,4 ]
Affiliations
[1] Malardalen Univ, Sch Educ Culture & Commun UKK, Div Math & Phys, Vasteras, Sweden
[2] Univ Paris Diderot, LPSM, Paris, France
[3] CSIRO Data61, RiskLab Australia, Docklands, Australia
[4] CREST ENSAE, Paris, France
Keywords
Deep learning; Policy learning; Performance iteration; Value iteration; Monte Carlo; Quantization; SIMULATION;
DOI
10.1007/s11009-019-09767-9
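Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Subject classification codes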
020208 ; 070103 ; 0714 ;
Abstract
This paper presents several numerical applications of the deep learning-based algorithms for discrete-time stochastic control problems over a finite time horizon that were introduced in Hure et al. (2018). Numerical and comparative tests using TensorFlow illustrate the performance of our different algorithms, namely control learning by performance iteration (algorithms NNcontPI and ClassifPI) and control learning by hybrid iteration (algorithms Hybrid-Now and Hybrid-LaterQ), on the 100-dimensional nonlinear PDE examples from E et al. (2017) and on quadratic backward stochastic differential equations as in Chassagneux and Richou (2016). We also performed tests on low-dimensional control problems such as an option hedging problem in finance, as well as energy storage problems arising in the valuation of gas storage and in microgrid management. We also provide numerical results and comparisons with the quantization-type algorithm Qknn, an efficient algorithm for numerically solving low-dimensional control problems.
Pages: 143-178
Number of pages: 36
Related papers
20 items in total
[1]  
Alasseur Clemence, 2019, ESAIM: Proceedings and Surveys, V65, P46, DOI 10.1051/proc/201965046
[2]  
[Anonymous], 2018, ARXIV181204300
[3]   Finite-time analysis of the multiarmed bandit problem [J].
Auer, P ;
Cesa-Bianchi, N ;
Fischer, P .
MACHINE LEARNING, 2002, 47 (2-3) :235-256
[4]  
Balata Alessandro, 2019, ESAIM: Proceedings and Surveys, V65, P115, DOI 10.1051/proc/201965114
[5]   Hedging derivative securities and incomplete markets: An ε-arbitrage approach [J].
Bertsimas, D ;
Kogan, L ;
Lo, AW .
OPERATIONS RESEARCH, 2001, 49 (03) :372-397
[6]   Valuation of energy storage: an optimal switching approach [J].
Carmona, Rene ;
Ludkovski, Michael .
QUANTITATIVE FINANCE, 2010, 10 (04) :359-374
[7]   Machine Learning for Semi Linear PDEs [J].
Chan-Wai-Nam, Quentin ;
Mikael, Joseph ;
Warin, Xavier .
JOURNAL OF SCIENTIFIC COMPUTING, 2019, 79 (03) :1667-1712
[8]   NUMERICAL SIMULATION OF QUADRATIC BSDES [J].
Chassagneux, Jean-Francois ;
Richou, Adrien .
ANNALS OF APPLIED PROBABILITY, 2016, 26 (01) :262-304
[9]   Deep Learning-Based Numerical Methods for High-Dimensional Parabolic Partial Differential Equations and Backward Stochastic Differential Equations [J].
E, Weinan ;
Han, Jiequn ;
Jentzen, Arnulf .
COMMUNICATIONS IN MATHEMATICS AND STATISTICS, 2017, 5 (04) :349-380
[10]  
Goodfellow I, 2016, ADAPT COMPUT MACH LE, P1