Hedging of financial derivative contracts via Monte Carlo tree search

Cited by: 0
Author
Szehr, Oleg [1]
Affiliation
[1] SUPSI USI, Dalle Molle Inst Artificial Intelligence IDSIA, Via La Santa 1, CH-6962 Lugano, Switzerland
Keywords
reinforcement learning; Monte Carlo tree search (MCTS); pricing and hedging of derivative contracts; AlphaZero; utility optimization; CONTINGENT CLAIMS; ALGORITHM; OPTIONS; GAME; GO;
DOI
10.21314/JCF.2023.009
Chinese Library Classification number
F8 [Public Finance, Finance];
Subject classification code
0202;
Abstract
The construction of replication strategies for the pricing and hedging of derivative contracts in incomplete markets is a key problem in financial engineering. We interpret this problem as a "game with the world", where one player (the investor) bets on what will happen and the other player (the market) decides what will happen. Inspired by the success of the Monte Carlo tree search (MCTS) in a variety of games and stochastic multiperiod planning problems, we introduce this algorithm as a method for replication in the presence of risk and market friction. Unlike model-free reinforcement learning methods (such as Q-learning), MCTS makes explicit use of an environment model. The role of this model is taken by a market simulator, which is frequently adopted even in the training of model-free methods, but its use allows MCTS to plan for the consequences of decisions prior to the execution of actions. We conduct experiments with the AlphaZero variant of MCTS on toy examples of simple market models and derivatives with simple payoff structures. We show that MCTS is capable of maximizing the utility of the investor's terminal wealth in a setting where no external pricing information is available and rewards are granted only as a result of contractual cashflows. In this setting, we observe that MCTS has superior performance compared with the deep Q-network algorithm and comparable performance to "deep-hedging" methods.
Pages: 47-80
Page count: 34