RANDOMIZED DYNAMIC PROGRAMMING PRINCIPLE AND FEYNMAN-KAC REPRESENTATION FOR OPTIMAL CONTROL OF MCKEAN-VLASOV DYNAMICS

Cited by: 57
Authors
Bayraktar, Erhan [1]
Cosso, Andrea [2, 3]
Pham, Huyen [4, 5]
Affiliations
[1] Univ Michigan, Dept Math, 530 Church St, Ann Arbor, MI 48109 USA
[2] Politecn Milan, Dipartimento Matemat, Via Bonardi 9, I-20133 Milan, Italy
[3] Univ Bologna, Dipartimento Matemat, Piazza Porta S Donato 5, I-40126 Bologna, Italy
[4] Univ Paris Diderot, Lab Probabilites & Modeles Aleatoires, CNRS, UMR 7599, F-75205 Paris 13, France
[5] Univ Paris Diderot, Lab Probabilites & Modeles Aleatoires, CNRS, UMR 7599, Crest, France
Funding
U.S. National Science Foundation
Keywords
Controlled McKean-Vlasov stochastic differential equations; dynamic programming principle; randomization method; forward-backward stochastic differential equations; stochastic differential equations; backward SDE representation; mean-field games
DOI
10.1090/tran/7118
Chinese Library Classification (CLC) number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
We analyze a stochastic optimal control problem in which the state process follows McKean-Vlasov dynamics and the diffusion coefficient can be degenerate. We prove that its value function V admits a nonlinear Feynman-Kac representation in terms of a class of forward-backward stochastic differential equations with an autonomous forward process. We exploit this probabilistic representation to rigorously prove the dynamic programming principle (DPP) for V. The Feynman-Kac representation plays an important role beyond serving as an intermediate step toward our main result: it should also be useful in developing probabilistic numerical schemes for V. The DPP is important for characterizing the value function as a solution of a nonlinear partial differential equation (the so-called Hamilton-Jacobi-Bellman equation), in this case on the Wasserstein space of measures. We note that the usual way of solving these equations is through the Pontryagin maximum principle, which requires some convexity assumptions. There have been earlier attempts to use the dynamic programming approach, but these works assumed a priori that the controls were of Markovian feedback type, which allows the problem to be written solely in terms of the distribution of the state process (so that the control problem becomes deterministic). In this paper, we consider open-loop controls and derive the dynamic programming principle in this most general case. To obtain the Feynman-Kac representation and the randomized dynamic programming principle, we implement the so-called randomization method, which consists of formulating a new McKean-Vlasov control problem, expressed in weak form, taking the supremum over a family of equivalent probability measures. One of the main results of the paper is the proof that this latter control problem has the same value function V as the original control problem.
Pages: 2115-2160
Number of pages: 46