37 entries in total
[1]
[Anonymous], 1989, LEARNING DELAYED REW
[3]
Carroll J., 2006, MEASUREMENT ERROR NO, V2nd edn
[6]
Chakraborty B., 2013, Statistical Methods for Dynamic Treatment Regimes: Reinforcement Learning, Causal Inference, and Personalized Medicine, DOI 10.1007/978-1-4614-7428-9
[10]
Fuller Wayne A., 1987, WILEY SERIES PROBABI