A tutorial on value function approximation for stochastic and dynamic transportation

Cited by: 0
Author
Heinold, Arne [1]
Affiliation
[1] Univ Kiel, Sch Econ & Business, Kiel, Germany
Source
4OR-A QUARTERLY JOURNAL OF OPERATIONS RESEARCH | 2024 / Vol. 22 / No. 01
Keywords
Tutorial; Markov decision process; Approximate dynamic programming; Value function approximation; Reinforcement learning; Optimization
DOI
10.1007/s10288-023-00539-3
Chinese Library Classification (CLC) number
C93 [Management Science]; O22 [Operations Research]
Discipline classification code
070105; 12; 1201; 1202; 120202
Abstract
This paper provides an introductory tutorial on Value Function Approximation (VFA), a solution class from Approximate Dynamic Programming. VFA describes a heuristic way for solving sequential decision processes like a Markov Decision Process. Real-world problems in supply chain management (and beyond) containing dynamic and stochastic elements might be modeled as such processes, but large-scale instances are intractable to be solved to optimality by enumeration due to the curses of dimensionality. VFA can be a proper method for these cases and this tutorial is designed to ease its use in research, practice, and education. For this, the tutorial describes VFA in the context of stochastic and dynamic transportation and makes three main contributions. First, it gives a concise theoretical overview of VFA's fundamental concepts, outlines a generic VFA algorithm, and briefly discusses advanced topics of VFA. Second, the VFA algorithm is applied to the taxicab problem that describes an easy-to-understand transportation planning task. Detailed step-by-step results are presented for a small-scale instance, allowing readers to gain an intuition about VFA's main principles. Third, larger instances are solved by enhancing the basic VFA algorithm demonstrating its general capability to approach more complex problems. The experiments are done with artificial instances and the respective Python scripts are part of an electronic appendix. Overall, the tutorial provides the necessary knowledge to apply VFA to a wide range of stochastic and dynamic settings and addresses likewise researchers, lecturers, tutors, students, and practitioners.
Pages: 145-173
Page count: 29
Related papers
50 items in total
  • [31] Stochastic Dual Dynamic Programming for transportation planning under demand uncertainty
    Fhoula, Boutheina
    Hajji, Adnene
    Rekik, Monia
    2013 INTERNATIONAL CONFERENCE ON ADVANCED LOGISTICS AND TRANSPORT (ICALT), 2013, : 550 - 555
  • [32] Anticipatory approach for dynamic and stochastic shipment matching in hinterland synchromodal transportation
    Guo, Wenjing
    Atasoy, Bilge
    van Blokland, Wouter Beelaerts
    Negenborn, Rudy R.
    FLEXIBLE SERVICES AND MANUFACTURING JOURNAL, 2022, 34 (02) : 483 - 517
  • [33] Greedy feature replacement for online value function approximation
    Zhao, Feng-fei
    Qin, Zheng
    Shao, Zhuo
    Fang, Jun
    Ren, Bo-yan
    JOURNAL OF ZHEJIANG UNIVERSITY-SCIENCE C-COMPUTERS & ELECTRONICS, 2014, 15 (03) : 223 - 231
  • [34] Greedy feature replacement for online value function approximation
    Feng-fei Zhao
    Zheng Qin
    Zhuo Shao
    Jun Fang
    Bo-yan Ren
    Journal of Zhejiang University SCIENCE C, 2014, 15 : 223 - 231
  • [35] Greedy feature replacement for online value function approximation
    Feng-fei ZHAO
    Zheng QIN
    Zhuo SHAO
    Jun FANG
    Bo-yan REN
    Journal of Zhejiang University-Science C(Computers & Electronics), 2014, 15 (03) : 223 - 231
  • [36] Control of a Water Tank System with Value Function Approximation
    Lalvani, Shamal
    Katsaggelos, Aggelos
    ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS, AIAI 2023, PT I, 2023, 675 : 36 - 44
  • [37] Restricted gradient-descent algorithm for value-function approximation in reinforcement learning
    Salles Barreto, Andre da Motta
    Anderson, Charles W.
    ARTIFICIAL INTELLIGENCE, 2008, 172 (4-5) : 454 - 482
  • [38] OPTIMIZATION OF DYNAMIC RAMP METERING CONTROL WITH SIMULTANEOUS PERTURBATION STOCHASTIC APPROXIMATION
    Chien, S. I.
    Luo, J.
    CONTROL AND INTELLIGENT SYSTEMS, 2008, 36 (01)
  • [39] Value function gradient learning for large-scale multistage stochastic programming problems
    Lee, Jinkyu
    Bae, Sanghyeon
    Kim, Woo Chang
    Lee, Yongjae
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2023, 308 (01) : 321 - 335
  • [40] Discrete Simultaneous Perturbation Stochastic Approximation on Loss Function with Noisy Measurements
    Wang, Qi
    Spall, James C.
    2011 AMERICAN CONTROL CONFERENCE, 2011, : 4520 - 4525