Optimal pivot path of the simplex method for linear programming based on reinforcement learning

Cited by: 1
Authors
Li, Anqi [1]
Guo, Tiande [1]
Han, Congying [1]
Li, Bonan [1]
Li, Haoran [1]
Affiliations
[1] Univ Chinese Acad Sci, Sch Math Sci, Beijing 100049, Peoples R China
Funding
National Natural Science Foundation of China; National Key R&D Program of China
Keywords
simplex method; linear programming; pivot rules; reinforcement learning; MONTE CARLO TREE-SEARCH; OPTIMIZATION; NETWORKS; RULE; GAME; GO
DOI
10.1007/s11425-022-2259-1
CLC number
O29 [Applied Mathematics]
Subject classification code
070104
Abstract
Under all existing pivot rules, the simplex method for linear programming is not polynomial in the worst case, so finding an optimal pivot for the simplex method is crucial. In this paper, we propose an optimal rule, based on Monte Carlo tree search, that finds all the shortest pivot paths of the simplex method for linear programming problems. Specifically, we first propose the SimplexPseudoTree, which casts the simplex method as a tree search while avoiding repeated basic variables. Second, we propose four reinforcement learning models, combining two actions and two rewards, that make Monte Carlo tree search suitable for the simplex method. Third, we introduce a new action selection criterion to mitigate the inaccurate value estimates of the initial exploration. It is proved that when the number of vertices in the feasible region is C_n^m (the binomial coefficient), our method generates all the shortest pivot paths, which is polynomial in the number of variables. In addition, we experimentally validate that the proposed scheme avoids unnecessary search and provides the optimal pivot path. Furthermore, this method can provide optimal pivot labels for all kinds of supervised learning methods for solving linear programming problems.
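To make the abstract's notion of a "shortest pivot path" concrete, the sketch below replaces the paper's Monte Carlo tree search with plain breadth-first search over the basis-adjacency graph of a toy LP in standard form; on small instances BFS provably returns a shortest pivot path. All function names (`feasible_bases`, `shortest_pivot_path`) and the example LP are illustrative assumptions, not the paper's method.

```python
# Hedged sketch: exhaustive BFS over the basis-adjacency graph of a small LP
# in standard form (max c^T x, Ax = b, x >= 0), standing in for the paper's
# MCTS-based search. Each BFS edge is one simplex pivot.
from collections import deque
from itertools import combinations

import numpy as np


def feasible_bases(A, b, tol=1e-9):
    """Enumerate all bases whose basic solution is feasible (the LP vertices)."""
    m, n = A.shape
    out = {}
    for B in combinations(range(n), m):
        sub = A[:, B]
        if abs(np.linalg.det(sub)) < tol:
            continue  # singular column set: not a basis
        x = np.linalg.solve(sub, b)
        if (x >= -tol).all():
            out[B] = x  # basic feasible solution
    return out


def shortest_pivot_path(A, b, c, start):
    """BFS from the `start` basis to any optimal basis.

    Two bases are adjacent iff they differ in exactly one column,
    i.e. one pivot (one entering and one leaving variable).
    """
    bases = feasible_bases(A, b)
    vals = {B: float(c[list(B)] @ x) for B, x in bases.items()}
    best = max(vals.values())
    targets = {B for B, v in vals.items() if abs(v - best) < 1e-9}
    prev, queue = {start: None}, deque([start])
    while queue:
        B = queue.popleft()
        if B in targets:  # reached an optimal vertex: rebuild the path
            path = []
            while B is not None:
                path.append(B)
                B = prev[B]
            return path[::-1]
        for Bn in bases:
            if Bn not in prev and len(set(B) ^ set(Bn)) == 2:
                prev[Bn] = B
                queue.append(Bn)
    return None


# Toy LP: max x1 + x2  s.t.  x1 <= 2, x2 <= 2, x1 + x2 <= 3  (slacks added).
A = np.array([[1.0, 0, 1, 0, 0], [0, 1, 0, 1, 0], [1, 1, 0, 0, 1]])
b = np.array([2.0, 2.0, 3.0])
c = np.array([1.0, 1.0, 0.0, 0.0, 0.0])
path = shortest_pivot_path(A, b, c, start=(2, 3, 4))  # all-slack starting basis
print(len(path) - 1)  # pivots on a shortest path: 2, e.g. (0,0) -> (2,0) -> (2,1)
```

Unlike a simplex run under a fixed pivot rule, the BFS may traverse non-improving pivots, which is exactly what allows it to certify that no shorter path exists; the paper's contribution is reaching such paths without enumerating all bases.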
Pages: 1263-1286
Page count: 24
Related papers
50 records
  • [31] Linear quadratic optimal control method based on output feedback inverse reinforcement Q-learning
    Liu, Wen
    Fan, Jia-Lu
    Xue, Wen-Qian
    Kongzhi Lilun Yu Yingyong/Control Theory and Applications, 2024, 41 (08): : 1469 - 1479
  • [32] SIMPLEX METHOD OF LINEAR PROGRAMMING USING LU DECOMPOSITION
    BARTELS, RH
    GOLUB, GH
    COMMUNICATIONS OF THE ACM, 1969, 12 (05) : 266 - 268
  • [33] On a variant of the simplex method for a linear semidefinite programming problem
    Zhadan, V. G.
    TRUDY INSTITUTA MATEMATIKI I MEKHANIKI URO RAN, 2015, 21 (03): : 117 - 127
  • [34] Reinforcement Learning Based Optimal Control of Linear Singularly Perturbed Systems
    Zhao, Jianguo
    Yang, Chunyu
    Gao, Weinan
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2022, 69 (03) : 1362 - 1366
  • [35] Personas-based Student Grouping using reinforcement learning and linear programming
    Ma, Shaojie
    Luo, Yawei
    Yang, Yi
    KNOWLEDGE-BASED SYSTEMS, 2023, 281
  • [36] Learning Path Recommendation Based on Reinforcement Learning
    Li, Ji
    Yu, Simiao
    Zhang, Tiancheng
    ENGINEERING LETTERS, 2024, 32 (09) : 1823 - 1832
  • [37] Multiconstrained Gliding Guidance Based on Optimal and Reinforcement Learning Method
    Zhe, Luo
    Xinsan, Li
    Lixin, Wang
    Qiang, Shen
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2021, 2021
  • [38] Efficient Deep Reinforcement Learning for Optimal Path Planning
    Ren, Jing
    Huang, Xishi
    Huang, Raymond N.
    ELECTRONICS, 2022, 11 (21)
  • [39] Improved Robot Path Planning Method Based on Deep Reinforcement Learning
    Han, Huiyan
    Wang, Jiaqi
    Kuang, Liqun
    Han, Xie
    Xue, Hongxin
    SENSORS, 2023, 23 (12)
  • [40] A path planning method based on deep reinforcement learning for crowd evacuation
    Meng X.
    Liu H.
    Li W.
    Journal of Ambient Intelligence and Humanized Computing, 2024, 15 (6) : 2925 - 2939