Optimal pivot path of the simplex method for linear programming based on reinforcement learning

被引:1
|
作者
Li, Anqi [1 ]
Guo, Tiande [1 ]
Han, Congying [1 ]
Li, Bonan [1 ]
Li, Haoran [1 ]
机构
[1] Univ Chinese Acad Sci, Sch Math Sci, Beijing 100049, Peoples R China
基金
中国国家自然科学基金; 国家重点研发计划;
关键词
simplex method; linear programming; pivot rules; reinforcement learning; CARLO TREE-SEARCH; OPTIMIZATION; NETWORKS; RULE; GAME; GO;
D O I
10.1007/s11425-022-2259-1
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
Based on the existing pivot rules, the simplex method for linear programming is not polynomial in the worst case. Therefore, the optimal pivot of the simplex method is crucial. In this paper, we propose the optimal rule to find all the shortest pivot paths of the simplex method for linear programming problems based on Monte Carlo tree search. Specifically, we first propose the SimplexPseudoTree to transfer the simplex method into tree search mode while avoiding repeated basis variables. Secondly, we propose four reinforcement learning models with two actions and two rewards to make the Monte Carlo tree search suitable for the simplex method. Thirdly, we set a new action selection criterion to ameliorate the inaccurate evaluation in the initial exploration. It is proved that when the number of vertices in the feasible region is Cnm, our method can generate all the shortest pivot paths, which is the polynomial of the number of variables. In addition, we experimentally validate that the proposed schedule can avoid unnecessary search and provide the optimal pivot path. Furthermore, this method can provide the best pivot labels for all kinds of supervised learning methods to solve linear programming problems.
引用
收藏
页码:1263 / 1286
页数:24
相关论文
共 50 条
  • [1] Optimal pivot path of the simplex method for linear programming based on reinforcement learning
    Anqi Li
    Tiande Guo
    Congying Han
    Bonan Li
    Haoran Li
    Science China(Mathematics), 2024, 67 (06) : 1263 - 1286
  • [2] Reinforcement learning of simplex pivot rules: a proof of concept
    Suriyanarayana, Varun
    Tavaslioglu, Onur
    Patel, Ankit B.
    Schaefer, Andrew J.
    OPTIMIZATION LETTERS, 2022, 16 (08) : 2513 - 2525
  • [3] Reinforcement learning of simplex pivot rules: a proof of concept
    Varun Suriyanarayana
    Onur Tavaslıoğlu
    Ankit B. Patel
    Andrew J. Schaefer
    Optimization Letters, 2022, 16 : 2513 - 2525
  • [4] A Hybrid Linear Programming-Reinforcement Learning Method for Optimal Energy Hub Management
    Ghadertootoonchi, Alireza
    Moeini-Aghtaie, Moein
    Davoudi, Mehdi
    IEEE TRANSACTIONS ON SMART GRID, 2023, 14 (01) : 157 - 166
  • [5] Reinforcement Learning for Optimal Path Length of Nanobots Using Dynamic Programming
    Lambe, Amruta
    2017 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL AND INFORMATION SYSTEMS (ICIIS), 2017, : 414 - 419
  • [6] A new admissible pivot method for linear programming
    Lim, S
    Park, S
    ASIA-PACIFIC JOURNAL OF OPERATIONAL RESEARCH, 2004, 21 (04) : 421 - 434
  • [7] A Simplex Based Parametric Programming Method for the Large Linear Programming Problem
    Huang, Rujun
    Lou, Xinyuan
    INTERNATIONAL MULTICONFERENCE OF ENGINEERS AND COMPUTER SCIENTIST, IMECS 2012, VOL II, 2012, : 1486 - 1490
  • [8] COMPARISON OF PRIMAL-SIMPLEX AND COMPLEMENTARY PIVOT METHODS FOR LINEAR PROGRAMMING
    RAVINDRAN, A
    NAVAL RESEARCH LOGISTICS, 1973, 20 (01) : 95 - 100
  • [9] The Optimal Path Finding Algorithm Based on Reinforcement Learning
    Khekare, Ganesh
    Verma, Pushpneel
    Dhanre, Urvashi
    Raut, Seema
    Sheikh, Shahrukh
    INTERNATIONAL JOURNAL OF SOFTWARE SCIENCE AND COMPUTATIONAL INTELLIGENCE-IJSSCI, 2020, 12 (04): : 1 - 18
  • [10] The (Dantzig) simplex method for linear programming
    Nash, JC
    COMPUTING IN SCIENCE & ENGINEERING, 2000, 2 (01) : 29 - 31