Simulated annealing with reinforcement learning for the set team orienteering problem with time windows

被引:7
作者
Yu, Vincent F. [1 ,2 ]
Salsabila, Nabila Yuraisyah [1 ]
Lin, Shih-Wei [3 ,4 ,5 ]
Gunawan, Aldy [6 ]
机构
[1] Natl Taiwan Univ Sci & Technol, Dept Ind Management, Taipei, Taiwan
[2] Natl Taiwan Univ Sci & Technol, Ctr Cyber Phys Syst Innovat, Taipei, Taiwan
[3] Chang Gung Univ, Dept Informat Management, Taoyuan, Taiwan
[4] Ming Chi Univ Technol, Dept Ind Engn & Management, Taipei, Taiwan
[5] Keelung Chang Gung Mem Hosp, Dept Emergency Med, Keelung City, Taiwan
[6] Singapore Management Univ, Sch Comp & Informat Syst, Singapore, Singapore
关键词
Team orienteering problem with time windows; Set orienteering problem; Simulated annealing; LOCAL SEARCH; HEURISTICS; ALGORITHMS;
D O I
10.1016/j.eswa.2023.121996
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
This research investigates the Set Team Orienteering Problem with Time Windows (STOPTW), a new variant of the well-known Team Orienteering Problem with Time Windows and Set Orienteering Problem. In the STOPTW, customers are grouped into clusters. Each cluster is associated with a profit attainable when a customer in the cluster is visited within the customer's time window. A Mixed Integer Linear Programming model is formulated for STOPTW to maximizing total profit while adhering to time window constraints. Since STOPTW is an NP-hard problem, a Simulated Annealing with Reinforcement Learning (SARL) algorithm is developed. The proposed SARL incorporates the core concepts of reinforcement learning, utilizing the epsilon-greedy algorithm to learn the fitness values resulting from neighborhood moves. Numerical experiments are conducted to assess the performance of SARL, comparing the results with those obtained by CPLEX and Simulated Annealing (SA). For small instances, both SARL and SA algorithms outperform CPLEX by obtaining eight optimal solutions and 12 better solutions. For large instances, both algorithms obtain better solutions to 28 out of 29 instances within shorter computational times compared to CPLEX. Overall, SARL outperforms SA by resulting in lower gap percentages within the same computational times. Specifically, SARL outperforms SA in solving 13 large STOPTW benchmark instances. Finally, a sensitivity analysis is conducted to derive managerial insights.
引用
收藏
页数:15
相关论文
共 45 条
  • [1] Effective neighborhood search with optimal splitting and adaptive memory for the team orienteering problem with time windows
    Amarouche, Youcef
    Guibadj, Rym Nesrine
    Chaalal, Elhadja
    Moukrim, Aziz
    [J]. COMPUTERS & OPERATIONS RESEARCH, 2020, 123
  • [2] The Clustered Orienteering Problem
    Angelelli, E.
    Archetti, C.
    Vindigni, M.
    [J]. EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2014, 238 (02) : 404 - 414
  • [3] Metaheuristics for the team orienteering problem
    Archetti, Claudia
    Hertz, Alain
    Speranza, Maria Grazia
    [J]. JOURNAL OF HEURISTICS, 2007, 13 (01) : 49 - 76
  • [4] The Set Orienteering Problem
    Archetti, Claudia
    Carrabs, Francesco
    Cerulli, Raffaele
    [J]. EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2018, 267 (01) : 264 - 272
  • [5] Archetti C, 2014, MOS-SIAM SER OPTIMIZ, P273
  • [6] Boussier S., 2006, An exact algorithm for team orienteering problems, V4or, P211
  • [7] The team orienteering problem
    Chao, IM
    Golden, BL
    Wasil, EA
    [J]. EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 1996, 88 (03) : 464 - 474
  • [8] A fast and effective heuristic for the orienteering problem
    Chao, IM
    Golden, BL
    Wasil, EA
    [J]. EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 1996, 88 (03) : 475 - 489
  • [9] Dontas M., 2023, European Journal of Operational Research
  • [10] A multi-objective open set orienteering problem
    Dutta, Joydeep
    Barma, Partha Sarathi
    Mukherjee, Anupam
    Kar, Samarjit
    De, Tanmay
    [J]. NEURAL COMPUTING & APPLICATIONS, 2020, 32 (17) : 13953 - 13969