Multiobjective Combinatorial Optimization Using a Single Deep Reinforcement Learning Model

被引：26

作者：

Wang, Zhenkun ^{[1
,2
]}

Yao, Shunyu ^{[1
]}

Li, Genghui ^{[1
]}

Zhang, Qingfu ^{[3
]}

机构：

[1] Southern Univ Sci & Technol, Sch Syst Design & Intelligent Mfg, Shenzhen 518055, Peoples R China

[2] Southern Univ Sci & Technol, Dept Comp Sci & Engn, Shenzhen 518055, Peoples R China

[3] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China

来源：

IEEE TRANSACTIONS ON CYBERNETICS | 2024年 / 54卷 / 03期

基金：

中国国家自然科学基金;

关键词：

Attention mechanism; combinatorial optimization (CO); deep reinforcement learning; multiobjective optimization; routing network; traveling salesman problem; EVOLUTIONARY ALGORITHM; LOCAL SEARCH; DECOMPOSITION;

D O I：

10.1109/TCYB.2023.3312476

中图分类号：

TP [自动化技术、计算机技术];

学科分类号：

0812 ;

摘要：

This article proposes utilizing a single deep reinforcement learning model to solve combinatorial multiobjective optimization problems. We use the well-known multiobjective traveling salesman problem (MOTSP) as an example. Our proposed method employs an encoder-decoder framework to learn the mapping from the MOTSP instance to its Paretooptimal set. Specifically, it leverages a novel routing encoder to extract information for both the entire multiobjective aspect and every individual objective from the MOTSP instance. The global embeddings and each objective's embeddings are adaptively aggregated via a routing network to form the subproblems' embedding that can well represent the MOTSP features. Using a modified context embedding, the subproblems' embeddings are fed into a decoder to produce a set of approximate Paretooptimal solutions in parallel. Additionally, we develop a Top-k baseline to enable more efficient data utilization and lightweight training for our proposed method. We compare our method with heuristic-based and learning-based ones on various types of MOTSP instances, and the experimental results show that our method can solve MOTSP instances in real-time and outperform the other algorithms, especially on large-scale problem instances.

引用

页码：1984 / 1996

页数：13

共 69 条

[1] Multiobjective Flexible Job-Shop Rescheduling With New Job Insertion and Machine Preventive Maintenance [J].

An, Youjun ;

Chen, Xiaohui ;

Gao, Kaizhou ;

Li, Yinghe ;

Zhang, Lin .

IEEE TRANSACTIONS ON CYBERNETICS, 2023, 53 (05) :3101-3113

[2]

Ba J.L., 2016, arXiv preprint arXiv:1607.06450, DOI DOI 10.48550/ARXIV.1607.06450

[3] HypE: An Algorithm for Fast Hypervolume-Based Many-Objective Optimization [J].

Bader, Johannes ;

Zitzler, Eckart .

EVOLUTIONARY COMPUTATION, 2011, 19 (01) :45-76

[4]

Bansal S., 2018, Asian Journal of Computer Science and Technology, V7, P1, DOI DOI 10.1007/S00216-018-1378-Y

[5]

Bello I., 2017, P INT C LEARN REPR

[6] Machine learning for combinatorial optimization: A methodological tour d'horizon [J].

Bengio, Yoshua ;

Lodi, Andrea ;

Prouvost, Antoine .

EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2021, 290 (02) :405-421

[7] Decomposition-Based Lin-Kernighan Heuristic With Neighborhood Structure Transfer for Multi/Many-Objective Traveling Salesman Problem [J].

Cai, Xinye ;

Wang, Kang ;

Mei, Yi ;

Li, Zhenhua ;

Zhao, Jun ;

Zhang, Qingfu .

IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2023, 27 (06) :1604-1617

[8] A Grid-Based Inverted Generational Distance for Multi/Many-Objective Optimization [J].

Cai, Xinye ;

Xiao, Yushun ;

Li, Miqing ;

Hu, Han ;

Ishibuchi, Hisao ;

Li, Xiaoping .

IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2021, 25 (01) :21-34

[9] The Collaborative Local Search Based on Dynamic-Constrained Decomposition With Grids for Combinatorial Multiobjective Optimization [J].

Cai, Xinye ;

Xia, Chao ;

Zhang, Qingfu ;

Mei, Zhiwei ;

Hu, Han ;

Wang, Lisong ;

Hu, Jun .

IEEE TRANSACTIONS ON CYBERNETICS, 2021, 51 (05) :2639-2650

[10]

Chen XY, 2019, ADV NEUR IN, V32

← 1 2 3 4 5 6 7 →