Implementing a GPU-based parallel MAX-MIN Ant System

被引:19
作者
Skinderowicz, Rafal [1 ]
机构
[1] Univ Silesia Katowice, Fac Sci & Technol, Bedzinska 39, PL-41205 Sosnowiec, Poland
来源
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE | 2020年 / 106卷
关键词
Parallel MAX-MIN Ant System; Weighted reservoir sampling; Ant Colony Optimization; GPU; CUDA; COLONY OPTIMIZATION ALGORITHM;
D O I
10.1016/j.future.2020.01.011
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
The MAX-MIN Ant System (MMAS) is one of the best-known Ant Colony Optimization (ACO) algorithms proven to be efficient at finding satisfactory solutions to many difficult combinatorial optimization problems. The slow-down in Moore's law, and the availability of graphics processing units (GPUs) capable of conducting general-purpose computations at high speed, has sparked considerable research efforts into the development of GPU-based ACO implementations. In this paper, we discuss a range of novel ideas for improving the GPU-based parallel MMAS implementation, allowing it to better utilize the computing power offered by two subsequent Nvidia GPU architectures. Specifically, based on the weighted reservoir sampling algorithm we propose a novel parallel implementation of the node selection procedure, which is at the heart of the MMAS and other ACO algorithms. We also present a memory-efficient implementation of another key-component - the tabu list structure - which is used in the ACO's solution construction stage. The proposed implementations, combined with the existing approaches, lead to a total of six MMAS variants, which are evaluated on a set of Traveling Salesman Problem (TSP) instances ranging from 198 to 3795 cities. The results show that our MMAS implementation is competitive with state-of-the-art GPU-based and multi-core CPU-based parallel ACO implementations: in fact, the times obtained for the Nvidia V100 Volta GPU were up to 7.18x and 21.79x smaller, respectively. The fastest of the proposed MMAS variants is able to generate over 1 million candidate solutions per second when solving a 1002-city instance. Moreover, we show that, combined with the 2-opt local search heuristic, the proposed parallel MMAS finds high-quality solutions for the TSP instances with up to 18,512 nodes. (C) 2020 Elsevier B.V. All rights reserved.
引用
收藏
页码:277 / 295
页数:19
相关论文
共 52 条
[1]  
[Anonymous], 2005, Fundamentals of Computational Swarm Intelligence
[2]  
[Anonymous], 2018, CORR
[3]  
[Anonymous], Top500 Supercomputer Sites
[4]  
[Anonymous], NVIDIA TESL V100 GPU
[5]  
Applegate D. L., 2007, Princeton Series in Applied Mathematics
[6]  
Bentley J. L., 1992, ORSA Journal on Computing, V4, P387, DOI 10.1287/ijoc.4.4.387
[7]   Comparing GPU-parallelized metaheuristics to branch-and-bound for batch plants optimization [J].
Borisenko, Andrey ;
Gorlatch, Sergei .
JOURNAL OF SUPERCOMPUTING, 2019, 75 (12) :7921-7933
[8]   Strategies for accelerating Ant Colony Optimization algorithms on Graphical Processing Units [J].
Catala, Alejandro ;
Jaen, Javier ;
Mocholi, Jose A. .
2007 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1-10, PROCEEDINGS, 2007, :492-+
[9]   High-throughput Ant Colony Optimization on graphics processing units [J].
Cecilia, Jose M. ;
Llanes, Antonio ;
Abellan, Jose L. ;
Gomez-Luna, Juan ;
Chang, Li-Wen ;
Hwu, Wen-Mei W. .
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2018, 113 :261-274
[10]   Enhancing data parallelism for Ant Colony Optimization on GPUs [J].
Cecilia, Jose M. ;
Garcia, Jose M. ;
Nisbet, Andy ;
Amos, Martyn ;
Ujaldon, Manuel .
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2013, 73 (01) :42-51