High performance ant colony system based on GPU warp specialization with a static-dynamic balanced candidate set strategy

被引:6
作者
Zhi-bin, Huang [1 ]
Guang-Tao, Fu [1 ]
Tian-Hao, Fa [1 ]
Dan-Yang, Dong [1 ]
Peng, Bai [2 ]
Chen, Xiao [1 ]
机构
[1] Beijing Univ Posts & Telecommun BUPT, Sch Comp Sci, Beijing Key Lab Intelligent Telecommun Software &, Beijing 100876, CO, Peoples R China
[2] China Acad Aerosp Aerodynam, Beijing 100074, CO, Peoples R China
来源
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE | 2021年 / 125卷
关键词
ACS/ACO; Candidate set strategy; GPU warp specialization; Producer-consumer parallel model; TSP; OPTIMIZATION; CLASSIFICATION; ALGORITHM;
D O I
10.1016/j.future.2021.06.041
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
To improve the performance of parallel algorithms, it is necessary to make full utilization of computing resources and computing power of parallel hardware. However, the utilization efficiency of the computation must also be considered. Ant Colony System (ACS) has natural parallelism, and the procedure "Selecting the next element" is one of its main key computational components for combinatorial optimization problems. If all elements in the candidate set have been visited, a global search in the complete element domain needs to be performed and its computational overhead is enormous. Based on extensive experiments, we show that the results of global searches are also valuable for subsequent iterations of ACS. Therefore, an innovative static-dynamic balanced candidate set strategy, denoted by ID-CS, is proposed. ID-CS saves the result of global searches in previous iterations in order to reuse them in later iterations so that it can decrease the number of global searches. Furthermore, a novel GPU parallel ACS algorithm, ACS_GPU_WSP, is proposed based on the producer-consumer parallel model by GPU Warp Specialization. Each ant divides its computational work into private work and public work. For 11 large-scale typical TSP problems, compared with the state-of-art GPU data-level parallel implementation, it has achieved a large performance improvement. (C) 2021 Elsevier B.V. All rights reserved.
引用
收藏
页码:136 / 150
页数:15
相关论文
共 37 条
[1]  
[Anonymous], 2019, J SUPERCOMPUT
[2]  
[Anonymous], 1991, ANT SYSTEM AUTOCATAL
[3]   Energy-Aware Virtual Machine Consolidation Algorithm Based on Ant Colony System [J].
Aryania, Azra ;
Aghdasi, Hadi S. ;
Khanli, Leyli Mohammad .
JOURNAL OF GRID COMPUTING, 2018, 16 (03) :477-491
[4]   Exploiting Inter-Warp Heterogeneity to Improve GPGPU Performance [J].
Ausavarungnirun, Rachata ;
Ghose, Saugata ;
Kayiran, Onur ;
Loh, Gabriel H. ;
Das, Chita R. ;
Kandemir, Mahmut T. ;
Mutlu, Onur .
2015 INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURE AND COMPILATION (PACT), 2015, :25-38
[5]  
Bauer M, 2014, ACM SIGPLAN NOTICES, V49, P119, DOI [10.1145/2555243.2555258, 10.1145/2692916.2555258]
[6]   A robust ant colony optimization-based algorithm for community mining in large scale oriented social graphs [J].
Ben Romdhane, L. ;
Chaabani, Y. ;
Zardi, H. .
EXPERT SYSTEMS WITH APPLICATIONS, 2013, 40 (14) :5709-5718
[7]   Parallel multi-objective Ant Programming for classification using GPUs [J].
Cano, Alberto ;
Luis Olmo, Juan ;
Ventura, Sebastian .
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2013, 73 (06) :713-728
[8]   A novel ant colony optimization algorithm for large-distorted fingerprint matching [J].
Cao, Kai ;
Yang, Xin ;
Chen, Xinjian ;
Zang, Yali ;
Liang, Jimin ;
Tian, Jie .
PATTERN RECOGNITION, 2012, 45 (01) :151-161
[9]   Enhancing data parallelism for Ant Colony Optimization on GPUs [J].
Cecilia, Jose M. ;
Garcia, Jose M. ;
Nisbet, Andy ;
Amos, Martyn ;
Ujaldon, Manuel .
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2013, 73 (01) :42-51
[10]  
Cekmez U, 2014, INT CONF UNMAN AIRCR, P347, DOI 10.1109/ICUAS.2014.6842273