GPU-accelerated Hungarian algorithms for the Linear Assignment Problem

被引:44
作者
Date, Ketan [1 ]
Nagi, Rakesh [1 ]
机构
[1] Univ Illinois, Dept Ind & Enterprise Syst Engn, 117 Transportat Bldg, Urbana, IL 61801 USA
基金
美国国家科学基金会;
关键词
Linear assignment problem; Parallel algorithm; Graphics processing unit; CUDA;
D O I
10.1016/j.parco.2016.05.012
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
In this paper, we describe parallel versions of two different variants (classical and alternating tree) of the Hungarian algorithm for solving the Linear Assignment Problem (LAP). We have chosen Compute Unified Device Architecture (CUDA) enabled NVIDIA Graphics Processing Units (GPU) as the parallel programming architecture because of its ability to perform intense computations on arrays and matrices. The main contribution of this paper is an efficient parallelization of the augmenting path search phase of the Hungarian algorithm. Computational experiments on problems with up to 25 million variables reveal that the GPU-accelerated versions are extremely efficient in solving large problems, as compared to their CPU counterparts. Tremendous parallel speedups are achieved for problems with up to 400 million variables, which are solved within 13 seconds on average. We also tested multi-GPU versions of the two variants on up to 16 GPUs, which show decent scaling behavior for problems with up to 1.6 billion variables and dense cost matrix structure. (C) 2016 Elsevier B.V. All rights reserved.
引用
收藏
页码:52 / 72
页数:21
相关论文
共 31 条
[1]  
Ahuja RK, 1993, Network flows
[2]  
[Anonymous], 2009, 7 INT C NUM AN APPL
[3]  
[Anonymous], 2007, GPU gems
[4]  
[Anonymous], 2010, Thrust: A parallel template library
[5]  
[Anonymous], CUDA C PROGR GUID 5
[6]  
[Anonymous], 2001, Combinatorial optimization: networks and matroids
[7]   A PARALLEL SHORTEST AUGMENTING PATH ALGORITHM FOR THE ASSIGNMENT PROBLEM [J].
BALAS, E ;
MILLER, D ;
PEKNY, J ;
TOTH, P .
JOURNAL OF THE ACM, 1991, 38 (04) :985-1004
[8]  
Bazaraa M., 2011, LINEAR PROGRAMMING N
[9]  
Bertsekas D. P., 1993, ORSA Journal on Computing, V5, P261, DOI 10.1287/ijoc.5.3.261
[10]   PARALLEL SYNCHRONOUS AND ASYNCHRONOUS IMPLEMENTATIONS OF THE AUCTION ALGORITHM [J].
BERTSEKAS, DP ;
CASTANON, DA .
PARALLEL COMPUTING, 1991, 17 (6-7) :707-732