GPU-based branch-and-bound method to solve large 0-1 knapsack problems with data-centric strategies

被引：15

作者：

Shen, Jingcheng ^{[1
]}

Shigeoka, Kentaro ^{[2
]}

Ino, Fumihiko ^{[1
]}

Hagihara, Kenichi ^{[1
]}

机构：

[1] Osaka Univ, Grad Sch Informat Sci & Technol, 1-5 Yamadaoka, Suita, Osaka 5650871, Japan

[2] Hitachi High Technol Corp, Tokyo 1058717, Japan

来源：

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE | 2019年 / 31卷 / 04期

基金：

日本学术振兴会;

关键词：

branch and bound; data-centric method; GPU; knapsack; out-of-core computation; ALGORITHMS;

D O I：

10.1002/cpe.4954

中图分类号：

TP31 [计算机软件];

学科分类号：

081202 ; 0835 ;

摘要：

An out-of-core branch-and-bound (B&B) method to solve large 0-1 knapsack problems on a graphics processing unit (GPU) is proposed. Given a large problem that produces many subproblems, the proposed method dynamically swaps subproblems to CPU memory. Because such a CPU-centric subproblem management scheme increases CPU-GPU data transfer, we adopt three data-centric strategies to eliminate this side effect. The first is an out-of-order search (O3S) strategy that reduces the data transfer overhead by adaptively transferring subproblems between the CPU and GPU. The second is an explicitly-managed pipelining strategy that hides the data transfer overhead by overlapping data transfer with GPU-based B&B operations. The third is a GPU-based stream compaction strategy that reduces the sparseness of arrays to be transferred. Experimental results demonstrate that the proposed out-of-core method stored 41 times as many subproblems as a previous in-core method that manages subproblems in GPU memory, solving approximately twice as many problem instances on the GPU. In addition, compared to a previous breadth-first search (BFS) strategy, the proposed O3S strategy achieved an average speedup of 7.5 times.

引用

页数：15

共 34 条

[1]

[Anonymous], CONCURRENCY COMPUTAT

[2]

[Anonymous], 2017, Airpods: The Next Step in Headphones

[3]

[Anonymous], CONCURRENCY COMPUTAT

[4]

[Anonymous], GPU COMPUTING GEMS

[5]

[Anonymous], 1997, DYNAMIC PROGRAMMING

[6]

[Anonymous], P 20 EUR INT C PAR

[7]

[Anonymous], 2018, CUDA Toolkit Documentation

[8]

[Anonymous], 2011, Cuda c/c++ streams and concurrency

[9]

[Anonymous], CONCURRENCY COMPUTAT

[10]

[Anonymous], 2017, NVIDIA TESLA V100 GPU ARCHITECTURE

← 1 2 3 4 →