An efficient GPU-based parallel tabu search algorithm for hardware/software co-design

被引:69
作者
Hou, Neng [1 ,2 ]
He, Fazhi [1 ]
Zhou, Yi [3 ]
Chen, Yilin [1 ]
机构
[1] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China
[2] Yangtze Univ, Sch Comp Sci, Jingzhou 434023, Peoples R China
[3] Wuhan Univ Sci & Technol, Sch Informat Sci & Engn, Wuhan 430081, Peoples R China
基金
中国国家自然科学基金;
关键词
hardware; software co-design; software partitioning; graphics processing unit; GPU-based parallel tabu search; single kernel implementation; kernel fusion strategy; optimized transfer strategy; HARDWARE-SOFTWARE COSYNTHESIS; GENETIC ALGORITHM; SELECTIVE UNDO; OPTIMIZATION;
D O I
10.1007/s11704-019-8184-3
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Hardware/software partitioning is an essential step in hardware/software co-design. For large size problems, it is difficult to consider both solution quality and time. This paper presents an efficient GPU-based parallel tabu search algorithm (GPTS) for HW/SW partitioning. A single GPU kernel of compacting neighborhood is proposed to reduce the amount of GPU global memory accesses theoretically. A kernel fusion strategy is further proposed to reduce the amount of GPU global memory accesses of GPTS. To further minimize the transfer overhead of GPTS between CPU and GPU, an optimized transfer strategy for GPU-based tabu evaluation is proposed, which considers that all the candidates do not satisfy the given constraint. Experiments show that GPTS outperforms state-of-the-art work of tabu search and is competitive with other methods for HW/SW partitioning. The proposed parallelization is significant when considering the ordinary GPU platform.
引用
收藏
页数:18
相关论文
共 63 条
  • [11] HARDWARE-SOFTWARE COSYNTHESIS FOR MICROCONTROLLERS
    ERNST, R
    HENKEL, J
    BENNER, T
    [J]. IEEE DESIGN & TEST OF COMPUTERS, 1993, 10 (04): : 64 - 75
  • [12] Narrative Collage of Image Collections by Scene Graph Recombination
    Fang, Fei
    Yi, Miao
    Feng, Hui
    Hu, Shenghong
    Xiao, Chunxia
    [J]. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2018, 24 (09) : 2559 - 2572
  • [13] Farahani AF, 2006, PAR ELEC 2006: INTERNATIONAL SYMPOSIUM ON PARALLEL COMPUTING IN ELECTRICAL ENGINEERING, PROCEEDINGS, P337
  • [14] Ferrandi F, 2013, 2013 NASA/ESA CONFERENCE ON ADAPTIVE HARDWARE AND SYSTEMS (AHS), P47, DOI 10.1109/AHS.2013.6604225
  • [15] Garg K, 2015, 2015 28TH IEEE INTERNATIONAL SYSTEM-ON-CHIP CONFERENCE (SOCC), P64, DOI 10.1109/SOCC.2015.7406912
  • [16] PGMA: An algorithmic approach for multi-objective hardware software partitioning
    Govil, Naman
    Shrestha, Rahul
    Chowdhury, Shubhajit Roy
    [J]. MICROPROCESSORS AND MICROSYSTEMS, 2017, 54 : 83 - 96
  • [17] Gupta K., 2012, 2012 INNOVATIVE PARA, P1, DOI DOI 10.1109/INPAR.2012.6339596
  • [18] HARDWARE-SOFTWARE COSYNTHESIS FOR DIGITAL-SYSTEMS
    GUPTA, RK
    DEMICHELI, G
    [J]. IEEE DESIGN & TEST OF COMPUTERS, 1993, 10 (03): : 29 - 41
  • [19] MiBench: A free, commercially representative embedded benchmark suite
    Guthaus, MR
    Ringenberg, JS
    Ernst, D
    Austin, TM
    Mudge, T
    Brown, RB
    [J]. WWC-4: IEEE INTERNATIONAL WORKSHOP ON WORKLOAD CHARACTERIZATION, 2001, : 3 - 14
  • [20] An approach to automated hardware/software partitioning using a flexible granularity that is driven by high-level estimation techniques
    Henkel, J
    Ernst, R
    [J]. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2001, 9 (02) : 273 - 289