A Parallel Genetic Algorithm With Dispersion Correction for HW/SW Partitioning on Multi-Core CPU and Many-Core GPU

被引:22
作者
Hou, Neng [1 ]
He, Fazhi [1 ,2 ]
Zhou, Yi [3 ]
Chen, Yilin [1 ]
Yan, Xiaohu [1 ]
机构
[1] Wuhan Univ, Sch Comp Sci, State Key Lab Software Engn, Wuhan 430072, Hubei, Peoples R China
[2] State Key Lab Digital Mfg Equipment & Technol, Wuhan 430074, Hubei, Peoples R China
[3] Wuhan Univ Sci & Technol, Sch Informat Sci & Engn, Engn Res Ctr Met Automat & Measurement Technol, Wuhan 430081, Hubei, Peoples R China
基金
美国国家科学基金会;
关键词
Hardware/software co-design; heuristic method; genetic algorithm; multi-core CPU; many-core GPU; HARDWARE-SOFTWARE COSYNTHESIS; HARDWARE/SOFTWARE; OPTIMIZATION; SEGMENTATION; TRACKING; DESIGN;
D O I
10.1109/ACCESS.2017.2776295
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
In hardware/software (HW/SW) co-design, hardware/software partitioning is an essential step in that it determines which components to be implemented in hardware and which ones in software. Most of HW/SW partitioning problems are NP hard. For large-size problems, heuristic methods have to be utilized. This paper presents a parallel genetic algorithm with dispersion correction for HW/SW partitioning on CPU-GPU. First, an enhanced genetic algorithm with dispersion correction is presented. The under-constraint individuals are marched to feasible region step by step. In this way, the intensification can be enhanced as well as the constraint problem can be handled. Second, the individuals performing costs computation and dispersion correction are run in parallel. For a given problem size, the overall run-time can be reduced while the diversity of genetic algorithm can be kept. Third, especially when a number of under-constraint individuals should be corrected in an irregular way, the computation process is complicated and the computation overhead is large. Therefore, we present a novel parallel strategy by leveraging the parallel power of a multi-core CPU and that of a many-core GPU. The proposed strategy computes the costs of each individual in parallel on GPU and corrects the under-constraint individuals in parallel on the multi-core CPU. In this way, a highly efficient parallel computing can be achieved in which dozens of irregular correction computing steps are mapped to the multi-core CPU and thousands of regular cost computing steps are mapped to the many-core GPU. Fourth, at each iteration of the hybrid parallel strategy, the solution vectors of individuals are transferred to the GPU and their costs are transferred back to the CPU. In order to further improve the efficiency of proposed algorithm, we propose an asynchronous transfer pattern (stream concurrency pattern) for CPU-GPU, in which the transfer process and computation process are overlapped and eventually the overall run-time can be reduced further. Finally, the experiments show that the solution quality obtained by our method is competitive with existing heuristic methods in reasonable time. Furthermore, by combining with the multi-core CPU and many-core GPU, the running time of the proposed method is efficiently reduced.
引用
收藏
页码:883 / 898
页数:16
相关论文
共 54 条
[1]   An integrated high-level hardware/software partitioning methodology [J].
Abdelhalim, M. B. ;
Habib, S. E-D. .
DESIGN AUTOMATION FOR EMBEDDED SYSTEMS, 2011, 15 (01) :19-50
[2]   Accelerating compute intensive medical imaging segmentation algorithms using hybrid CPU-GPU implementations [J].
Alsmirat, Mohammad A. ;
Jararweh, Yaser ;
Al-Ayyoub, Mahmoud ;
Shehab, Mohammed A. ;
Gupta, Brij B. .
MULTIMEDIA TOOLS AND APPLICATIONS, 2017, 76 (03) :3537-3555
[3]   Algorithmic aspects of hardware/software partitioning [J].
Arató, P ;
Mann, ZA ;
Orbán, A .
ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 2005, 10 (01) :136-156
[4]  
Arató P, 2003, I S INTELL SIG PR, P197
[5]   A local start search algorithm to compute exact Hausdorff Distance for arbitrary point sets [J].
Chen, Yilin ;
He, Fazhi ;
Wu, Yiqi ;
Hou, Neng .
PATTERN RECOGNITION, 2017, 67 :139-148
[6]  
[陈志 Chen Zhi], 2013, [计算机学报, Chinese Journal of Computers], V36, P2033
[7]   Meta-operation conflict resolution for human-human interaction in collaborative feature-based CAD systems [J].
Cheng, Yuan ;
He, Fazhi ;
Wu, Yiqi ;
Zhang, Dejun .
CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2016, 19 (01) :237-253
[8]   HARDWARE-SOFTWARE COSYNTHESIS FOR MICROCONTROLLERS [J].
ERNST, R ;
HENKEL, J ;
BENNER, T .
IEEE DESIGN & TEST OF COMPUTERS, 1993, 10 (04) :64-75
[9]  
Farahani AF, 2006, PAR ELEC 2006: INTERNATIONAL SYMPOSIUM ON PARALLEL COMPUTING IN ELECTRICAL ENGINEERING, PROCEEDINGS, P337
[10]  
Ferrandi F, 2013, 2013 NASA/ESA CONFERENCE ON ADAPTIVE HARDWARE AND SYSTEMS (AHS), P47, DOI 10.1109/AHS.2013.6604225