Design of a Hybrid Multicore Platform for High Performance Reconfigurable Computing

Cited by: 0
Authors
Hussain, Waqar [1]
Hoffmann, Henry [2]
Ahonen, Tapani [1]
Nurmi, Jari [1]
Affiliations
[1] Tampere Univ Technol, Dept Elect & Commun Engn, FI-33101 Tampere, Finland
[2] Univ Chicago, Dept Comp Sci, Chicago, IL 60637 USA
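Source
2015 NORDIC CIRCUITS AND SYSTEMS CONFERENCE (NORCAS) - NORCHIP & INTERNATIONAL SYMPOSIUM ON SYSTEM-ON-CHIP (SOC) | 2015
Keywords
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract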
This paper presents a loosely coupled, hybrid architecture of homogeneous and heterogeneous cores integrated over a Network-on-Chip (NoC). The architecture uses the NoC bandwidth efficiently by balancing the number of instantiated computational and communication nodes. It also provides a mix of homogeneous general-purpose processing and heterogeneous reconfigurable computing. Prior approaches have mostly treated homogeneous and heterogeneous platforms as two different design paradigms, even though each shows domain-specific performance advantages over the other. In this context, the proposed architecture is designed for nine NoC nodes, arranged in a topology of three rows and three columns. The middle row contains three homogeneous Reduced Instruction Set Computer (RISC) cores, and the rest of the nodes are integrated with Coarse-Grain Reconfigurable Arrays (CGRAs) of application-specific sizes. The overall architecture is template-based and can be tailored to an application's performance requirements. The NoC allows loose coupling, so all the cores can exchange data with one another and also execute independently and simultaneously. Alternatively, the user can program the middle layer of RISC cores to enforce specific data/control dependencies among all the cores. The system mitigates power dissipation because the CGRAs are custom-tailored for heterogeneous computing. The platform is evaluated with a proof-of-concept test comprising massively parallel signal processing algorithms. Synthesis results from a Field Programmable Gate Array device are used to compare and evaluate the platform against some existing state-of-the-art multicore platforms in terms of multiple performance metrics.
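The node layout described in the abstract can be pictured with a short sketch. This is only an illustration of the three-by-three arrangement with RISC cores in the middle row and CGRAs elsewhere; the function name `build_topology`, the labels "RISC" and "CGRA", and the zero-based coordinates are assumptions made here and do not come from the paper. CGRA sizes and NoC link structure are not modeled, since the abstract does not specify them.

```python
from collections import Counter

ROWS, COLS = 3, 3  # nine NoC nodes, three rows by three columns (from the abstract)
RISC_ROW = 1       # middle row, zero-based index (an illustrative convention)


def build_topology(rows=ROWS, cols=COLS, risc_row=RISC_ROW):
    """Return a dict mapping (row, col) to a node-type label.

    Hypothetical helper: nodes in the middle row are labeled "RISC",
    all remaining nodes are labeled "CGRA".
    """
    return {
        (r, c): "RISC" if r == risc_row else "CGRA"
        for r in range(rows)
        for c in range(cols)
    }


if __name__ == "__main__":
    topology = build_topology()
    # Print the grid row by row.
    for r in range(ROWS):
        print(" ".join(topology[(r, c)] for c in range(COLS)))
    # Count how many nodes of each type were instantiated.
    print(Counter(topology.values()))
```

Under these assumptions the grid holds three RISC nodes and six CGRA nodes.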
Pages: 8
Related papers
22 in total
[1] Ahonen T., 2006, TUT PUBLICATION, V625
[2] Airoldi R., 2010, Proceedings 2010 International Symposium on System-on-Chip - SOC, P26, DOI 10.1109/ISSOC.2010.5625562
[3] Airoldi R., 2010, VERY LARGE SCALE INT, P26, DOI 10.1109/ISSOC.2010.5625562
[4] Baumgarte, V; Ehlers, G; May, F; Nückel, A; Vorbach, M; Weinhardt, M. PACT XPP - A self-reconfigurable data processing architecture [J]. JOURNAL OF SUPERCOMPUTING, 2003, 26 (02): 167-184
[5] Bonnot P., P DES AUT TEST EUR D, P610
[6] Campi F., P DES AUT TEST EUR D, P9
[7] Early J., 1960, 1960 IEEE INT SOLID, V3, P78
[8] Fick, David; Dreslinski, Ronald G.; Giridhar, Bharan; Kim, Gyouho; Seo, Sangwon; Fojtik, Matthew; Satpathy, Sudhir; Lee, Yoonmyung; Kim, Daeyeon; Liu, Nurrachman; Wieckowski, Michael; Chen, Gregory; Mudge, Trevor; Blaauw, David; Sylvester, Dennis. Centip3De: A Cluster-Based NTC Architecture With 64 ARM Cortex-M3 Cores in 3D Stacked 130 nm CMOS [J]. IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2013, 48 (01): 104-117
[9] Garzia F., 2009, P 19 INT C FIELD PRO
[10] Hussain W., 2012, P SOC 2012 TAMP FINL