TPSS: A Flexible Hardware Support for Unicast and Multicast on Networks-on-Chip

Cited by: 1
Authors
Hu, Wenmin [1, 2]
Lu, Zhonghai [2 ]
Liu, Hengzhu [1 ]
Jantsch, Axel [2 ]
Affiliations
[1] Natl Univ Def Technol, Sch Comp, Changsha, Hunan, Peoples R China
[2] KTH Royal Inst Technol, Stockholm, Sweden
Funding
National Science Foundation (USA);
Keywords
Network-on-Chip; System-on-Chip; Multicast;
DOI
10.4304/jcp.7.7.1743-1752
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Multicast is an important traffic mode on multi-core systems, and efficient hardware support for multicast can greatly improve the performance of the whole system. Most multicast solutions use dimension-order routing to generate the multicast trees, which are neither bandwidth-efficient nor power-efficient. This article presents a synthesizable router for network-on-chip (NoC) that supports arbitrarily shaped multicast paths on a mesh topology. In our scheme, incremental setup is adopted to simplify multicast tree construction. For each sub-path setup, we present a novel scheme called two period sub-path setup (TPSS). TPSS is divided into two periods: routing to a predetermined intermediate router, and updating lookup tables from the intermediate router to the destination. This setup makes it feasible to support arbitrarily shaped paths. In our case study, the Optimized tree algorithm (OPT) and the Left-XY-Right-Optimized tree algorithm (LXYROPT) are proposed for power-efficient path searching, but they need to be pre-configured because of their high computation cost. Moreover, Virtual Circuit Tree Multicasting (VCTM) is also supported in our scheme for dynamic construction of multicast paths, which needs no computation in path searching. The performance is evaluated using a cycle-accurate simulator developed in SystemC, and the hardware overhead is estimated using a synthesizable HDL model. Compared to VCTM (without FIFO, multicast table and network adapter), the area overhead of implementing our router is negligible (less than 0.5%).
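The two periods described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the router design from the paper: the names `xy_route`, `port_towards`, and `tpss_setup`, the port labels, the use of XY routing for the first period, and the dictionary-based lookup tables are all hypothetical choices made for illustration.

```python
from collections import defaultdict

# Hypothetical port labels for a 2D mesh; (x, y) grows east and north.
PORTS = {(1, 0): "E", (-1, 0): "W", (0, 1): "N", (0, -1): "S"}


def xy_route(src, dst):
    """Dimension-order (XY) hops from src to dst, excluding src itself."""
    x, y = src
    hops = []
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        hops.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        hops.append((x, y))
    return hops


def port_towards(here, nxt):
    """Output port that leads from router `here` to its neighbour `nxt`."""
    return PORTS[(nxt[0] - here[0], nxt[1] - here[1])]


def tpss_setup(tables, mcast_id, source, intermediate, sub_path):
    """Set up one sub-path of a multicast tree in two periods (assumed model).

    sub_path lists the routers from the intermediate router to the
    destination, inclusive, and may have any shape on the mesh.
    """
    if not sub_path or sub_path[0] != intermediate:
        raise ValueError("sub_path must start at the intermediate router")
    # Period 1: travel to the intermediate router; no table is modified.
    # XY routing is an assumption of this sketch.
    period1_hops = xy_route(source, intermediate)
    # Period 2: update lookup tables hop by hop along the explicit sub-path.
    for here, nxt in zip(sub_path, sub_path[1:]):
        tables[here][mcast_id].add(port_towards(here, nxt))
    tables[sub_path[-1]][mcast_id].add("LOCAL")  # deliver at the destination
    return period1_hops


# Usage: two incremental sub-path setups for the same multicast group.
tables = defaultdict(lambda: defaultdict(set))
first = tpss_setup(tables, 7, (0, 0), (2, 1), [(2, 1), (2, 2), (3, 2)])
tpss_setup(tables, 7, (0, 0), (2, 2), [(2, 2), (2, 3)])
```

After the second call, router (2, 2) holds two output ports for the same multicast group, which is how a branch of the tree is added incrementally in this model.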
Pages: 1743-1752
Number of pages: 10