Ring-mesh: a scalable and high-performance approach for manycore accelerators

Cited by: 3
Authors
Mazumdar, Somnath [1]
Scionti, Alberto [2]
Affiliations
[1] Univ Siena, Dept Informat Engn & Math, Siena, Italy
[2] LINKS Fdn, Turin, Italy
Keywords
Interconnect; Network-on-chip; Manycores; Performance; Energy; Latency; Throughput; NETWORK; TOPOLOGY; INTERCONNECT; GENERATION; DESIGN; TOOL; NOC;
DOI
10.1007/s11227-019-03072-5
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
There is an increasing number of works addressing the design challenges of fast, scalable solutions for the growing number of new types of applications. Recently, many of these solutions have aimed at improving processing element capabilities to speed up the execution of machine learning applications. However, only a few works have focused on the interconnection subsystem as a potential source of performance improvement. Wrapping many cores together offers excellent parallelism, but it brings other challenges (e.g., adequate interconnections). Scalable, power-aware interconnects are required to support such a growing number of processing elements, as well as modern applications. In this paper, we propose a scalable and energy-efficient network-on-chip architecture that fuses the advantages of rings and the 2D mesh, without using any bridge router, to provide high performance. A dynamic adaptation mechanism enables better adaptation to the application requirements. Simulation results show efficient power consumption (up to 141.3% saving for connecting 1024 cores) and 2x (on average) throughput growth with better scalability (up to 1024 processing elements) compared to the popular 2D mesh when tested under multiple statistical traffic pattern scenarios.
Pages: 6720-6752
Page count: 33