An efficient shortest path algorithm for content-based routing on 2-D mesh accelerator networks

被引:8
作者
Liu, Jiayu [1 ]
Gu, Huaxi [1 ]
Wei, Wenting [1 ]
Chen, Ziqi [1 ]
Chen, Yawen [2 ]
机构
[1] Xidian Univ, State Key Lab Integrated Serv Networks, Xian, Peoples R China
[2] Univ Otago, Dept Comp Sci, Dunedin, New Zealand
来源
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE | 2021年 / 114卷
基金
国家重点研发计划; 中国国家自然科学基金;
关键词
Accelerator networks; Content-based routing; Shortest path; Branch-and-bound; Adaptive routing; DESIGN;
D O I
10.1016/j.future.2020.07.044
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Reconfigurable units such as Field Programmable Gate Arrays (FPGAs) have been used widely as high-performance hardware accelerators for a variety of applications and offer a promising solution to the power bottleneck of current multi-core processors. One emerging trend in hardware acceleration is to build a specialized network for accelerators to support inter-accelerator communications with the goal of increasing acceleration throughput by sharing functions and avoiding frequent reconfiguration. On accelerator networks, data packets are routed in a content-based instead of an address-based manner in that the destinations are determined by the acceleration task-any nodes that support the current acceleration step could be a receiver. Designing an effective and efficient routing algorithm that supports content-based routing becomes an important problem. Adopting shortest path routing for each acceleration task is a standard approach. While Breadth-First Search (BFS) algorithm can be applied, a naive implementation is computationally unaffordable. We propose a new routing algorithm called the Shortest Cycle Routing Algorithm and address the computation inefficiency of standard BFS. We design a branch-and-bound method to effectively prune the search tree without compromising path optimality. The time and space complexities for searching the shortest cycles of an acceleration task of k steps improve from O(n(k)) to O(1) where n is the number of accelerators in the network. We analyze locality and path diversity properties of shortest cycle routing and show how they can be used to develop adaptive routing strategy, restrict global flooding to local neighborhood and reduce the number of path cycles. (C) 2020 Elsevier B.V. All rights reserved.
引用
收藏
页码:519 / 530
页数:12
相关论文
共 30 条
[1]   Networks-on-Chip for FPGAs: Hard, Soft or Mixed? [J].
Abdelfattah, Mohamed S. ;
Betz, Vaughn .
ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, 2014, 7 (03)
[2]  
[Anonymous], 1999, TECH REP
[3]  
Baxter R, 2007, NASA/ESA CONFERENCE ON ADAPTIVE HARDWARE AND SYSTEMS, PROCEEDINGS, P287
[4]   Considerations in using OpenCL on GPUs and FPGAs for throughput-oriented genomics workloads [J].
Cadenelli, Nicola ;
Jaksic, Zoran ;
Polo, Jorda ;
Carrera, David .
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2019, 94 :148-159
[5]  
Carzaniga A, 2004, IEEE INFOCOM SER, P918
[6]  
Carzaniga A, 2003, ACM SIGCOMM COMP COM, V33, P163
[7]  
Caulfield AM, 2016, INT SYMP MICROARCH
[8]  
Chalamalasetti SaiRahul., 2013, Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, P245
[9]   A scalable protocol for content-based routing in overlay networks [J].
Chand, R ;
Felber, PA .
SECOND IEEE INTERNATIONAL SYMPOSIUM ON NETWORK COMPUTING AND APPLICATIONS, PROCEEDINGS, 2003, :123-130
[10]  
Cypher R., 1995, Proceedings. Seventh IEEE Symposium on Parallel and Distributed Processing (Cat. No.95TB8131), P122, DOI 10.1109/SPDP.1995.530674