Architecture Support for Task Out-of-Order Execution in MPSoCs

Cited by: 16

Authors:

Wang, Chao [1]

Li, Xi [1]

Zhang, Junneng [2]

Chen, Peng [1]

Chen, Yunji [3]

Zhou, Xuehai [4]

Cheung, Ray C. C. [5]

Affiliations:

[1] Univ Sci & Technol China, Dept Comp Sci, Hefei 230027, Anhui, Peoples R China

[2] Univ Sci & Technol China, Hefei 230027, Anhui, Peoples R China

[3] Chinese Acad Sci, CARCH, State Key Lab, Beijing 100190, Peoples R China

[4] Univ Sci & Technol China, Suzhou Inst, Suzhou 215123, Peoples R China

[5] City Univ Hong Kong, Dept Elect Engn, Kowloon, Hong Kong, Peoples R China

Source:

IEEE TRANSACTIONS ON COMPUTERS | 2015 / Vol. 64 / Issue 05

Funding:

U.S. National Science Foundation;

Keywords:

Middleware; architecture support; MPSoC; data dependencies; FPGA; out-of-order execution;

DOI:

10.1109/TC.2014.2315628

Chinese Library Classification:

TP3 [Computing technology, computer technology];

Subject classification code:

0812 ;

Abstract:

Multi-processor systems on chip (MPSoCs) have been widely applied in embedded systems in the past decades. However, efficiently designing and implementing a rapid prototype for diverse applications remains a great challenge because of heterogeneous instruction set architectures (ISAs), programming interfaces and software tool chains. To solve this problem, this paper proposes a novel high-level architecture support for automatic out-of-order (OoO) task execution on FPGA-based heterogeneous MPSoCs. The architecture support is composed of a hierarchical middleware with an automatic task-level OoO parallel execution engine. Combined with a hierarchical OoO layer model, the middleware is able to identify the parallel regions and generate the source code automatically. In addition, a runtime middleware, Task-Scoreboarding, analyzes the inter-task data dependencies and automatically schedules and dispatches the tasks using parameter renaming techniques. The middleware has been verified on a prototype built on an FPGA platform. Examples and a JPEG case study demonstrate that the model can substantially ease the burden on programmers and uncover task-level parallelism.
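The abstract's idea of analyzing inter-task data dependencies and scheduling tasks with parameter renaming can be illustrated with a small sketch. This is not the paper's Task-Scoreboarding implementation; the `Task` type, the `rename` and `schedule` helpers, the `#` version suffix and the example task list are all hypothetical names chosen for illustration. The sketch assumes each task declares the parameters it reads and writes, gives every write a fresh version so that only true (read-after-write) dependencies remain, and then groups tasks into waves that could run in parallel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical task descriptor: a name plus the parameters it reads and writes."""
    name: str
    reads: tuple
    writes: tuple


def rename(tasks):
    """Give every write a fresh version so WAW and WAR conflicts disappear."""
    version = {}  # parameter -> latest version number seen in program order
    renamed = []
    for t in tasks:
        # Reads refer to the most recent version written before this task.
        reads = tuple(f"{p}#{version.get(p, 0)}" for p in t.reads)
        writes = []
        for p in t.writes:
            version[p] = version.get(p, 0) + 1
            writes.append(f"{p}#{version[p]}")
        renamed.append(Task(t.name, reads, tuple(writes)))
    return renamed


def schedule(renamed_tasks):
    """Place each task one wave after the latest producer of its inputs."""
    produced_in = {}  # renamed parameter -> wave in which it is produced
    waves = []
    for t in renamed_tasks:
        wave = max((produced_in[p] + 1 for p in t.reads if p in produced_in),
                   default=0)
        for p in t.writes:
            produced_in[p] = wave
        while len(waves) <= wave:
            waves.append([])
        waves[wave].append(t.name)
    return waves


# Illustrative program order (assumed, not taken from the paper):
# T3 rewrites "b", which conflicts with T1 (WAW) and T2 (WAR) before renaming.
tasks = [
    Task("T1", reads=("a",), writes=("b",)),
    Task("T2", reads=("b",), writes=("c",)),
    Task("T3", reads=("d",), writes=("b",)),
    Task("T4", reads=("b",), writes=("e",)),
]
waves = schedule(rename(tasks))
print(waves)
```

In this example, renaming lets T3 run alongside T1 instead of waiting for T2 to finish reading the old value of `b`, which is the kind of task-level parallelism the abstract refers to.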


Pages: 1296-1310

Page count: 15
