An efficient protocol with synchronization accelerator for multi-processor embedded systems

Cited by: 0

Authors:

Yu, Jiyang [1]

Liu, Peng [1]

Wang, Weidong [1]

Huang, Chunming [1,3]

Yang, Jie [1,4]

Jiang, Yingtao [2]

Yao, Qingdong [1]

Affiliations:

[1] Zhejiang Univ, Dept Informat Sci & Elect Engn, Hangzhou 310003, Zhejiang, Peoples R China

[2] Univ Nevada, Dept Elect & Comp Engn, Las Vegas, NV 89154 USA

[3] Baidu Co Ltd, Dept Mobile Applicat, Shanghai, Peoples R China

[4] NetEase Inc, Pangu Creator Studio, Hangzhou, Zhejiang, Peoples R China

Source:

Parallel Computing | 2013, Volume 39, Issue 9

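Funding:

National High Technology Research and Development Program of China (863 Program); National Natural Science Foundation of China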

Keywords:

Real-time operating system; Parallel programming; Interface protocol; Synchronization; Multicore; SUPPORT;

DOI:

10.1016/j.parco.2013.04.008

Chinese Library Classification:

TP301 [Theory and methods]

Discipline classification code:

081202

Abstract:

With the proliferation of multi-processor core systems, parallel programming poses a difficult challenge, and current solutions are far from efficient. To alleviate the difficulty of parallel programming, we propose a scheduler, which is part of a master-slave RTOS, to efficiently manage the parallel programs running on a multi-processor core system. We also propose an efficient protocol that serves as the interface between the operating system and application programs. This interface protocol runs on a dedicated control subnet to cut down the synchronization overhead between the parallel tasks. The synchronization overhead incurred in these multi-core parallel systems has been recognized as one of the severe limiting factors when pushing up the performance envelope. Experimental results, obtained from register-transfer level simulations of various benchmark parallel programs, show that the proposed protocol and the control subnet can improve the system efficiency by up to 33.5%. This protocol, which is designed to be compatible with the minimum subset of the message-passing interface (MPI) functions, scales well with the number of cores. © 2013 Elsevier B.V. All rights reserved.
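The master-slave arrangement described in the abstract can be illustrated with a small software analogy. The sketch below is not the paper's protocol, which runs on a dedicated hardware control subnet; it only shows the general pattern of a master dispatching work to slaves through blocking send/receive-style messages. The function name `run_master_slave`, the per-slave queues, the round-robin dispatch, and the squaring workload are all assumptions made for illustration.

```python
import queue
import threading


def run_master_slave(tasks, num_slaves=4):
    """Hypothetical sketch: a master dispatches tasks to slave threads
    over message queues and collects the results in task order."""
    # One inbox per slave plays the role of a point-to-point "send" channel.
    to_slave = [queue.Queue() for _ in range(num_slaves)]
    # A shared queue carries replies back to the master.
    to_master = queue.Queue()

    def slave(rank):
        while True:
            msg = to_slave[rank].get()      # blocking receive
            if msg is None:                 # sentinel: no more work
                break
            index, value = msg
            # Stand-in workload (an assumption): square the input.
            to_master.put((index, value * value))

    workers = [threading.Thread(target=slave, args=(r,))
               for r in range(num_slaves)]
    for w in workers:
        w.start()

    # The master assigns tasks round-robin, then tells each slave to stop.
    for i, v in enumerate(tasks):
        to_slave[i % num_slaves].put((i, v))
    for q in to_slave:
        q.put(None)

    # Collect exactly one reply per task; replies may arrive in any order.
    results = [None] * len(tasks)
    for _ in tasks:
        index, out = to_master.get()
        results[index] = out

    for w in workers:
        w.join()
    return results
```

Every task in this sketch costs two blocking queue operations, a software stand-in for the kind of synchronization overhead that the abstract says the dedicated control subnet is meant to reduce.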

Pages: 461-474

Page count: 14
