An efficient protocol with synchronization accelerator for multi-processor embedded systems

Cited by: 0
Authors
Yu, Jiyang [1]
Liu, Peng [1]
Wang, Weidong [1]
Huang, Chunming [1,3]
Yang, Jie [1,4]
Jiang, Yingtao [2]
Yao, Qingdong [1]
Affiliations
[1] Zhejiang Univ, Dept Informat Sci & Elect Engn, Hangzhou 310003, Zhejiang, Peoples R China
[2] Univ Nevada, Dept Elect & Comp Engn, Las Vegas, NV 89154 USA
[3] Baidu Co Ltd, Dept Mobile Applicat, Shanghai, Peoples R China
[4] NetEase Inc, Pangu Creator Studio, Hangzhou, Zhejiang, Peoples R China
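Funding
National High Technology Research and Development Program of China (863 Program); National Natural Science Foundation of China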
Keywords
Real-time operating system; Parallel programming; Interface protocol; Synchronization; Multicore; SUPPORT;
DOI
10.1016/j.parco.2013.04.008
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
With the proliferation of multi-processor core systems, parallel programming poses a difficult challenge, and current solutions are far from efficient. To alleviate the difficulty of parallel programming, we propose a scheduler, which is part of a master-slave RTOS, to efficiently manage the parallel programs running on a multi-processor core system. We also propose an efficient protocol that serves as the interface between the operating system and application programs. This interface protocol runs on a dedicated control subnet to cut down the synchronization overhead between parallel tasks. Such synchronization overhead in multi-core parallel systems has been recognized as one of the severe limiting factors in pushing up the performance envelope. Experimental results, obtained from register-transfer level simulations of various benchmark parallel programs, show that the proposed protocol and the control subnet can improve system efficiency by up to 33.5%. This protocol, which is designed to be compatible with the minimum subset of the message-passing interface (MPI) functions, scales well with the number of cores. (C) 2013 Elsevier B.V. All rights reserved.
Pages: 461-474
Page count: 14