Cooperative communication based barrier synchronization in on-chip mesh architectures

Cited by: 3
Authors
Chen, Xiaowen [1, 2]
Lu, Zhonghai [2 ]
Jantsch, Axel [2 ]
Chen, Shuming [1 ]
Liu, Hai [1 ]
Affiliations
[1] Natl Univ Def Technol, Changsha 410073, Hunan, Peoples R China
[2] KTH Royal Inst Technol, S-16440 Stockholm, Sweden
Funding
National Natural Science Foundation of China
Keywords
cooperative communication; barrier synchronization
DOI
10.1587/elex.8.1856
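Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract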
We propose cooperative communication as a means to enable efficient and scalable barrier synchronization on mesh-based many-core architectures. Our approach is different from but orthogonal to conventional algorithm-based optimizations. It relies on collaborating routers to provide efficient gather and multicast communication. In conjunction with a master-slave algorithm, it exploits the mesh regularity to achieve efficiency. The gather and multicast functions have been implemented in our router. Synthesis results suggest marginal area overhead. With synthetic and benchmark experiments, we show that our approach significantly reduces synchronization completion time and increases speedup.
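The benefit of router-level gather and multicast can be illustrated with a rough count of link traversals. The sketch below is not the authors' router implementation. It assumes dimension-ordered (XY) routing toward a single master node, assumes that arrival packets sharing a link are merged into one packet (gather), and assumes that the release phase mirrors the arrival phase (multicast). The function names `xy_next_hop`, `route_links`, and `barrier_link_traversals` are hypothetical.

```python
def xy_next_hop(node, master):
    """Next hop under XY routing (assumed): correct X first, then Y."""
    x, y = node
    mx, my = master
    if x != mx:
        return (x + (1 if mx > x else -1), y)
    return (x, y + (1 if my > y else -1))


def route_links(node, master):
    """List the directed links a packet crosses from node to master."""
    links = []
    while node != master:
        nxt = xy_next_hop(node, master)
        links.append((node, nxt))
        node = nxt
    return links


def barrier_link_traversals(width, height, master=(0, 0)):
    """Count link traversals for one master-slave barrier episode.

    "unicast": every slave sends its own arrival packet and receives
               its own release packet.
    "merged":  packets sharing a link are combined, so each link of the
               routing tree is crossed once per phase (assumed model).
    """
    slaves = [(x, y) for x in range(width) for y in range(height)
              if (x, y) != master]
    unicast_hops = 0
    tree_links = set()
    for slave in slaves:
        links = route_links(slave, master)
        unicast_hops += len(links)
        tree_links.update(links)
    # Factor 2: arrival phase plus release phase, assumed symmetric.
    return {"unicast": 2 * unicast_hops, "merged": 2 * len(tree_links)}


if __name__ == "__main__":
    print(barrier_link_traversals(4, 4))
```

Under these assumptions the merged count grows with the number of nodes, while the unicast count grows with the sum of hop distances to the master. The model counts link traversals only; it says nothing about latency, contention, or the area overhead reported in the abstract.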
Pages: 1856-1862
Page count: 7