In-kernel integration of operating system and infiniband functions for high performance computing clusters: A DSM example

Cited by: 4
Authors
Liss, L [1]
Birk, Y
Schuster, A
Affiliations
[1] Technion Israel Inst Technol, Dept Elect Engn, IL-32000 Haifa, Israel
[2] Technion Israel Inst Technol, Dept Comp Sci, IL-32000 Haifa, Israel
Keywords
hardware/software interfaces; high-speed networks; distributed shared memory; parallel computing
DOI
10.1109/TPDS.2005.111
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The InfiniBand (IB) System Area Network (SAN) enables applications to access hardware directly from user level, reducing the overhead of user-kernel crossings during data transfer. However, distributed applications that exhibit close coupling between network and OS services may benefit from accessing IB from the kernel through IB's native Verbs interface, which permits tight integration of these services. We assess this approach using a sequential-consistency Distributed Shared Memory (DSM) system as an example. We first develop primitives that abstract the low-level communication and kernel details and efficiently serve the application's communication, memory, and scheduling needs. Next, we combine the primitives to form a kernel DSM protocol. The approach is evaluated using our full-fledged Linux kernel DSM implementation over InfiniBand. We show that overheads are reduced substantially and that overall application performance improves, in both absolute execution time and scalability, relative to an entirely user-level implementation.
Pages: 830-840
Page count: 11