Performance Evaluation of the Impact of NUMA on One-sided RDMA Interactions

Cited by: 4

Authors:

Nelson, Jacob [1]

Palmieri, Roberto [1]

Affiliations:

[1] Lehigh Univ, CSE, Bethlehem, PA 18015 USA

Source:

2020 INTERNATIONAL SYMPOSIUM ON RELIABLE DISTRIBUTED SYSTEMS (SRDS 2020) | 2020

Funding:

National Science Foundation (USA);

Keywords:

RDMA; NUMA; Performance; Locality;

DOI:

10.1109/SRDS51746.2020.00036

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Remote direct memory access (RDMA) and non-uniform memory access (NUMA) are critical technologies of modern high-performance computing platforms. RDMA allows nodes to directly access memory on remote machines. Multiprocessor architectures implement NUMA to scale up memory access performance. When paired together, these technologies exhibit performance penalties under certain configurations. This paper is the first study to explore these configurations to provide quantitative findings on the impact of NUMA for RDMA-based systems. One of the consequences of ultra-fast networks is that known implications of NUMA locality now constitute a higher relative impact on the performance of RDMA-enabled distributed systems. Our study quantifies its role and uncovers unexpected behavior. In summary, poor NUMA locality of remotely accessible memory can lead to an automatic 20% performance degradation. Additionally, local workloads operating on remotely accessible memory can lead to a 300% performance gap depending on memory locality. Surprisingly, configurations demonstrating this result contradict the presumed impact of NUMA locality. Our findings are validated using two generations of RDMA cards, a synthetic benchmark, and the popular application Memcached ported for RDMA.


Pages: 288-298

Page count: 11
