RTHMS: A Tool for Data Placement on Hybrid Memory System

Cited by: 39
Authors
Peng, Ivy Bo [1]
Gioiosa, Roberto [2]
Kestor, Gokcen [2]
Cicotti, Pietro [3]
Laure, Erwin [1]
Markidis, Stefano [1]
Affiliations
[1] KTH Royal Inst Technol, Dept Computat Sci & Technol, Stockholm, Sweden
[2] Pacific Northwest Natl Lab, Computat Sci & Math Div, Richland, WA USA
[3] San Diego Supercomp Ctr, Adv Technol Lab, San Diego, CA USA
Keywords
heterogeneous memory systems; data placement; performance metrics
DOI
10.1145/3156685.3092273
Chinese Library Classification
TP31 [Computer Software]
Subject Classification Codes
081202; 0835
Abstract
Traditional scientific and emerging data analytics applications require fast, power-efficient, large, and persistent memories. Combining all these characteristics within a single memory technology is expensive, and hence future supercomputers will feature different memory technologies side by side. However, it is a complex task to program hybrid-memory systems and to identify the best object-to-memory mapping. We envision that programmers will probably resort to using default configurations that require only minimal interventions on the application code or system settings. In this work, we argue that intelligent, fine-grained data placement can achieve higher performance than default setups. We present an algorithm for data placement on hybrid-memory systems. Our algorithm is based on a set of single-object allocation rules and global data placement decisions. We also present RTHMS, a tool that implements our algorithm and provides recommendations about the object-to-memory mapping. Our experiments on a hybrid-memory system, an Intel Knights Landing processor with DRAM and HBM, show that RTHMS is able to achieve higher performance than the default configuration. We believe that RTHMS will be a valuable tool for programmers working on complex hybrid-memory systems.
Pages: 82-91
Page count: 10