Efficient Characterization of Hidden Processor Memory Hierarchies

Cited by: 1
Authors
Cooper, Keith [1]
Xu, Xiaoran [1]
Affiliations
[1] Rice Univ, Houston, TX 77005 USA
Source
COMPUTATIONAL SCIENCE - ICCS 2018, PT III | 2018 / Vol. 10862
Keywords
Efficient characterization; Hidden memory hierarchies; Code performance; Portable tool; CACHE;
DOI
10.1007/978-3-319-93713-7_27
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
A processor's memory hierarchy has a major impact on the performance of running code. However, computation increasingly runs on platforms, such as clouds and clusters used for large-scale computation, where the actual hardware characteristics are hidden from both the end user and the tools that mediate execution, such as the compiler, the JIT, and the runtime system. Even worse, in such environments a single computation may use a collection of processors with dissimilar characteristics. Ignorance of the performance-critical parameters of the underlying system makes it difficult to improve performance by optimizing the code or adjusting runtime-system behavior; it also makes application performance harder to understand. To address this problem, we have developed a suite of portable tools that efficiently derive many parameters of processor memory hierarchies, such as the number of levels and the effective capacity and latency of caches and TLBs, in a matter of seconds. The tools use a series of carefully considered experiments to produce and analyze cache response curves automatically. They are inexpensive enough to be used in a variety of contexts, including install-time, compile-time, or runtime adaptation, and performance-understanding tools.
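The abstract does not describe the experiments themselves. As a generic illustration only, and not the authors' method, a cache response curve is commonly obtained by timing a pointer chase over working sets of increasing size and looking for sharp rises in per-access latency. The sketch below assumes that approach; the names `build_chase`, `time_chase`, and `find_jumps`, and the `ratio` threshold, are hypothetical. In Python, interpreter overhead largely masks real cache effects, so a practical tool would be written in a language such as C; the structure of the experiment is what the sketch shows.

```python
import random
import time


def build_chase(n_slots, seed=0):
    """Build a pointer-chase table: nxt[i] is the slot visited after slot i.

    The slots are linked into one random cycle so that a walk touches every
    slot before repeating, which defeats simple stride prefetching.
    """
    order = list(range(n_slots))
    random.Random(seed).shuffle(order)
    nxt = [0] * n_slots
    for a, b in zip(order, order[1:] + order[:1]):
        nxt[a] = b
    return nxt


def time_chase(nxt, steps):
    """Return the average seconds per access over `steps` dependent loads."""
    i = 0
    start = time.perf_counter()
    for _ in range(steps):
        i = nxt[i]
    return (time.perf_counter() - start) / steps


def find_jumps(sizes, latencies, ratio=1.5):
    """Return the sizes just before each sharp latency rise.

    A rise counts as sharp when latency grows by at least `ratio` (an assumed
    threshold) relative to the previous point. Each returned size is a rough
    estimate of the effective capacity of one level.
    """
    return [
        sizes[k - 1]
        for k in range(1, len(sizes))
        if latencies[k] >= ratio * latencies[k - 1]
    ]


if __name__ == "__main__":
    sizes = [2 ** p for p in range(10, 21)]  # working-set sizes in slots
    curve = [time_chase(build_chase(n), 200_000) for n in sizes]
    print(find_jumps(sizes, curve))
```

Applied to a measured curve, `find_jumps` gives one candidate capacity per detected level; on noisy data the measurements would need repetition and smoothing before the threshold test is meaningful.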
Pages: 335 - 349
Page count: 15