Low-Overhead Reuse Distance Profiling Tool for Multicore

Authors
Sasongko, Muhammad Aditya [1]
Chabbi, Milind [2]
Unat, Didem [1]
Affiliations
[1] Koc Univ, Istanbul, Turkey
[2] Scalable Machines Res, San Jose, CA USA
Source
EURO-PAR 2021: PARALLEL PROCESSING WORKSHOPS | 2022, Vol. 13098
Keywords
Reuse distance; Hardware performance counters; Debug registers; Address sampling
DOI
10.1007/978-3-031-06156-1_49
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
As core counts in multicore systems increase, data movement has become one of the main sources of performance slowdown in parallel applications, and data locality has become a critical factor in application optimization. One important locality metric is reuse distance, which indicates how likely a memory access is to be a cache hit. In this work, we propose a low-overhead reuse distance profiling tool for multi-threaded applications. Our method relies on hardware features available in commodity CPUs, namely Performance Monitoring Units (PMUs) and debug registers, to detect data reuse in private and shared caches while accounting for inter-thread cache line invalidations. Unlike prior approaches, our tool is fast and accurate, does not change program behavior, and can also handle shared cache accesses. It achieves 92% accuracy with low runtime (2.9x) and memory (2.8x) overheads.
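To make the metric concrete, the sketch below computes reuse distance in its commonly used textbook form: the number of distinct cache lines touched between two consecutive accesses to the same line. This is only an illustration of the metric on a full address trace. It is not the sampling method based on PMUs and debug registers that the abstract describes, and the function name `reuse_distances` and the 64-byte `line_size` default are assumptions made for the example.

```python
from collections import OrderedDict


def reuse_distances(trace, line_size=64):
    """Return one reuse distance per access; None marks a first-time (cold) access.

    Hypothetical helper for illustration: exact, full-trace computation,
    not the low-overhead sampling approach of the profiled tool.
    """
    stack = OrderedDict()  # cache lines ordered from least to most recently used
    distances = []
    for address in trace:
        line = address // line_size  # map a byte address to its cache line
        if line in stack:
            # Count the distinct lines touched since the previous access to this line.
            position = list(stack).index(line)
            distances.append(len(stack) - 1 - position)
            stack.move_to_end(line)
        else:
            distances.append(None)
            stack[line] = True
    return distances


if __name__ == "__main__":
    print(reuse_distances([0, 64, 128, 0, 64, 64]))
```

Under this definition, an access with reuse distance d hits in a fully associative LRU cache that holds more than d lines, which is why the metric relates to the likelihood of a cache hit.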
Pages: 555-559
Page count: 5