Low-Overhead Reuse Distance Profiling Tool for Multicore

Cited by: 0
Authors
Sasongko, Muhammad Aditya [1]
Chabbi, Milind [2]
Unat, Didem [1]
Affiliations
[1] Koc Univ, Istanbul, Turkey
[2] Scalable Machines Res, San Jose, CA USA
Source
EURO-PAR 2021: PARALLEL PROCESSING WORKSHOPS | 2022 / Vol. 13098
Keywords
Reuse distance; Hardware performance counters; Debug registers; Address sampling
DOI
10.1007/978-3-031-06156-1_49
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
As core counts in multicore systems grow, data movement has become one of the main sources of performance slowdown in parallel applications, and data locality has become a critical factor in application optimization. One important locality metric is reuse distance, which indicates how likely a memory access is to be a cache hit. In this work, we propose a low-overhead reuse distance profiling tool for multi-threaded applications. Our method relies on hardware features available in commodity CPUs, namely Performance Monitoring Units (PMUs) and debug registers, to detect data reuse in private and shared caches while accounting for inter-thread cache line invalidations. Unlike prior approaches, our tool is fast and accurate, does not change program behavior, and can also handle shared cache accesses. Despite its low runtime (2.9x) and memory (2.8x) overheads, our tool achieves 92% accuracy.
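To make the reuse distance metric concrete, the following is a minimal sketch that computes it exactly over a small address trace. It assumes the classical stack-distance definition (the number of distinct cache lines touched between two consecutive accesses to the same line) and a 64-byte cache line; the function name `reuse_distances` and the `line_size` parameter are hypothetical. This is not the paper's method, which samples accesses with PMUs and debug registers rather than tracing every access.

```python
from collections import OrderedDict


def reuse_distances(trace, line_size=64):
    """Return the reuse distance of each access, or None for a first access.

    Illustrative only: exact, full-trace computation (assumed definition),
    not the sampling-based approach described in the abstract.
    """
    stack = OrderedDict()  # cache lines ordered from least to most recently used
    distances = []
    for addr in trace:
        line = addr // line_size  # map a byte address to its cache line
        if line in stack:
            keys = list(stack)
            # Count the distinct lines touched since the previous access to `line`.
            distances.append(len(keys) - 1 - keys.index(line))
            stack.move_to_end(line)
        else:
            distances.append(None)  # first access: no earlier use to measure from
            stack[line] = True
    return distances


print(reuse_distances([0, 64, 128, 0, 64, 64]))
# [None, None, None, 2, 2, 0]
```

Under this definition, an access to a fully associative LRU cache holding C lines would hit when its reuse distance is below C, which is why the metric relates to cache hit likelihood.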
Pages: 555-559
Page count: 5