Low-Overhead Reuse Distance Profiling Tool for Multicore

Authors
Sasongko, Muhammad Aditya [1]
Chabbi, Milind [2]
Unat, Didem [1]
Affiliations
[1] Koc Univ, Istanbul, Turkey
[2] Scalable Machines Res, San Jose, CA USA
Source
EURO-PAR 2021: PARALLEL PROCESSING WORKSHOPS | 2022 / Vol. 13098
Keywords
Reuse distance; Hardware performance counters; Debug registers; Address sampling
DOI
10.1007/978-3-031-06156-1_49
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
As core counts in multicore systems increase, data movement has become one of the main sources of performance slowdown in parallel applications, and data locality has become a critical factor in application optimization. One important locality metric is reuse distance, which indicates how likely a memory access is to be a cache hit. In this work, we propose a low-overhead reuse distance profiling tool for multi-threaded applications. Our method relies on hardware features available in commodity CPUs, namely Performance Monitoring Units (PMUs) and debug registers, to detect data reuse in private and shared caches while accounting for inter-thread cache line invalidations. Unlike prior approaches, our tool is fast and accurate, does not change program behavior, and can also handle shared cache accesses. It achieves 92% accuracy while keeping runtime overhead (2.9x) and memory overhead (2.8x) low.
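To make the reuse distance metric concrete, the sketch below computes it for a small address trace using the textbook definition: the number of distinct cache lines touched between two consecutive accesses to the same line. This is not the sampling method of the tool described above, which relies on PMUs and debug registers; it is an exhaustive single-threaded illustration. The function names, the 64-byte line size, and the fully associative LRU cache model are assumptions made for the example.

```python
from collections import OrderedDict

CACHE_LINE_BYTES = 64  # assumed cache-line size, not taken from the paper


def reuse_distances(addresses, line_bytes=CACHE_LINE_BYTES):
    """Return one reuse distance per access.

    The distance is the number of distinct cache lines touched since the
    previous access to the same line, or None for a first-time access.
    """
    stack = OrderedDict()  # cache lines ordered from least to most recently used
    distances = []
    for addr in addresses:
        line = addr // line_bytes
        if line in stack:
            keys = list(stack)
            # Lines positioned after `line` were touched more recently.
            distances.append(len(keys) - 1 - keys.index(line))
            stack.move_to_end(line)
        else:
            distances.append(None)
            stack[line] = True
    return distances


def would_hit(distance, cache_lines):
    """Predict a hit in a fully associative LRU cache holding `cache_lines` lines."""
    return distance is not None and distance < cache_lines


trace = [0, 64, 128, 0, 64, 8]  # byte addresses; 0 and 8 share a cache line
print(reuse_distances(trace))   # [None, None, None, 2, 2, 1]
```

Under this model, an access with reuse distance 1 hits in a cache of two lines, while an access with distance 2 misses, which is the sense in which reuse distance predicts cache hits.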
Pages: 555-559
Page count: 5