Hopscotch: A Hardware-Software Co-Design for Efficient Cache Resizing on Multi-Core SoCs

Cited by: 2
Authors
Jiang, Zhe [1 ]
Yang, Kecheng [2 ]
Fisher, Nathan [3 ]
Guan, Nan [4 ]
Audsley, Neil C. [5 ]
Dong, Zheng [3 ]
Affiliations
[1] Southeast Univ, Sch Integrated Circuits, Nanjing 21000, Peoples R China
[2] Univ Cambridge, Comp Sci Dept, Cambridge CB3 0FD, England
[3] Texas State Univ, Dept Comp Sci, San Marcos, TX 78666 USA
[4] City Univ Hong Kong, Comp Sci Dept, Kowloon Tong, Hong Kong, Peoples R China
[5] City Univ London, Dept Comp Sci, London EC1V 0HB, England
Funding
U.S. National Science Foundation;
Keywords
Task analysis; Real-time systems; Clocks; Parallel processing; Throughput; Software; Hardware; hardware/software co-design; L1 cache; schedulability analysis; PROCESSOR; ENERGY; PERFORMANCE; MANAGEMENT; OS;
DOI
10.1109/TPDS.2023.3332711
Chinese Library Classification
TP301 [Theory and Methods];
Discipline code
081202;
Abstract
Following the trend of increasing autonomy in real-time systems, multi-core System-on-Chips (SoCs) have enabled devices to handle the large data streams and intensive computation that such autonomous systems require. In modern multi-core SoCs, each L1 cache is tied to an individual processor, and a processor can only access its own L1 cache. This design preserves the system's average throughput, but it also limits the opportunity for parallelism, significantly reducing the system's real-time schedulability. To overcome this problem, we present Hopscotch, a new system framework for highly parallel multi-core systems. Hopscotch introduces a resizable L1 cache that is shared among the processors in the same computing cluster. At run time, Hopscotch dynamically allocates L1 cache capacity to the tasks executing on the processors, unlocking the parallelism available in the system. Building on the new hardware architecture, we also present a theoretical model and schedulability analysis that provide cache-size selection methods and corresponding timing guarantees for the system. As demonstrated in the evaluations, Hopscotch effectively improves system-level schedulability with negligible extra overhead.
Pages: 89-104
Page count: 16