CaPPS: cache partitioning with partial sharing for multi-core embedded systems

Cited by: 1

Authors:

Zang, Wei [1]

Gordon-Ross, Ann [2]

Affiliations:

[1] SK Hynix Memory Solut, San Jose, CA USA

[2] Univ Florida, Dept Elect & Comp Engn, Gainesville, FL USA

Source:

DESIGN AUTOMATION FOR EMBEDDED SYSTEMS | 2016 / Vol. 20 / No. 01

Funding:

National Science Foundation (USA);

Keywords:

Cache memories; Modeling techniques; Optimization; Performance evaluation; HIGH-PERFORMANCE; ASSOCIATIVITY;

DOI:

10.1007/s10617-015-9168-7

Chinese Library Classification:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

As the number of cores in chip multi-processor systems increases, the contention over shared last-level cache (LLC) resources increases, thus making LLC optimization critical, especially for embedded systems with strict area/energy/power constraints. We propose cache partitioning with partial sharing (CaPPS), which reduces LLC contention using cache partitioning and improves utilization with sharing configuration. Sharing configuration enables the partitions to be privately allocated to a single core, partially shared with a subset of cores, or fully shared with all cores based on the co-executing applications' requirements. CaPPS imposes low hardware overhead and affords an extensive design space to increase optimization potential. To facilitate fast design space exploration, we develop an analytical model to quickly estimate the miss rates of all CaPPS configurations using the applications' isolated LLC access traces to predict runtime LLC contention. Experimental results demonstrate that the analytical model estimates cache miss rates with an average error of only 0.73% and with an average speedup of 3505x as compared to a cycle-accurate simulator. Due to CaPPS's extensive design space, CaPPS can reduce the average LLC miss rate by as much as 25% as compared to baseline configurations and as much as 14-17% as compared to prior works.
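The three sharing modes named in the abstract (private, partially shared, fully shared) can be illustrated with a small sketch. This is not the authors' implementation or their analytical model. It assumes, purely for illustration, that a configuration is a list with one set of owning cores per cache way. The names `core_subsets`, `classify`, `ways_visible_to`, and `example_config` are hypothetical.

```python
from itertools import combinations


def core_subsets(num_cores):
    """Return every non-empty subset of core IDs as a frozenset."""
    cores = range(num_cores)
    return [
        frozenset(group)
        for size in range(1, num_cores + 1)
        for group in combinations(cores, size)
    ]


def classify(owners, num_cores):
    """Label a partition by how many cores may access it."""
    if len(owners) == 1:
        return "private"
    if len(owners) == num_cores:
        return "fully shared"
    return "partially shared"


def ways_visible_to(core, config):
    """Count the ways in `config` that `core` is allowed to use."""
    return sum(1 for owners in config if core in owners)


# Hypothetical 4-way LLC shared by 3 cores: one owner set per way (assumption).
example_config = [
    frozenset({0}),        # private to core 0
    frozenset({0, 1}),     # partially shared by cores 0 and 1
    frozenset({0, 1, 2}),  # fully shared
    frozenset({2}),        # private to core 2
]

if __name__ == "__main__":
    for way, owners in enumerate(example_config):
        print(way, sorted(owners), classify(owners, 3))
```

Under this assumed representation, the number of possible owner sets per way grows as 2^N - 1 for N cores, which gives a rough sense of why the abstract describes the design space as extensive and why a fast miss-rate estimate is useful for exploring it.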

Citation

Pages: 65-92

Page count: 28

