Dynamic scratchpad memory management for code in portable systems with an MMU

Cited by: 22
Authors
Egger, Bernhard [1 ]
Lee, Jaejin [1 ]
Shin, Heonshik [1 ]
Affiliations
[1] Seoul Natl Univ, Sch Engn & Comp Sci, Adv Compiler Res Lab, Seoul 151744, South Korea
DOI
10.1145/1331331.1331335
CLC number
TP3 [Computing Technology, Computer Technology];
Subject classification code
0812;
Abstract
In this work, we present a dynamic memory allocation technique for a novel, horizontally partitioned memory subsystem targeting contemporary embedded processors with a memory management unit (MMU). We propose to replace the on-chip instruction cache with a scratchpad memory (SPM) and a small minicache. Serializing the address translation with the actual memory access enables the memory system to access either only the SPM or only the minicache. Independent of the SPM size and based solely on profiling information, a postpass optimizer classifies the code of an application binary into a pageable and a cacheable code region. The latter is placed at a fixed location in the external memory and cached by the minicache. The former, the pageable code region, is copied on demand to the SPM before execution. Both the pageable code region and the SPM are logically divided into pages the size of an MMU memory page. Using the MMU's page-fault exception mechanism, a runtime scratchpad memory manager (SPMM) tracks page accesses and copies frequently executed code pages to the SPM before they are executed. Good code placement techniques become increasingly important with larger MMU page sizes in order to minimize the number of page transfers from the external memory to the SPM. We discuss code-grouping techniques and analyze the effect of the MMU's page size on execution time, energy consumption, and external memory accesses. We show that using the data cache as a victim buffer for the SPM yields significant energy savings. We evaluate our SPM allocation strategy with fifteen applications, including H.264, MP3, MPEG-4, and PGP. The proposed memory system requires 8% less die area compared to a fully cached configuration. On average, we achieve a 31% improvement in runtime performance and a 35% reduction in energy consumption with an MMU page size of 256 bytes.
Pages: 38
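
The page-fault-driven mechanism described in the abstract (demand paging of code into the SPM under control of a runtime SPM manager) can be illustrated with a small sketch. The following C program is not the authors' implementation; it is a minimal, self-contained simulation with hypothetical names (spmm_fault, execute_page), example sizes for the SPM and the pageable code region, and a lowest-access-count eviction policy chosen purely for illustration.

/*
 * Illustrative sketch (not the paper's implementation) of a page-fault-driven
 * scratchpad memory manager: pageable code lives in external memory, divided
 * into MMU-sized pages; on a fault, the handler copies the faulting page into
 * a free SPM slot, evicting a resident page when the SPM is full.
 */
#include <stdint.h>
#include <string.h>
#include <stdio.h>

#define PAGE_SIZE      256   /* MMU page size used in the paper's evaluation  */
#define NUM_SPM_SLOTS  8     /* SPM capacity in pages (example value)         */
#define NUM_CODE_PAGES 32    /* pageable code region size in pages (example)  */

static uint8_t  external_mem[NUM_CODE_PAGES][PAGE_SIZE]; /* pageable code     */
static uint8_t  spm[NUM_SPM_SLOTS][PAGE_SIZE];           /* on-chip SPM       */
static int      slot_to_page[NUM_SPM_SLOTS];             /* -1 = slot free    */
static int      page_to_slot[NUM_CODE_PAGES];            /* -1 = not resident */
static unsigned access_count[NUM_CODE_PAGES];            /* per-page counters */

static void spmm_init(void)
{
    memset(access_count, 0, sizeof access_count);
    for (int s = 0; s < NUM_SPM_SLOTS; s++)  slot_to_page[s] = -1;
    for (int p = 0; p < NUM_CODE_PAGES; p++) page_to_slot[p] = -1;
}

/* Pick a victim slot: here, the resident page with the lowest access count. */
static int choose_victim(void)
{
    int victim = 0;
    for (int s = 1; s < NUM_SPM_SLOTS; s++)
        if (access_count[slot_to_page[s]] < access_count[slot_to_page[victim]])
            victim = s;
    return victim;
}

/* Called when an instruction fetch misses the SPM (the MMU page fault). */
static void spmm_fault(int page)
{
    int slot = -1;
    for (int s = 0; s < NUM_SPM_SLOTS; s++)        /* look for a free slot    */
        if (slot_to_page[s] < 0) { slot = s; break; }

    if (slot < 0) {                                /* SPM full: evict victim  */
        slot = choose_victim();
        page_to_slot[slot_to_page[slot]] = -1;     /* unmap the evicted page  */
    }

    memcpy(spm[slot], external_mem[page], PAGE_SIZE); /* copy code into SPM   */
    slot_to_page[slot] = page;
    page_to_slot[page] = slot;                     /* remap page -> SPM slot  */
}

/* Simulated execution of one pageable code page. */
static void execute_page(int page)
{
    access_count[page]++;
    if (page_to_slot[page] < 0)                    /* not resident: fault     */
        spmm_fault(page);
    printf("page %2d runs from SPM slot %d\n", page, page_to_slot[page]);
}

int main(void)
{
    spmm_init();
    int trace[] = {0, 1, 2, 0, 3, 0, 4, 5, 6, 7, 8, 0, 1};  /* sample trace  */
    int n = (int)(sizeof trace / sizeof trace[0]);
    for (int i = 0; i < n; i++)
        execute_page(trace[i]);
    return 0;
}

Running the sketch on the sample trace fills the eight SPM slots, then evicts the least-used resident page when page 8 faults, which is the kind of placement decision the paper's code-grouping techniques aim to keep cheap as the MMU page size grows.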