Analysis of memory hierarchy performance of block data layout

Cited by: 7
Authors
Park, N [1]
Hong, B [1]
Prasanna, VK [1]
Affiliations
[1] Univ So Calif, Dept Elect Engn Syst, Los Angeles, CA 90089 USA
Source
2002 INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, PROCEEDING | 2002
DOI
10.1109/ICPP.2002.1040857
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Recently, several experimental studies have been conducted on block data layout as a data transformation technique used in conjunction with tiling to improve cache performance. In this paper, we provide a theoretical analysis of the TLB and cache performance of block data layout. For standard matrix access patterns, we derive an asymptotic lower bound on the number of TLB misses for any data layout and show that block data layout achieves this bound. We show that block data layout reduces the number of TLB misses by a factor of O(B) compared with conventional data layouts, where B is the block size of block data layout. This reduction contributes to the improvement in memory hierarchy performance. Using our TLB and cache analysis, we also discuss the impact of block size on the overall memory hierarchy performance. These results are validated through simulations and experiments on state-of-the-art platforms.
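The O(B) claim in the abstract can be illustrated with a toy model. The sketch below is not the authors' analysis or simulator: it assumes a square n x n matrix, a B x B block stored contiguously so that one block fills exactly one page, and a small fully associative LRU TLB. All names (`block_index`, `count_tlb_misses`, `column_walk`) and all parameter values are hypothetical choices made for illustration.

```python
from collections import OrderedDict


def row_major_index(i, j, n):
    """Linear offset of element (i, j) in a conventional row-major layout."""
    return i * n + j


def block_index(i, j, n, b):
    """Linear offset of element (i, j) in a block data layout.

    Assumption: b divides n, blocks are ordered row-major, and elements
    inside each b x b block are also stored row-major.
    """
    blocks_per_row = n // b
    block_id = (i // b) * blocks_per_row + (j // b)
    return block_id * b * b + (i % b) * b + (j % b)


def count_tlb_misses(offsets, page_size, tlb_entries):
    """Count misses in a toy fully associative LRU TLB (an assumed model)."""
    tlb = OrderedDict()
    misses = 0
    for offset in offsets:
        page = offset // page_size
        if page in tlb:
            tlb.move_to_end(page)        # refresh LRU position on a hit
        else:
            misses += 1
            tlb[page] = True
            if len(tlb) > tlb_entries:
                tlb.popitem(last=False)  # evict the least recently used page
    return misses


def column_walk(n):
    """Visit the matrix column by column, a standard matrix access pattern."""
    for j in range(n):
        for i in range(n):
            yield i, j


# Hypothetical parameters: page size is measured in matrix elements.
n, b = 64, 8
page_size = b * b
tlb_entries = 4

row_misses = count_tlb_misses(
    (row_major_index(i, j, n) for i, j in column_walk(n)), page_size, tlb_entries
)
block_misses = count_tlb_misses(
    (block_index(i, j, n, b) for i, j in column_walk(n)), page_size, tlb_entries
)
print(row_misses, block_misses, row_misses // block_misses)
```

Under these assumptions the column walk misses on every access in the row-major layout (4096 misses) but only once per block per column in the block layout (512 misses), a ratio equal to B = 8. Real TLBs, page sizes, and access patterns differ, so the numbers only indicate the trend the abstract describes.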
Pages: 35-44
Page count: 10