Counters based performance analysis and optimization of an out-of-order superscalar processor core

Cited by: 0
Authors
Sun C. [1 ]
Sui B. [1 ]
Wang L. [1 ]
Wang Y. [1 ]
Huang L. [1 ]
Li W. [1 ]
Wang J. [1 ]
Affiliation
[1] College of Computer, National University of Defense Technology, Changsha
Source
2016 / National University of Defense Technology / Vol. 38
Keywords
Counters; Micro-architecture; Performance analysis; Processor core;
DOI
10.11887/j.cn.201605003
Abstract
As processor micro-architecture design grows more complex, performance analysis becomes increasingly important in processor research and design. Performance models are widely used for performance analysis, but they are better suited to design space exploration in the early design stage. When they are used for micro-architecture optimization, their accuracy and speed become the limiting factors. A counter-based performance analysis method is therefore proposed. In this method, the register transfer level (RTL) code of a processor core is used as the baseline, and a specialized performance monitor unit is added to collect the events needed for micro-architecture analysis and optimization. The collected events are then sent to a result analyzer, which identifies the factors affecting performance. Using this method, we analyzed the factors that affect performance when running the SPEC CPU2000 benchmarks on an FPGA (field-programmable gate array) prototype, and optimized the micro-architecture of the processor core according to the analysis results. The optimized processor core shows a clear performance improvement. © 2016, NUDT Press. All rights reserved.
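The result-analysis step described in the abstract can be illustrated with a minimal sketch. It assumes the performance monitor unit reports total cycles, retired instructions, and cycles attributed to each stall event. The event names, the `CounterSample` type, the `analyze` function, and all numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class CounterSample:
    """Hypothetical snapshot of counters read from a performance monitor unit."""
    cycles: int                    # total elapsed core cycles
    instructions: int              # retired instructions
    stall_cycles: Dict[str, int]   # assumed event name -> cycles attributed to it


def analyze(sample: CounterSample) -> Tuple[float, List[Tuple[str, float]]]:
    """Return IPC and stall events ranked by their share of total cycles."""
    if sample.cycles <= 0:
        raise ValueError("cycles must be positive")
    ipc = sample.instructions / sample.cycles
    ranked = sorted(
        ((name, count / sample.cycles) for name, count in sample.stall_cycles.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    return ipc, ranked


# Illustrative numbers only; they are not measurements from the paper.
sample = CounterSample(
    cycles=1000,
    instructions=800,
    stall_cycles={"icache_miss": 100, "branch_mispredict": 250, "dcache_miss": 50},
)
ipc, ranked = analyze(sample)
print(f"IPC = {ipc:.2f}")
for name, share in ranked:
    print(f"{name}: {share:.1%} of cycles")
```

Ranking events by their share of cycles is only one possible way to point at the micro-architecture component to optimize first; the paper's analyzer may attribute events differently.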
Pages: 14-19
Page count: 5