TPUPoint: Automatic Characterization of Hardware-Accelerated Machine-Learning Behavior for Cloud Computing

Citations: 0
Authors
Wudenhe, Abenezer [1]
Tseng, Hung-Wei [1]
Affiliations
[1] Univ Calif Riverside, Riverside, CA 92521 USA
Source
2021 IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE (ISPASS 2021) | 2021
Funding
National Science Foundation (USA);
Keywords
BENCHMARK SUITE; SCALE;
DOI
10.1109/ISPASS51385.2021.00048
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
With the share of machine learning (ML) workloads in data centers rapidly increasing, cloud providers are beginning to incorporate accelerators such as tensor processing units (TPUs) to improve the energy efficiency of applications. However, without optimizing application parameters, users may underutilize accelerators and end up wasting energy and money. This paper presents TPUPoint to facilitate the development of efficient applications on TPU-based cloud platforms. TPUPoint automatically classifies repetitive patterns into phases and identifies the most timing-critical operations in each phase. Further, TPUPoint can associate phases with checkpoints to allow fast-forwarding in applications, thereby significantly reducing the time and money spent optimizing applications. By running TPUPoint on a wide array of representative ML workloads, we found that computation is no longer the most time-consuming operation; instead, the infeed and reshape operations, which exchange and realign data, become most significant. TPUPoint's advantages significantly increase the potential for discovering optimal parameters to quickly balance the complex workload pipeline of feeding data into a system, reformatting the data, and computing results.
Pages: 254-264
Page count: 11