nnPerf: Demystifying DNN Runtime Inference Latency on Mobile Platforms

Cited by: 1
Authors
Chu, Haolin [1]
Zheng, Xiaolong [1]
Liu, Liang [1]
Ma, Huadong [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
Source
PROCEEDINGS OF THE 21ST ACM CONFERENCE ON EMBEDDED NETWORKED SENSOR SYSTEMS, SENSYS 2023 | 2023
Funding
National Natural Science Foundation of China
Keywords
Mobile GPU; Deep Neural Network; Inference latency; Profiling;
DOI
10.1145/3625687.3625797
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
We present nnPerf, a real-time on-device profiler designed to collect and analyze the DNN model run-time inference latency on mobile platforms. nnPerf demystifies the hidden layers and metrics used for pursuing DNN optimizations and adaptations at the granularity of operators and kernels, ensuring every facet contributing to a DNN model's run-time efficiency is easily accessible to mobile developers via well-defined APIs. With nnPerf, mobile developers can easily identify the bottleneck in model run-time efficiency and optimize the model architecture to meet system-level objectives (SLOs). We implement nnPerf on the TFLite framework and evaluate its e2e-, operator-, and kernel-latency profiling accuracy across four mobile platforms. The results show that nnPerf achieves consistently high latency profiling accuracy on both CPU (98.12%) and GPU (99.87%). Our benchmark studies demonstrate that running nnPerf on mobile devices introduces minimal overhead to model inference, with 0.231% and 0.605% extra inference latency and power consumption, respectively. We further run a case study to show how we leverage nnPerf to migrate OFA, a SOTA NAS system, to kernel-oriented model optimization on GPUs.
Pages: 125-137
Page count: 13
  • [10] Bhaskaracharya S. C., 2020, ARXIV