Real-Time Mobile Acceleration of DNNs: From Computer Vision to Medical Applications

Cited by: 6
Authors
Li, Hongjia [1]
Yuan, Geng [1]
Niu, Wei [2]
Cai, Yuxuan [1]
Sun, Mengshu [1]
Li, Zhengang [1]
Ren, Bin [2]
Lin, Xue [1]
Wang, Yanzhi [1]
Affiliations
[1] Northeastern Univ, Boston, MA 02115 USA
[2] Coll William & Mary, Williamsburg, VA USA
Source
2021 26TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC) | 2021
Funding
National Science Foundation (USA);
Keywords
computer vision; real-time; mobile acceleration;
DOI
10.1145/3394885.3431627
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
With the growth of mobile vision applications, there is a growing need to break through the current performance limitations of mobile platforms, especially for computationally intensive applications such as object detection, action recognition, and medical diagnosis. To achieve this goal, we present our unified real-time mobile DNN inference acceleration framework, which seamlessly integrates hardware-friendly, structured model compression with mobile-targeted compiler optimizations. We aim at unprecedented real-time performance for such large-scale neural network inference on mobile devices. A fine-grained block-based pruning scheme is proposed that is universally applicable to all types of DNN layers, such as convolutional layers with different kernel sizes and fully connected layers. Moreover, it is also successfully extended to 3D convolutions. With the assistance of our compiler optimizations, the fine-grained block-based sparsity is fully utilized to achieve high model accuracy and high hardware acceleration simultaneously. To validate our framework, applications from three representative fields are implemented and demonstrated: object detection, activity detection, and medical diagnosis. All applications achieve real-time inference on an off-the-shelf smartphone, outperforming the representative mobile DNN inference acceleration frameworks by up to 6.7x in speed. Demonstrations of these applications can be found at the following link: https://bit.ly/39lWpYu.
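The abstract does not spell out how the block-based pruning is carried out, so the following is only a minimal sketch of one common way to realize block-wise structured sparsity, not the authors' implementation. It assumes the layer weights have been flattened to a 2D matrix, splits that matrix into fixed-size blocks, and zeroes the columns with the smallest L2 norm inside each block. The function name `block_column_prune` and the parameters `block_rows`, `block_cols`, and `keep_ratio` are hypothetical and chosen for illustration.

```python
import math


def block_column_prune(weights, block_rows, block_cols, keep_ratio):
    """Return a copy of `weights` with low-norm columns zeroed inside each block.

    weights: 2D list of floats (assumed: a layer's weights flattened to a matrix).
    block_rows, block_cols: block size (hypothetical tuning parameters).
    keep_ratio: fraction of columns kept in each block, in (0, 1].
    """
    n_rows, n_cols = len(weights), len(weights[0])
    pruned = [row[:] for row in weights]  # leave the input untouched
    for r0 in range(0, n_rows, block_rows):
        r1 = min(r0 + block_rows, n_rows)
        for c0 in range(0, n_cols, block_cols):
            c1 = min(c0 + block_cols, n_cols)
            # L2 norm of each column, restricted to the rows of this block.
            norms = {
                c: math.sqrt(sum(weights[r][c] ** 2 for r in range(r0, r1)))
                for c in range(c0, c1)
            }
            # Always keep at least one column per block.
            n_keep = max(1, math.ceil(keep_ratio * (c1 - c0)))
            # Rank columns from largest to smallest norm; the tail is pruned.
            ranked = sorted(norms, key=norms.get, reverse=True)
            for c in ranked[n_keep:]:
                for r in range(r0, r1):
                    pruned[r][c] = 0.0
    return pruned


if __name__ == "__main__":
    w = [
        [0.9, 0.1, 0.2, 0.8],
        [0.7, 0.0, 0.1, 0.6],
        [0.1, 0.5, 0.9, 0.0],
        [0.2, 0.4, 0.8, 0.1],
    ]
    for row in block_column_prune(w, block_rows=2, block_cols=2, keep_ratio=0.5):
        print(row)
```

Because each block chooses its own columns to keep, the sparsity pattern can differ from block to block while remaining regular inside a block. That regularity is the kind of structure a compiler can exploit, though the actual block shapes and selection criterion used in the paper may differ.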
Pages: 581-586
Number of pages: 6