DeepThings: Distributed Adaptive Deep Learning Inference on Resource-Constrained IoT Edge Clusters

Cited by: 295
Authors
Zhao, Zhuoran [1]
Barijough, Kamyar Mirzazad [1]
Gerstlauer, Andreas [1]
Affiliation
[1] University of Texas at Austin, Department of Electrical and Computer Engineering, Austin, TX 78712, USA
Funding
U.S. National Science Foundation
Keywords
Deep learning; distributed inference; edge computing; Internet of Things
DOI
10.1109/TCAD.2018.2858384
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Edge computing has emerged as a trend to improve scalability, overhead, and privacy by processing large-scale data, e.g., in deep learning applications, locally at the source. In IoT networks, edge devices are characterized by tight resource constraints and the often dynamic nature of data sources, where existing approaches for deploying Deep/Convolutional Neural Networks (DNNs/CNNs) can only meet IoT constraints by severely reducing accuracy or by using a static distribution that cannot adapt to dynamic IoT environments. In this paper, we propose DeepThings, a framework for adaptively distributed execution of CNN-based inference applications on tightly resource-constrained IoT edge clusters. DeepThings employs a scalable Fused Tile Partitioning (FTP) of convolutional layers to minimize memory footprint while exposing parallelism. It further realizes a distributed work stealing approach to enable dynamic workload distribution and balancing at inference runtime. Finally, we employ a novel work scheduling process to improve data reuse and reduce overall execution latency. Results show that our proposed FTP method can reduce memory footprint by more than 68% without sacrificing accuracy. Furthermore, compared to existing work sharing methods, our distributed work stealing and work scheduling improve throughput by 1.7x-2.2x with multiple dynamic data sources. When combined, DeepThings provides scalable CNN inference speedups of 1.7x-3.5x on 2-6 edge devices with less than 23 MB memory each.
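The abstract does not spell out how a fused tile is derived. The sketch below illustrates the general idea behind fusing layers over a tile: starting from a tile of the final output, walk backwards through the layers to find the input region each tile needs. It is not taken from the DeepThings code; the names ConvLayer, input_range, fused_tile_input, and partition are hypothetical, feature maps are assumed square, and only one spatial dimension is shown (rows and columns can be handled independently in the same way).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvLayer:
    # Hypothetical layer description; pooling layers fit the same shape.
    kernel: int   # kernel size along one dimension
    stride: int
    pad: int      # symmetric zero padding
    in_size: int  # input size along one dimension


def input_range(layer, lo, hi):
    """Inclusive input index range needed to compute outputs lo..hi."""
    in_lo = max(0, lo * layer.stride - layer.pad)
    in_hi = min(layer.in_size - 1,
                hi * layer.stride - layer.pad + layer.kernel - 1)
    return in_lo, in_hi


def fused_tile_input(layers, lo, hi):
    """Traverse the fused layers last-to-first, growing the tile as needed."""
    for layer in reversed(layers):
        lo, hi = input_range(layer, lo, hi)
    return lo, hi


def partition(size, n):
    """Split indices 0..size-1 into n nearly equal inclusive ranges."""
    bounds = [size * i // n for i in range(n + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(n)]


# Two 3x3 convolutions (stride 1, padding 1) on an 8-wide feature map,
# with the final output split into two tiles.
layers = [ConvLayer(3, 1, 1, 8), ConvLayer(3, 1, 1, 8)]
tiles = partition(8, 2)
regions = [fused_tile_input(layers, lo, hi) for lo, hi in tiles]
```

In this example the two output tiles need overlapping input regions, which is the redundant border data that a scheduling step could try to reuse between neighbouring tiles. Each tile can then be processed on its own, so only one tile's intermediate data has to be resident at a time.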
Pages: 2348-2359
Page count: 12