Design and Implementation of a Convolutional Neural Network on an Edge Computing Smartphone for Human Activity Recognition

Cited by: 54
Authors
Zebin, Tahmina [1]
Scully, Patricia J. [2]
Peek, Niels [3]
Casson, Alexander J. [4]
Ozanyan, Krikor B. [4]
Affiliations
[1] Univ East Anglia, Sch Comp Sci, Norwich NR4 7TJ, Norfolk, England
[2] NUI Galway, Sch Phys, Galway H91 TK33, Ireland
[3] Univ Manchester, Hlth eRes Ctr, Manchester M13 9PL, Lancs, England
[4] Univ Manchester, Dept Elect & Elect Engn, Manchester M13 9PL, Lancs, England
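Source
IEEE Access | 2019 / Vol. 7
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Convolutional neural networks; edge computing; TensorFlow Lite; activity recognition; deep learning
DOI
10.1109/ACCESS.2019.2941836
Chinese Library Classification
TP [Automation and Computer Technology]
Discipline classification code
0812
Abstract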
Edge computing aims to integrate computing into everyday settings, enabling the system to be context-aware and private to the user. With the increasing success and popularity of deep learning methods, there is growing demand to leverage these techniques in mobile and wearable computing scenarios. In this paper, we assess the memory and execution time requirements of a deep human activity recognition system when implemented on mid-range smartphone-class hardware, and the memory implications for embedded hardware. We present the design of a convolutional neural network (CNN) for a human activity recognition scenario. Here, the layers of the CNN automate feature learning, and we examine the influence of hyper-parameters such as the number of filters and the filter size on the performance of the CNN. The proposed CNN showed increased robustness and a better capability of detecting activities with temporal dependence than models using statistical machine learning techniques. The model obtained an accuracy of 96.4% in a five-class static and dynamic activity recognition scenario. We calculated the memory consumption and execution time the proposed model requires on a mid-range smartphone. Post-training per-channel quantization of weights and per-layer quantization of activations to 8 bits of precision produces classification accuracy within 2% of floating-point networks for dense and convolutional neural network architectures. Almost all of the size and execution time reduction in the optimized model was due to weight quantization. We achieved a more than fourfold reduction in model size when the model was optimized to 8 bits, which yielded a feasible model capable of fast on-device inference.
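The post-training weight quantization mentioned in the abstract can be illustrated with a minimal sketch. It assumes symmetric int8 quantization with one scale per output channel. The layer shape, the random weights, and the function names `quantize_per_channel` and `dequantize` are hypothetical and are not taken from the paper or from TensorFlow Lite.

```python
from array import array
import random


def quantize_per_channel(weights):
    """Symmetric int8 quantization with one scale per output channel.

    `weights` is a list of channels, each a list of floats.
    Returns (list of int8 arrays, list of per-channel scales).
    """
    q_channels, scales = [], []
    for channel in weights:
        max_abs = max(abs(w) for w in channel)
        # Map the largest magnitude in this channel to 127.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        q = array("b", (max(-127, min(127, round(w / scale))) for w in channel))
        q_channels.append(q)
        scales.append(scale)
    return q_channels, scales


def dequantize(q_channels, scales):
    """Recover approximate float weights from int8 values and scales."""
    return [[q * s for q in ch] for ch, s in zip(q_channels, scales)]


random.seed(0)
# Hypothetical layer: 8 filters of 48 weights each, with a different
# value range per filter (the case where per-channel scales help most).
weights = [[random.gauss(0.0, 0.05 * (i + 1)) for _ in range(48)]
           for i in range(8)]

q_channels, scales = quantize_per_channel(weights)
restored = dequantize(q_channels, scales)

# Storage for the weights alone: 4 bytes per float32 vs 1 byte per int8.
float_bytes = sum(array("f", ch).itemsize * len(ch) for ch in weights)
int8_bytes = sum(q.itemsize * len(q) for q in q_channels)
max_error = max(abs(w - r)
                for ch, rch in zip(weights, restored)
                for w, r in zip(ch, rch))

print(f"float32 bytes: {float_bytes}, int8 bytes: {int8_bytes}")
print(f"size ratio: {float_bytes / int8_bytes:.1f}x, "
      f"max abs error: {max_error:.5f}")
```

In this sketch the rounding error of each weight is bounded by half of its channel's scale, so filters with small value ranges keep finer resolution than they would under a single per-layer scale. The 4x ratio covers raw weight storage only; it ignores the stored scales and file metadata and is not the figure reported in the paper.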
Pages: 133509-133520
Page count: 12