A Survey of Deep Learning on Mobile Devices: Applications, Optimizations, Challenges, and Research Opportunities

Cited by: 44
Authors
Zhao, Tianming [1]
Xie, Yucheng [2]
Wang, Yan [1]
Cheng, Jerry [3]
Guo, Xiaonan [4]
Hu, Bin [5]
Chen, Yingying [5]
Affiliations
[1] Temple Univ, Dept Comp & Informat Sci, Philadelphia, PA 19122 USA
[2] Indiana Univ Purdue Univ, Dept Elect & Comp Engn, Indianapolis, IN 46202 USA
[3] New York Inst Technol, Dept Comp Sci, New York, NY 10023 USA
[4] Indiana Univ Purdue Univ, Dept Comp & Informat Technol, Indianapolis, IN 46202 USA
[5] Rutgers State Univ, Dept Elect & Comp Engn, New Brunswick, NJ 08901 USA
Funding
U.S. National Science Foundation
Keywords
Deep learning; Pipelines; Transportation; Mobile handsets; Hardware; Software; Libraries; Deep learning (DL); hardware and software accelerator design; mobile security; mobile sensing; optimization; CONVOLUTIONAL NEURAL-NETWORK; ACTIVITY RECOGNITION; SMARTPHONE SENSORS; EDGE; IOT; CLASSIFICATION; SYSTEM; BEHAVIOR;
DOI
10.1109/JPROC.2022.3153408
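Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract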
Deep learning (DL) has demonstrated great performance in various applications on powerful computers and servers. Recently, with the advancement of more powerful mobile devices (e.g., smartphones and touch pads), researchers are seeking DL solutions that could be deployed on mobile devices. Compared to traditional DL solutions using cloud servers, deploying DL on mobile devices has unique advantages in data privacy, communication overhead, and system cost. This article provides a comprehensive survey of current studies on adopting and deploying DL on mobile devices. Specifically, we summarize and compare the state-of-the-art DL techniques on mobile devices in various application domains involving vision, speech/speaker recognition, human activity recognition, transportation mode detection, and security. We generalize an optimization pipeline for bringing DL to mobile devices, including model-oriented optimization mechanisms (e.g., pruning and quantization) and nonmodel-oriented optimization mechanisms (e.g., software accelerator and hardware design). Moreover, we summarize popular DL libraries regarding their support for state-of-the-art models (software) and processors (hardware). Based on our summarization, we further provide insights into potential research opportunities for developing DL for mobile devices.
Pages: 334-354
Page count: 21