ScaleFlow: Efficient Deep Vision Pipeline with Closed-Loop Scale-Adaptive Inference

Cited by: 1
Authors
Leng, Yuyang [1 ]
Liu, Renyuan [1 ]
Guo, Hongpeng [2 ]
Chen, Songqing [1 ]
Yao, Shuochao [1 ]
Affiliations
[1] George Mason Univ, Fairfax, VA 22030 USA
[2] Univ Illinois, Champaign, IL USA
Source
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023 | 2023
Funding
U.S. National Science Foundation
Keywords
Scale-adaptive Inference; Object Detection; Scale-equivariant; Wavelet Transform; Edge Computing; Anytime Inference
DOI
10.1145/3581783.3612412
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Deep visual data processing underpins many life-changing applications, such as autonomous driving and smart cities. Improving accuracy while minimizing inference time under constrained resources has been the primary pursuit for their practical adoption. Existing research has thus been devoted to either narrowing down the area of interest for detection or miniaturizing the deep learning model for faster inference. However, the former risks missing or delaying the detection of small but important objects, potentially leading to disastrous consequences (e.g., car accidents), while the latter often compromises accuracy without fully utilizing intrinsic semantic information. To overcome these limitations, in this work we propose ScaleFlow, a closed-loop scale-adaptive inference approach that reduces model inference time by progressively processing vision data with increasing resolution but decreasing spatial size, achieving speedup without compromising accuracy. For this purpose, ScaleFlow refactors existing neural networks to be scale-equivariant on multiresolution data with the assistance of wavelet theory, providing predictable feature patterns across different data resolutions. Comprehensive experiments have been conducted to evaluate ScaleFlow. The results show that ScaleFlow supports anytime inference, consistently provides 1.5x to 2.2x speedup, and saves around 25% to 45% energy consumption with less than 1% accuracy loss on four embedded and edge platforms.
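To make the progressive, coarse-to-fine idea in the abstract concrete, below is a minimal conceptual sketch in Python (not the authors' ScaleFlow implementation) of an early-exit inference loop over a Haar wavelet pyramid. The names wavelet_pyramid, run_detector, and the confidence threshold are hypothetical and introduced only for illustration; PyWavelets (pywt) and NumPy are assumed to be available.

import numpy as np
import pywt  # PyWavelets; provides the 2-D discrete wavelet transform (dwt2)

def wavelet_pyramid(image, levels=3):
    # Build a list of low-pass approximation images, coarsest first.
    # Each Haar DWT level halves the spatial size, yielding smaller,
    # cheaper-to-process inputs at coarser resolution.
    # Assumes `image` is a 2-D (grayscale) float array.
    approximations = [image]
    current = image
    for _ in range(levels):
        cA, _details = pywt.dwt2(current, "haar")  # keep only the approximation band
        approximations.append(cA)
        current = cA
    return approximations[::-1]  # coarsest scale first

def progressive_inference(image, run_detector, conf_threshold=0.8, levels=3):
    # Closed-loop, scale-adaptive sketch: start at the coarsest scale and
    # stop early (anytime inference) once the detector is confident enough;
    # otherwise continue to a finer scale. `run_detector` is a hypothetical
    # callable returning (detections, confidence).
    detections = None
    for scale in wavelet_pyramid(image, levels):
        detections, confidence = run_detector(scale)
        if confidence >= conf_threshold:
            break  # early exit saves compute on easy inputs
    return detections

# Example usage with a random grayscale frame and a dummy detector:
# frame = np.random.rand(480, 640).astype(np.float32)
# dets = progressive_inference(frame, run_detector=lambda x: ([], 1.0))

Note that the actual ScaleFlow pipeline additionally refactors the detector to be scale-equivariant across these resolutions so that features computed at one scale remain predictable at another; that network refactoring is not reflected in this sketch.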
Pages: 1698-1706
Number of pages: 9