Real-Time Object Detection System with Multi-Path Neural Networks

Cited by: 44
Authors
Heo, Seonyeong [1]
Cho, Sungjun [1]
Kim, Youngsok [2]
Kim, Hanjun [2]
Affiliations
[1] POSTECH, Pohang, South Korea
[2] Yonsei Univ, Seoul, South Korea
Source
2020 IEEE REAL-TIME AND EMBEDDED TECHNOLOGY AND APPLICATIONS SYMPOSIUM (RTAS 2020) | 2020
DOI
10.1109/RTAS48715.2020.000-8
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Thanks to recent advances in Deep Neural Networks (DNNs), DNN-based object detection systems have become highly accurate and are widely used in real-time environments such as autonomous vehicles, drones, and security robots. Although these systems should detect objects within a time limit that varies with the execution environment (for example, vehicle speed), existing systems blindly execute the entire long-latency DNN without reflecting the time-varying time limit, and thus cannot guarantee real-time constraints. This work proposes a novel real-time object detection system that employs multi-path neural networks based on a new worst-case execution time (WCET) model for DNNs on a GPU. This work designs the WCET model for a single DNN layer by analyzing processor and memory contention on GPUs, and extends the model to end-to-end networks. This work also designs the multi-path networks with three new operators (skip, switch, and dynamic generate proposals) that dynamically change the execution path and the number of target objects. Finally, this work proposes a path decision model that chooses the optimal execution path at run time, reflecting dynamically changing environments and time constraints. Our detailed evaluation using widely used driving datasets shows that the proposed real-time object detection system performs as well as a baseline object detection system without violating the time-varying time limits. Moreover, the WCET model predicts the worst-case execution latency of convolutional and group normalization layers with only 27% and 81% errors on average, respectively.
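The abstract does not give the details of the path decision model, so the following is only a minimal illustrative sketch of the general idea it describes: among several execution paths, pick one whose predicted WCET fits the current time limit. The `Path` type, the `choose_path` function, the accuracy scores, and all numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Path:
    """A candidate execution path (hypothetical structure, not from the paper)."""
    name: str
    wcet_ms: float   # assumed: predicted worst-case execution time of this path
    accuracy: float  # assumed: offline accuracy estimate for this path


def choose_path(paths: Sequence[Path], time_limit_ms: float) -> Optional[Path]:
    """Return the most accurate path whose predicted WCET meets the time limit.

    Returns None when no path fits; a real system would need a fallback policy.
    """
    feasible = [p for p in paths if p.wcet_ms <= time_limit_ms]
    if not feasible:
        return None
    return max(feasible, key=lambda p: p.accuracy)


# Illustrative values only.
PATHS = [
    Path("full", wcet_ms=100.0, accuracy=0.90),
    Path("skip", wcet_ms=60.0, accuracy=0.80),
    Path("small", wcet_ms=30.0, accuracy=0.60),
]
```

With these made-up values, a tighter time limit (for instance, at higher vehicle speed) pushes the choice toward a shorter, less accurate path, while a looser limit allows the full network to run.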
Pages: 174-187
Number of pages: 14