Real-Time Object Detection System with Multi-Path Neural Networks

Cited by: 44

Authors:

Heo, Seonyeong [1]

Cho, Sungjun [1]

Kim, Youngsok [2]

Kim, Hanjun [2]

Affiliations:

[1] POSTECH, Pohang, South Korea

[2] Yonsei Univ, Seoul, South Korea

Source:

2020 IEEE REAL-TIME AND EMBEDDED TECHNOLOGY AND APPLICATIONS SYMPOSIUM (RTAS 2020) | 2020

DOI:

10.1109/RTAS48715.2020.000-8

Chinese Library Classification (CLC) code:

TP3 [Computing Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Thanks to recent advances in Deep Neural Networks (DNNs), DNN-based object detection systems have become highly accurate and are widely used in real-time environments such as autonomous vehicles, drones, and security robots. Although these systems should detect objects within a time limit that varies with the execution environment (for example, vehicle speed), existing systems blindly execute the entire long-latency DNN without reflecting the time-varying time limit, and thus cannot guarantee real-time constraints. This work proposes a novel real-time object detection system that employs multi-path neural networks based on a new worst-case execution time (WCET) model for DNNs on a GPU. This work designs the WCET model for a single DNN layer by analyzing processor and memory contention on GPUs, and extends the WCET model to end-to-end networks. This work also designs the multi-path networks with three new operators, namely skip, switch, and dynamic generate proposals, which dynamically change the execution path and the number of target objects. Finally, this work proposes a path decision model that chooses the optimal execution path at run-time, reflecting dynamically changing environments and time constraints. Our detailed evaluation using widely used driving datasets shows that the proposed real-time object detection system performs as well as a baseline object detection system without violating the time-varying time limits. Moreover, the WCET model predicts the worst-case execution latency of convolutional and group normalization layers with only 27% and 81% errors on average, respectively.
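The abstract does not spell out how the path decision model works, so the following is only a minimal sketch of the general idea it describes: given a WCET bound per execution path and the current time limit, pick the best path that is guaranteed to finish in time. The `Path` type, the `choose_path` function, the path names, and every number below are hypothetical placeholders for illustration, not values or interfaces from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Path:
    """One execution path through a multi-path network (hypothetical type)."""
    name: str        # illustrative label, not a name from the paper
    wcet_ms: float   # assumed WCET bound for this path, in milliseconds
    accuracy: float  # assumed accuracy measured offline for this path


def choose_path(paths: Sequence[Path], time_limit_ms: float) -> Optional[Path]:
    """Return the most accurate path whose WCET bound fits the time limit.

    Returns None when no path can meet the limit. Ties in accuracy are
    broken in favor of the path with the smaller WCET bound.
    """
    feasible = [p for p in paths if p.wcet_ms <= time_limit_ms]
    if not feasible:
        return None
    return max(feasible, key=lambda p: (p.accuracy, -p.wcet_ms))


# Made-up candidate paths; the figures are placeholders, not measurements.
PATHS = [
    Path("full", wcet_ms=120.0, accuracy=0.80),
    Path("skip_some_layers", wcet_ms=80.0, accuracy=0.74),
    Path("fewer_proposals", wcet_ms=50.0, accuracy=0.65),
]

if __name__ == "__main__":
    for limit in (200.0, 100.0, 40.0):
        chosen = choose_path(PATHS, limit)
        print(limit, chosen.name if chosen else "no feasible path")
```

Filtering on the WCET bound rather than on average latency is what makes the choice safe under a hard time limit: a path is only eligible if its worst case fits. A real system would also need a fallback for the case where no path is feasible.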

Pages: 174-187

Page count: 14

