Pedestrian Detection for Autonomous Cars: Inference Fusion of Deep Neural Networks

Cited by: 8
Authors
Islam, Muhammad Mobaidul [1 ]
Newaz, Abdullah Al Redwan [2 ]
Karimoddini, Ali [1 ]
Affiliations
[1] North Carolina Agr & Tech State Univ, Dept Elect & Comp Engn, Greensboro, NC 27411 USA
[2] Univ New Orleans, Dept Comp Sci, New Orleans, LA 70148 USA
Funding
U.S. National Science Foundation
Keywords
Semantics; Feature extraction; Object detection; Detectors; Runtime; Deep learning; Location awareness; Pedestrian detection; autonomous vehicles; semantic segmentation; fusion; COVARIANCE
DOI
10.1109/TITS.2022.3210186
Chinese Library Classification (CLC) Number
TU [Architectural Science]
Subject Classification Code
0813
Abstract
Network fusion has recently been explored as an approach for improving pedestrian detection performance. However, most existing fusion methods suffer from poor runtime efficiency, modularity, scalability, and maintainability owing to the complex structure of the fully fused models, their end-to-end training requirements, and their sequential fusion process. To address these challenges, this paper proposes a novel fusion framework that combines asymmetric inferences from object detectors and semantic segmentation networks to jointly detect multiple pedestrians. This is achieved by introducing a consensus-based scoring method that fuses pair-wise pixel-relevant information from the object detector and the semantic segmentation network to boost the final confidence scores. Because the object detection and semantic segmentation networks run in parallel within the proposed framework, the fusion incurs only a low runtime overhead. The efficiency and robustness of the proposed framework are extensively evaluated by fusing different state-of-the-art pedestrian detectors and semantic segmentation networks on a public dataset. The generalization of the fused models is also examined on new cross pedestrian data collected with an autonomous car. Results show that the proposed fusion method significantly improves detection performance while maintaining competitive runtime efficiency.
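The abstract describes a score-level, consensus-based fusion of an object detector's box confidences with a semantic segmentation network's per-pixel output. The Python sketch below is only an illustration of that general idea under stated assumptions, not the paper's actual consensus rule: the function name fuse_scores, the probability-map layout, and the weight alpha are hypothetical names introduced here.

# Illustrative sketch (not the paper's exact method): boost a detector's box
# confidences using a segmentation network's per-pixel pedestrian probabilities.
import numpy as np

def fuse_scores(boxes, det_scores, ped_prob_map, alpha=0.5):
    """boxes: (N, 4) array of [x1, y1, x2, y2] pixel coordinates.
    det_scores: (N,) detector confidences in [0, 1].
    ped_prob_map: (H, W) segmentation probability of the pedestrian class.
    alpha: hypothetical weight balancing the two inference sources.
    Returns fused confidences of shape (N,)."""
    fused = np.empty(len(boxes))
    for i, (x1, y1, x2, y2) in enumerate(boxes.astype(int)):
        region = ped_prob_map[max(y1, 0):y2, max(x1, 0):x2]
        # Consensus term: mean pedestrian probability inside the box.
        consensus = float(region.mean()) if region.size else 0.0
        # Boost the detector score when both networks agree that the region
        # contains a pedestrian; a simple convex combination is used here.
        fused[i] = alpha * det_scores[i] + (1.0 - alpha) * consensus
    return fused

# Toy usage: a box overlapping a "pedestrian" blob gets boosted,
# a background box does not.
if __name__ == "__main__":
    prob = np.zeros((100, 100), dtype=np.float32)
    prob[20:60, 30:50] = 0.9
    boxes = np.array([[30, 20, 50, 60], [70, 70, 90, 95]])
    det = np.array([0.55, 0.50])
    print(fuse_scores(boxes, det, prob))

Because only the two networks' outputs are combined and the networks themselves run in parallel, a score-level fusion of this kind adds little runtime overhead and keeps the detector and the segmentation model independently replaceable, which is the modularity argument made in the abstract.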
Pages: 23358-23368
Page count: 11