MDSSD-MobV2: An embedded deconvolutional multispectral pedestrian detection based on SSD-MobileNetV2

Cited by: 2
Authors
Aghaee, Fereshteh [1]
Fazl-Ersi, Ehsan [2]
Noori, Hamid [2]
Affiliations
[1] Ferdowsi Univ Mashhad, Dept Elect Engn, Mashhad, Iran
[2] Ferdowsi Univ Mashhad, Dept Comp Engn, Mashhad, Iran
Keywords
Multispectral network; Embedded systems; Pedestrian detection; Deconvolution; MobileNetV2; CNN
DOI
10.1007/s11042-023-17188-7
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Multispectral pedestrian detection is a promising task in computer vision and deep learning because of its robustness in challenging conditions such as adverse illumination, occlusion, and low-resolution imaging. Both the input data and the network layers can benefit from multispectral fusion. However, because fusing visible and thermal modalities is computationally expensive, it cannot be performed efficiently in many applications in this domain, such as autonomous vehicles, robotics, security, and monitoring systems. Despite this, most recent efforts have treated accuracy as the dominant criterion and paid less attention to speed. As a solution, this paper proposes a fast Multispectral Deconvolutional MobileNetV2-based Single Shot Detector (MDSSD-MobV2) that effectively balances the two crucial goals of accuracy and speed. This lightweight multispectral network is built upon a novel high-resolution MobileNetV2 followed by DSSD auxiliary layers, which fuse visible and thermal feature maps at multiple levels of the network architecture. The research is primarily aimed at developing a framework for a blind-assistance detector that is light enough to run on embedded systems with low memory and processing power while still retaining the required accuracy. Evaluation on the well-known KAIST benchmark shows that the method attains sufficient speed on the Nvidia Jetson TX2 while keeping up with recent solutions, with a miss rate margin of less than 1%, and improves on their best speed by more than 3.5x on the non-embedded GTX 1080Ti platform.
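As a rough illustration of what fusing visible and thermal feature maps at multiple levels can mean, the following sketch fuses two feature pyramids level by level. The abstract does not state the fusion operator; channel concatenation followed by a 1x1 convolution is assumed here as one common choice. The function names, the toy feature maps, and the averaging weights are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: the fusion operator (concatenate + 1x1 conv) and all
# names and weights below are assumptions, not the authors' implementation.

def concat_channels(visible, thermal):
    """Stack two feature maps (lists of HxW channels) along the channel axis."""
    if len(visible[0]) != len(thermal[0]) or len(visible[0][0]) != len(thermal[0][0]):
        raise ValueError("feature maps must share the same spatial size")
    return visible + thermal


def conv1x1(features, weights):
    """Apply a 1x1 convolution.

    Each output channel is a per-pixel weighted sum of the input channels;
    weights[o][c] is the weight from input channel c to output channel o.
    """
    height, width = len(features[0]), len(features[0][0])
    return [
        [[sum(w * features[c][y][x] for c, w in enumerate(row))
          for x in range(width)]
         for y in range(height)]
        for row in weights
    ]


def fuse_levels(visible_pyramid, thermal_pyramid, weights_per_level):
    """Fuse the two modality streams independently at every pyramid level."""
    return [
        conv1x1(concat_channels(v, t), w)
        for v, t, w in zip(visible_pyramid, thermal_pyramid, weights_per_level)
    ]


# Toy input: two levels, one channel per modality, 2x2 and 1x1 maps.
visible = [[[[1.0, 2.0], [3.0, 4.0]]], [[[5.0]]]]
thermal = [[[[0.0, 2.0], [1.0, 0.0]]], [[[1.0]]]]
# Hypothetical weights: one output channel per level that averages both modalities.
weights = [[[0.5, 0.5]], [[0.5, 0.5]]]
fused = fuse_levels(visible, thermal, weights)
```

In a real detector the 1x1 weights would be learned and each fused level would feed the detection heads; the fixed averaging weights here only keep the example easy to check by hand.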
Pages: 43801-43829
Number of pages: 29