Research on Pedestrian Detection Based on Multimodal Information Fusion

Cited by: 0
Authors
Yang, Xiaoping [1, 2]
Li, Zhehong [1, 2]
Liu, Yuan [3]
Huang, Ran [1, 2]
Tan, Kai [1, 2]
Huang, Lin [1, 2]
Affiliations
[1] Guilin Univ Technol, Sch Informat Sci & Engn, Guilin 541004, Guangxi, Peoples R China
[2] Guilin Univ Technol, Guangxi Key Lab Embedded Technol & Intelligent Sy, Guilin 541004, Guangxi, Peoples R China
[3] Guilin Med Univ, Coll Intelligent Med & Biotechnol, Guilin 541004, Guangxi, Peoples R China
Source
INFORMATION TECHNOLOGY AND CONTROL | 2023, Vol. 52, No. 4
Funding
National Natural Science Foundation of China;
Keywords
Multimodal Pedestrian Detection; Faster R-CNN; Generalized Intersection Over Union; Feature Fusion;
DOI
10.5755/j01.itc.52.4.33766
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Autonomous driving systems that rely on a single-modality sensor are susceptible to external environmental conditions when detecting pedestrians. This paper proposes a multimodal pedestrian detection method that fuses visible-light and thermal infrared images. First, 1 x 1 convolution and dilated convolution are introduced into the residual network, and ROIAlign replaces ROIPooling for mapping candidate boxes to the feature layer, in order to optimize Faster R-CNN. Second, the generalized intersection over union (GIoU) loss is used as the loss function for bounding-box localization regression. Finally, to explore how multimodal pedestrian detection performs at different fusion stages in the improved Faster R-CNN, four multimodal neural network structures are designed to fuse visible and thermal infrared images. Experimental results show that the proposed algorithm performs better on the KAIST dataset than current mainstream detection algorithms. Compared with the conventional ACF + T + THOG pedestrian detector, the AP is 8.38 percentage points higher. The miss rate is 5.34 percentage points lower than that of the visible-light pedestrian detector.
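For readers unfamiliar with the GIoU loss named in the abstract, the following is a minimal sketch of the standard GIoU definition for two axis-aligned boxes, not the authors' implementation. The function names `giou` and `giou_loss` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration; the record does not specify how the paper encodes boxes.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2).

    Assumes x1 < x2 and y1 < y2, so both boxes have positive area.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    # Penalize the part of the enclosing box covered by neither input.
    return iou - (enclose - union) / enclose


def giou_loss(pred_box, target_box):
    """Regression loss in [0, 2); it is 0 when the boxes coincide."""
    return 1.0 - giou(pred_box, target_box)
```

Unlike plain IoU, this quantity still changes as two non-overlapping boxes move closer together, which is the usual motivation for using it as a localization regression loss.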
Pages: 1045-1057
Page count: 13