Deep Learning-Based Outdoor Object Detection Using Visible and Near-Infrared Spectrum

Cited by: 6
Authors
Bhowmick, Shubhadeep [1]
Kuiry, Somenath [2]
Das, Alaka [2]
Das, Nibaran [1]
Nasipuri, Mita [1]
Affiliations
[1] Jadavpur Univ, Dept CSE, Kolkata 700032, India
[2] Jadavpur Univ, Dept Math, Kolkata 700032, India
Keywords
Object detection; Multispectral object detection dataset; Near-infrared image; YOLO v3 algorithm; YOLOv4; RECOGNITION
DOI
10.1007/s11042-021-11848-2
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Object detection is one of the essential branches of computer vision. However, detecting objects in natural scenes is challenging for several reasons, for example, the varying sizes of objects, overlap between objects, and similarities in colour and texture across different objects. In many real-life scenarios, the visible spectrum alone is not well suited to standard computer vision tasks. In low-visibility settings, moving outside the visible range, for example to thermal or near-infrared (NIR) imaging, is significantly more beneficial. For the object detection task in this study, we used images from both the RGB and NIR spectra. The purpose of this paper is to examine whether the information offered by the NIR spectrum can be used in conjunction with the visible band for object detection, since together they provide multimodal information. For example, because near-infrared wavelengths are less prone to haze and distortion, some objects that are visually indistinguishable in the RGB spectrum can be spotted in the NIR image. To train such a multispectral object recognition system, we gathered a well-organized dataset of outdoor scenes in three spectra: visible (RGB), near-infrared (NIR), and thermal. For the experiments, we use the YOLOv3 algorithm to train and evaluate our object detection models on NIR and RGB images separately, then train the model with four-channel input (three channels from the RGB images and one channel from the NIR images) and the corresponding annotations to see whether the model's performance in detecting the underlying objects improves further. To determine the effectiveness of our approach, we conducted trials on YOLOv4 and SSD models and compared our results with existing related state-of-the-art models.
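The four-channel input mentioned in the abstract (three RGB channels plus one NIR channel) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes the RGB and NIR images are already spatially registered and share the same resolution, and the function name `stack_rgb_nir` is hypothetical.

```python
import numpy as np


def stack_rgb_nir(rgb: np.ndarray, nir: np.ndarray) -> np.ndarray:
    """Concatenate an (H, W, 3) RGB image and an (H, W) NIR image into (H, W, 4).

    Assumption: both images are pixel-aligned and have identical height and width.
    """
    if rgb.ndim != 3 or rgb.shape[2] != 3:
        raise ValueError("rgb must have shape (H, W, 3)")
    # Accept NIR stored with an explicit single channel, i.e. (H, W, 1).
    if nir.ndim == 3 and nir.shape[2] == 1:
        nir = nir[:, :, 0]
    if nir.shape != rgb.shape[:2]:
        raise ValueError("rgb and nir must share height and width")
    # Append the NIR plane as a fourth channel after R, G, B.
    return np.concatenate([rgb, nir[:, :, None]], axis=2)


# Synthetic placeholder images, used only to show the resulting shape.
rgb = np.zeros((4, 6, 3), dtype=np.uint8)
nir = np.full((4, 6), 255, dtype=np.uint8)
fused = stack_rgb_nir(rgb, nir)
print(fused.shape)
```

A detector that consumes such an array would need its first convolutional layer adapted to four input channels; how the paper handles that step is not stated in the abstract.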
Pages: 9385-9402
Number of pages: 18