Multi-modal object detection using unsupervised transfer learning and adaptation techniques

Cited by: 1
Authors
Abbott, Rachael [1 ]
Robertson, Neil [1 ]
del Rincon, Jesus Martinez [1 ]
Connor, Barry [2 ]
Affiliations
[1] Queens Univ Belfast, Belfast, Antrim, North Ireland
[2] Thales UK, London, England
Source
ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING IN DEFENSE APPLICATIONS | 2019, Vol. 11169
Keywords
Object detection; Transfer learning; Modality adaptation; Thermal imagery; Multi-modal detection;
DOI
10.1117/12.2532794
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep neural networks achieve state-of-the-art performance on object detection tasks with RGB data. However, there are many advantages to detection using multi-modal imagery for defence and security operations. For example, the IR modality offers persistent surveillance and is essential in poor lighting conditions and 24-hour operation. It is, therefore, crucial to create an object detection system which can use IR imagery. Collecting and labelling large volumes of thermal imagery is incredibly expensive and time-consuming. Consequently, we propose to mobilise labelled RGB data to achieve detection in the IR modality. In this paper, we present a method for multi-modal object detection using unsupervised transfer learning and adaptation techniques. We train Faster R-CNN on RGB imagery and test on thermal imagery. The images contain two object classes, people and land vehicles, and represent real-life scenes which include clutter and occlusions. We improve the baseline F1-score by up to 20% by training with an additional loss function, which reduces the difference between RGB and IR feature maps. This work shows that unsupervised modality adaptation is possible and that we have the opportunity to maximise the use of labelled RGB imagery for detection in multiple modalities. The novelty of this work includes the use of IR imagery, modality adaptation from RGB to IR for object detection, and the ability to use real-life imagery in uncontrolled environments. The practical impact of this work for the defence and security community is an increase in performance and a saving of time and money in data collection and annotation.
Pages: 10
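
The abstract notes that the F1-score improvement comes from training with an additional loss that reduces the difference between RGB and IR feature maps. Below is a minimal PyTorch sketch of that feature-alignment idea; the MSE penalty, the toy `SmallBackbone`, the paired RGB/IR input assumption, and the weighting `lambda_align` are illustrative assumptions, not the specific loss or architecture reported in the paper.

```python
# Minimal sketch: add an auxiliary alignment loss between RGB and IR feature
# maps to the usual detection loss. The exact loss used in the paper is not
# specified here; an MSE penalty on backbone features is one plausible choice.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SmallBackbone(nn.Module):
    """Toy convolutional backbone standing in for the Faster R-CNN feature extractor."""

    def __init__(self, in_channels: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.features(x)


def modality_alignment_loss(feat_rgb: torch.Tensor, feat_ir: torch.Tensor) -> torch.Tensor:
    """Penalise the discrepancy between RGB and IR feature maps (MSE assumed)."""
    return F.mse_loss(feat_rgb, feat_ir)


if __name__ == "__main__":
    backbone = SmallBackbone(in_channels=3)

    # Paired RGB and IR frames of the same scene (IR replicated to 3 channels here).
    rgb = torch.randn(2, 3, 128, 128)
    ir = torch.randn(2, 3, 128, 128)

    feat_rgb = backbone(rgb)
    feat_ir = backbone(ir)

    detection_loss = torch.tensor(1.0)  # placeholder for the Faster R-CNN losses
    lambda_align = 0.1                  # weighting of the auxiliary term (assumed)
    total_loss = detection_loss + lambda_align * modality_alignment_loss(feat_rgb, feat_ir)

    total_loss.backward()
    print(f"total loss: {total_loss.item():.4f}")
```

In this sketch the same backbone processes both modalities, so minimising the alignment term pushes it towards modality-invariant features that a detector trained only on labelled RGB data can still exploit on thermal inputs.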