Multi-modal object detection using unsupervised transfer learning and adaptation techniques

Cited by: 1
Authors
Abbott, Rachael [1 ]
Robertson, Neil [1 ]
del Rincon, Jesus Martinez [1 ]
Connor, Barry [2 ]
Affiliations
[1] Queens Univ Belfast, Belfast, Antrim, North Ireland
[2] Thales UK, London, England
Source
ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING IN DEFENSE APPLICATIONS | 2019 / Volume 11169
Keywords
Object detection; Transfer learning; Modality adaption; Thermal imagery; Multi-modal detection;
DOI
10.1117/12.2532794
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep neural networks achieve state-of-the-art performance on object detection tasks with RGB data. However, detection using multi-modal imagery offers many advantages for defence and security operations. For example, the IR modality offers persistent surveillance and is essential in poor lighting conditions and 24-hour operation. It is, therefore, crucial to create an object detection system which can use IR imagery. Collecting and labelling large volumes of thermal imagery is incredibly expensive and time-consuming. Consequently, we propose to mobilise labelled RGB data to achieve detection in the IR modality. In this paper, we present a method for multi-modal object detection using unsupervised transfer learning and adaptation techniques. We train Faster R-CNN on RGB imagery and test on thermal imagery. The images contain two object classes, people and land vehicles, and represent real-life scenes which include clutter and occlusions. We improve the baseline F1-score by up to 20% by training with an additional loss function, which reduces the difference between the RGB and IR feature maps. This work shows that unsupervised modality adaptation is possible, giving us the opportunity to maximise the use of labelled RGB imagery for detection in multiple modalities. The novelty of this work includes the use of IR imagery, modality adaptation from RGB to IR for object detection, and the ability to use real-life imagery in uncontrolled environments. The practical impact of this work for the defence and security community is an increase in performance and the saving of time and money in data collection and annotation.
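To make the adaptation step described in the abstract concrete, the following is a minimal PyTorch sketch, not the authors' code: it assumes torchvision's Faster R-CNN as the detector, paired co-registered RGB/IR frames, a mean-squared-error penalty between backbone feature maps as the "additional loss", and an illustrative weighting factor lambda_align; the exact loss formulation, pairing strategy, and hyper-parameters used in the paper are not specified here.

    import torch
    import torch.nn.functional as F
    from torchvision.models.detection import fasterrcnn_resnet50_fpn

    # torchvision's Faster R-CNN stands in for the RGB-trained detector (an assumption).
    model = fasterrcnn_resnet50_fpn(weights="DEFAULT")
    model.train()

    def alignment_loss(rgb_images, ir_images):
        # MSE between backbone (FPN) feature maps of paired, co-registered RGB and IR
        # frames. Assumes all frames share one resolution and that single-channel IR
        # frames have been replicated to three channels before being passed in.
        rgb_feats = model.backbone(torch.stack(rgb_images))  # OrderedDict of pyramid levels
        ir_feats = model.backbone(torch.stack(ir_images))
        losses = [F.mse_loss(ir_feats[k], rgb_feats[k].detach()) for k in rgb_feats]
        return sum(losses) / len(losses)

    def training_step(rgb_images, rgb_targets, ir_images, lambda_align=0.1):
        # Supervised detection losses come only from the labelled RGB images; the
        # unlabelled IR frames contribute through the alignment term alone.
        det_losses = model(rgb_images, rgb_targets)  # dict of Faster R-CNN loss terms
        return sum(det_losses.values()) + lambda_align * alignment_loss(rgb_images, ir_images)

Detaching the RGB features is an illustrative design choice that pulls the IR features toward the RGB representation rather than letting both representations drift toward each other; whether the paper does the same is not stated in the abstract.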
Pages: 10