Multi-modal object detection using unsupervised transfer learning and adaptation techniques

Cited by: 1
Authors
Abbott, Rachael [1 ]
Robertson, Neil [1 ]
del Rincon, Jesus Martinez [1 ]
Connor, Barry [2 ]
Affiliations
[1] Queens Univ Belfast, Belfast, Antrim, North Ireland
[2] Thales UK, London, England
Source
ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING IN DEFENSE APPLICATIONS | 2019 / Volume 11169
Keywords
Object detection; Transfer learning; Modality adaption; Thermal imagery; Multi-modal detection;
DOI
10.1117/12.2532794
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep neural networks achieve state-of-the-art performance on object detection tasks with RGB data. However, detection using multi-modal imagery offers many advantages for defence and security operations. For example, the IR modality offers persistent surveillance and is essential in poor lighting conditions and 24-hour operation. It is, therefore, crucial to create an object detection system which can use IR imagery. Collecting and labelling large volumes of thermal imagery is incredibly expensive and time-consuming. Consequently, we propose to mobilise labelled RGB data to achieve detection in the IR modality. In this paper, we present a method for multi-modal object detection using unsupervised transfer learning and adaptation techniques. We train Faster R-CNN on RGB imagery and test on thermal imagery. The images contain two object classes, people and land vehicles, and represent real-life scenes which include clutter and occlusions. We improve the baseline F1-score by up to 20% by training with an additional loss function, which reduces the difference between the RGB and IR feature maps. This work shows that unsupervised modality adaptation is possible, giving us the opportunity to maximise the use of labelled RGB imagery for detection in multiple modalities. The novelty of this work includes the use of IR imagery, modality adaptation from RGB to IR for object detection, and the ability to use real-life imagery in uncontrolled environments. The practical impact of this work for the defence and security community is an increase in performance and the saving of time and money in data collection and annotation.
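To make the adaptation step described in the abstract concrete, the following is a minimal PyTorch sketch, not the authors' code: it assumes torchvision's Faster R-CNN as the detector, paired co-registered RGB/IR frames, a mean-squared-error penalty between backbone feature maps as the "additional loss", and an illustrative weighting factor lambda_align; the exact loss formulation, pairing strategy, and hyper-parameters used in the paper are not specified here.

    import torch
    import torch.nn.functional as F
    from torchvision.models.detection import fasterrcnn_resnet50_fpn

    # torchvision's Faster R-CNN stands in for the RGB-trained detector (an assumption).
    model = fasterrcnn_resnet50_fpn(weights="DEFAULT")
    model.train()

    def alignment_loss(rgb_images, ir_images):
        # MSE between backbone (FPN) feature maps of paired, co-registered RGB and IR
        # frames. Assumes all frames share one resolution and that single-channel IR
        # frames have been replicated to three channels before being passed in.
        rgb_feats = model.backbone(torch.stack(rgb_images))  # OrderedDict of pyramid levels
        ir_feats = model.backbone(torch.stack(ir_images))
        losses = [F.mse_loss(ir_feats[k], rgb_feats[k].detach()) for k in rgb_feats]
        return sum(losses) / len(losses)

    def training_step(rgb_images, rgb_targets, ir_images, lambda_align=0.1):
        # Supervised detection losses come only from the labelled RGB images; the
        # unlabelled IR frames contribute through the alignment term alone.
        det_losses = model(rgb_images, rgb_targets)  # dict of Faster R-CNN loss terms
        return sum(det_losses.values()) + lambda_align * alignment_loss(rgb_images, ir_images)

Detaching the RGB features is an illustrative design choice that pulls the IR features toward the RGB representation rather than letting both representations drift toward each other; whether the paper does the same is not stated in the abstract.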
Pages: 10