Multi-modal object detection using unsupervised transfer learning and adaptation techniques

Cited by: 1
Authors
Abbott, Rachael [1 ]
Robertson, Neil [1 ]
del Rincon, Jesus Martinez [1 ]
Connor, Barry [2 ]
Affiliations
[1] Queens Univ Belfast, Belfast, Antrim, North Ireland
[2] Thales UK, London, England
Source
ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING IN DEFENSE APPLICATIONS | 2019, Vol. 11169
Keywords
Object detection; Transfer learning; Modality adaptation; Thermal imagery; Multi-modal detection;
DOI
10.1117/12.2532794
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep neural networks achieve state-of-the-art performance on object detection tasks with RGB data. However, there are many advantages to detection using multi-modal imagery for defence and security operations. For example, the IR modality offers persistent surveillance and is essential in poor lighting conditions and 24-hour operation. It is, therefore, crucial to create an object detection system which can use IR imagery. Collecting and labelling large volumes of thermal imagery is incredibly expensive and time-consuming. Consequently, we propose to mobilise labelled RGB data to achieve detection in the IR modality. In this paper, we present a method for multi-modal object detection using unsupervised transfer learning and adaptation techniques. We train Faster R-CNN on RGB imagery and test on thermal imagery. The images contain two object classes, people and land vehicles, and represent real-life scenes which include clutter and occlusions. We improve the baseline F1-score by up to 20% by training with an additional loss function, which reduces the difference between RGB and IR feature maps. This work shows that unsupervised modality adaptation is possible and that we have the opportunity to maximise the use of labelled RGB imagery for detection in multiple modalities. The novelty of this work includes the use of IR imagery, modality adaptation from RGB to IR for object detection, and the ability to use real-life imagery in uncontrolled environments. The practical impact of this work for the defence and security community is an increase in performance and a saving of time and money in data collection and annotation.
Pages: 10
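
The abstract notes that the F1-score improvement comes from training with an additional loss that reduces the difference between RGB and IR feature maps. Below is a minimal PyTorch sketch of that feature-alignment idea; the MSE penalty, the toy `SmallBackbone`, the paired RGB/IR input assumption, and the weighting `lambda_align` are illustrative assumptions, not the specific loss or architecture reported in the paper.

```python
# Minimal sketch: add an auxiliary alignment loss between RGB and IR feature
# maps to the usual detection loss. The exact loss used in the paper is not
# specified here; an MSE penalty on backbone features is one plausible choice.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SmallBackbone(nn.Module):
    """Toy convolutional backbone standing in for the Faster R-CNN feature extractor."""

    def __init__(self, in_channels: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.features(x)


def modality_alignment_loss(feat_rgb: torch.Tensor, feat_ir: torch.Tensor) -> torch.Tensor:
    """Penalise the discrepancy between RGB and IR feature maps (MSE assumed)."""
    return F.mse_loss(feat_rgb, feat_ir)


if __name__ == "__main__":
    backbone = SmallBackbone(in_channels=3)

    # Paired RGB and IR frames of the same scene (IR replicated to 3 channels here).
    rgb = torch.randn(2, 3, 128, 128)
    ir = torch.randn(2, 3, 128, 128)

    feat_rgb = backbone(rgb)
    feat_ir = backbone(ir)

    detection_loss = torch.tensor(1.0)  # placeholder for the Faster R-CNN losses
    lambda_align = 0.1                  # weighting of the auxiliary term (assumed)
    total_loss = detection_loss + lambda_align * modality_alignment_loss(feat_rgb, feat_ir)

    total_loss.backward()
    print(f"total loss: {total_loss.item():.4f}")
```

In this sketch the same backbone processes both modalities, so minimising the alignment term pushes it towards modality-invariant features that a detector trained only on labelled RGB data can still exploit on thermal inputs.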