Self-supervised domain feature mining for underwater domain generalization object detection

Citations: 0
Authors
Chen, Haojie [1]
Wang, Zhuo [1]
Qin, Hongde [1]
Mu, Xiaokai [1, 2]
Affiliations
[1] Harbin Engn Univ, Coll Shipbldg Engn, Harbin 150001, Heilongjiang, Peoples R China
[2] Harbin Engn Univ, Qingdao Innovat & Dev Ctr, Qingdao 266000, Shandong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Underwater object detection; Domain generalization; Self-supervised learning; Deep learning;
DOI
10.1016/j.eswa.2024.126023
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In practical underwater object detection, the detection network is generally trained on known source domains and then deployed in field scenes from an unknown domain, which can be regarded as a domain generalization object detection task. However, owing to the disparities between source-domain and unknown-domain data, the network often suffers from domain shift when transferred to the unknown domain, which compromises its generalization capability. To address this challenge, this paper proposes a general framework for underwater domain generalization object detection, termed SPIR-DFM. SPIR-DFM extracts domain-invariant features and task-relevant domain-specific features through deep feature mining, thereby enhancing the generalization capability of the underwater object detection network in the target domain. First, the Instance Normalization and Domain Feature Supplement (INDFS) module is designed. Within INDFS, the Instance Normalization (IN) module mitigates style variations across domains through instance-level normalization, enabling the extraction of domain-invariant features from the source domain, while the Domain Feature Supplement (DFS) module uses feature decomposition to explicitly capture the task-related domain-specific features lost during normalization, increasing the discriminative power of the model. Additionally, a Domain Information Reconstruction (DIR) decoder is designed to reconstruct the principal information of the source domain from the multi-scale features extracted by the object detector. The Principal Information Reconstruction (PIR) loss between the reconstructed and the original source-domain information serves as a weak regularization term, guiding deep feature mining in a self-supervised manner.
Extensive comparison and visualization experiments on the underwater domain generalization dataset S-UODAC2020 demonstrate the effectiveness of the method. Code will be available at https://github.com/donggaomu/SPIR-DFM.
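The decomposition idea behind IN and DFS can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how DFS selects task-related features, so the per-channel `gates` below are a hypothetical stand-in, and the function names `instance_norm`, `supplement`, and `reconstruction_loss` are assumptions made for illustration. The sketch normalizes each channel of a single sample, treats the difference between the input and its normalized version as the domain-specific residual, adds a gated share of that residual back, and uses a plain mean squared error as a stand-in for a reconstruction-style loss.

```python
import math


def instance_norm(channels, eps=1e-5):
    """Normalize each channel of one sample to zero mean and (near) unit variance."""
    normalized = []
    for values in channels:
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = math.sqrt(var + eps)
        normalized.append([(v - mean) / scale for v in values])
    return normalized


def supplement(channels, gates, eps=1e-5):
    """Add back a gated share of what normalization removed.

    `gates` is a hypothetical per-channel weight in [0, 1]; in a real model it
    would be learned. The abstract does not describe the actual DFS mechanism.
    """
    normalized = instance_norm(channels, eps)
    output = []
    for x, x_norm, gate in zip(channels, normalized, gates):
        # Residual: the part of the input discarded by normalization.
        residual = [a - b for a, b in zip(x, x_norm)]
        output.append([n + gate * r for n, r in zip(x_norm, residual)])
    return output


def reconstruction_loss(reconstructed, target):
    """Mean squared error over all channels (a stand-in for a PIR-style loss)."""
    diffs = [
        (r - t) ** 2
        for rec_ch, tgt_ch in zip(reconstructed, target)
        for r, t in zip(rec_ch, tgt_ch)
    ]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    # Two channels with the same pattern but different scale ("style").
    features = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]]
    print(instance_norm(features))           # both channels become nearly identical
    print(supplement(features, [0.5, 0.5]))  # half of the residual is restored
```

With all gates set to 0 the output is the purely normalized feature, and with all gates set to 1 the original input is recovered, so the gate controls how much domain-specific information is supplemented.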
Pages: 11