FBDPN: CNN-Transformer hybrid feature boosting and differential pyramid network for underwater object detection

Cited by: 2
Authors
Ji, Xun [1]
Chen, Shijie [1]
Hao, Li-Ying [1]
Zhou, Jingchun [2]
Chen, Long [1]
Affiliations
[1] Dalian Maritime Univ, Sch Marine Elect Engn, Dalian 116026, Peoples R China
[2] Dalian Maritime Univ, Sch Informat Sci & Technol, Dalian 116026, Peoples R China
Keywords
Underwater object detection; Feature pyramid network; Convolutional neural network; Vision transformer;
DOI
10.1016/j.eswa.2024.124978
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Despite recent advances in underwater object detection (UOD) from optical underwater images, the task remains challenging because of the chaotic underwater environment and the substantial variations in object scale and contour. Existing deep learning-based schemes generally overlook the enhancement and refinement between multi-scale features of densely distributed underwater objects, which leads to inaccurate localization and classification predictions with excessive information redundancy. To tackle these issues, this article presents a novel feature boosting and differential pyramid network (FBDPN) for precise and efficient UOD. The salient properties of our paper are: (1) A heuristic feature pyramid network (FPN)-inspired architecture is constructed, which employs a convolutional neural network (CNN)-Transformer hybrid strategy to simultaneously facilitate the learning of multi-scale features and the capture of long-distance dependencies among pixels. (2) A neighborhood-scale feature boosting module (NSFBM) is developed to enhance contextual information between features of neighborhood scales. (3) A cross-scale feature differential module (CSFDM) is further designed to effectively reduce information redundancy between features of different scales. Extensive experiments show that the proposed FBDPN can outperform other state-of-the-art methods in both UOD performance and computational complexity. In addition, ablation studies are performed to demonstrate the effectiveness of each component in FBDPN. The source code is available at https://github.com/jixun-dmu/FBDPN.
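For readers unfamiliar with the feature pyramid network (FPN) idea that the abstract builds on, the following is a minimal sketch of a generic FPN-style top-down pathway, in which each coarser feature map is upsampled and added to the next finer one. It is not the FBDPN architecture: the NSFBM and CSFDM modules are not specified in this record, and the lateral 1x1 convolutions of a real FPN are omitted. The names `upsample2x` and `top_down_merge` and the single-channel list-of-lists feature maps are assumptions made for illustration only.

```python
from typing import List

Grid = List[List[float]]  # one single-channel feature map (illustrative assumption)


def upsample2x(fmap: Grid) -> Grid:
    """Nearest-neighbour upsampling by a factor of 2 in both dimensions."""
    out: Grid = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat the row vertically
    return out


def top_down_merge(levels: List[Grid]) -> List[Grid]:
    """Generic FPN-style top-down pathway (hypothetical helper).

    `levels` is ordered fine to coarse; each level is assumed to be exactly
    half the height and width of the previous one.
    """
    merged = [levels[-1]]                          # the coarsest level passes through
    for fine in reversed(levels[:-1]):
        up = upsample2x(merged[0])                 # bring the coarser map to this size
        fused = [[a + b for a, b in zip(row_f, row_u)]
                 for row_f, row_u in zip(fine, up)]
        merged.insert(0, fused)                    # keep the fine-to-coarse ordering
    return merged


if __name__ == "__main__":
    c3 = [[1.0] * 4 for _ in range(4)]             # 4x4 fine-scale map
    c4 = [[2.0] * 2 for _ in range(2)]             # 2x2 mid-scale map
    c5 = [[3.0]]                                   # 1x1 coarse-scale map
    p3, p4, p5 = top_down_merge([c3, c4, c5])
    print(p5, p4[0], p3[0])
```

In this toy example every value in the merged 2x2 map is 2.0 + 3.0 and every value in the merged 4x4 map is 1.0 + 5.0, which shows how coarse-scale information propagates to finer scales. The released FBDPN code at the URL given in the abstract is the authoritative source for the actual method.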
Pages: 13
Related papers
50 in total
  • [41] SFPN: Semantic Feature Pyramid Network for Object Detection
    Gan, Yi
    Xu, Wei
    Su, Jianbo
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 795 - 802
  • [42] Bidirectional Matrix Feature Pyramid Network for Object Detection
    Xu, Wei
    Gan, Yi
    Su, Jianbo
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 8000 - 8007
  • [43] Bidirectional Parallel Feature Pyramid Network for Object Detection
    Zhang, Zhengning
    Zhang, Lin
    Wang, Yue
    Feng, Pengming
    Sun, Baochen
    IEEE ACCESS, 2022, 10 : 49422 - 49432
  • [44] Attentional feature pyramid network for small object detection
    Min, Kyungseo
    Lee, Gun-Hee
    Lee, Seong-Whan
    NEURAL NETWORKS, 2022, 155 : 439 - 450
  • [45] Adaptively Dense Feature Pyramid Network for Object Detection
    Pan, Haodong
    Chen, Guangfeng
    Jiang, Jue
    IEEE ACCESS, 2019, 7 : 81132 - 81144
  • [46] Feature Pyramid and Hierarchical Boosting Network for Pavement Crack Detection
    Yang, Fan
    Zhang, Lei
    Yu, Sijia
    Prokhorov, Danil
    Mei, Xue
    Ling, Haibin
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2020, 21 (04) : 1525 - 1535
  • [47] Extended Feature Pyramid Network for Small Object Detection
    Deng, Chunfang
    Wang, Mengmeng
    Liu, Liang
    Liu, Yong
    Jiang, Yunliang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 1968 - 1979
  • [48] GraphFPN: Graph Feature Pyramid Network for Object Detection
    Zhao, Gangming
    Ge, Weifeng
    Yu, Yizhou
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 2743 - 2752
  • [49] HYPER FEATURE FUSION PYRAMID NETWORK FOR OBJECT DETECTION
    Huang, Shouzhi
    Li, Xiaoyu
    Jiang, Zhuqing
    Guo, Xiaoqiang
    Men, Aidong
    2018 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW 2018), 2018,
  • [50] CTHPose: An Efficient and Effective CNN-Transformer Hybrid Network for Human Pose Estimation
    Chen, Danya
    Wu, Lijun
    Chen, Zhicong
    Lin, Xufeng
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT V, 2024, 14429 : 327 - 339