Fine-grained ship image classification and detection based on a vision transformer and multi-grain feature vector FPN model

Cited by: 2
Authors
Wang, Fengxiang [1]
Yu, Deying [2]
Huang, Liang [3]
Zhang, Yalun [4]
Chen, Yongbing [2]
Wang, Zhiguo [5]
Affiliations
[1] Natl Univ Def Technol, State Key Lab High Performance Comp, Changsha, Peoples R China
[2] Naval Univ Engn, Sch Elect Engn, Wuhan, Peoples R China
[3] Naval Univ Engn, Coll Elect Engn, Wuhan, Peoples R China
[4] Peoples Liberat Army Naval Command Coll, Combat Command Dept, Nanjing, Peoples R China
[5] Naval Univ Engn, Dept Operat Res & Planning, Wuhan, Peoples R China
Source
GEO-SPATIAL INFORMATION SCIENCE | 2024
Funding
National Natural Science Foundation of China
Keywords
Deep learning; image classification; ship detection; remote-sensing images; transformer; network
DOI
10.1080/10095020.2024.2331552
Chinese Library Classification (CLC) number
TP7 [Remote sensing technology]
Discipline classification codes
081102; 0816; 081602; 083002; 1404
Abstract
In naval and civilian domains, meticulous ship classification and detection are paramount. Nevertheless, predominant research has gravitated toward leveraging Convolutional Neural Network (CNN)-centered methodologies, often overlooking the diverse granularity inherent in ship samples. In our pursuit to holistically extract features from ship images across varying granularities, we present a transformative architecture: the Vision Transformer and Multi-Grain Feature Vector Feature Pyramid Network (ViT-MGFV-FPN). This model synergistically melds the merits of MGFV-FPN with an augmented Vision Transformer (ViT) for a comprehensive image feature extraction. To cater to the extraction of broader image features whilst sidestepping the innate quadratic complexity of traditional ViT, we unveil an enhanced version christened the Global Swin Transformer. Concurrently, the MGFV-FPN is orchestrated to harness the prowess of CNNs in distilling intricate ship attributes. Rigorous empirical evaluations underscore our model's superiority in juxtaposition with extant CNN and transformer-based paradigms for nuanced ship categorization.
Pages: 22
Related papers
50 in total
  • [31] TransIFC: Invariant Cues-Aware Feature Concentration Learning for Efficient Fine-Grained Bird Image Classification
    Liu, Hai
    Zhang, Cheng
    Deng, Yongjian
    Xie, Bochen
    Liu, Tingting
    Li, You-Fu
    IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 1677 - 1690
  • [32] Fine-Grained Image Classification Based on Multi-Modal Features and Enhanced Alignment
    Han, Jing
    Zhang, Tianpeng
    Lyu, Xueqiang
    Beijing Youdian Daxue Xuebao/Journal of Beijing University of Posts and Telecommunications, 2024, 47 (04) : 130 - 135
  • [33] Multitask Fine-Grained Feature Mining for Multilabel Remote Sensing Image Classification
    Guo, Jie
    Sun, Hao
    Han, Jinheng
    Song, Bin
    Chi, Yuhao
    Song, Bingxi
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62
  • [34] Multilayer feature fusion with parallel convolutional block for fine-grained image classification
    Wang, Lei
    He, Kai
    Feng, Xu
    Ma, Xitao
    APPLIED INTELLIGENCE, 2022, 52 (03) : 2872 - 2883
  • [35] Efficient multi-granularity network for fine-grained image classification
    Jiabao Wang
    Yang Li
    Hang Li
    Xun Zhao
    Rui Zhang
    Zhuang Miao
    Journal of Real-Time Image Processing, 2022, 19 : 853 - 866
  • [36] Pixel Saliency Based Encoding for Fine-Grained Image Classification
    Yin, Chao
    Zhang, Lei
    Liu, Ji
    PATTERN RECOGNITION AND COMPUTER VISION (PRCV 2018), PT I, 2018, 11256 : 274 - 285
  • [37] Multilayer feature fusion with parallel convolutional block for fine-grained image classification
    Lei Wang
    Kai He
    Xu Feng
    Xitao Ma
    Applied Intelligence, 2022, 52 : 2872 - 2883
  • [38] A fine-grained image classification method based on information interaction
    Zhu, Shuo
    Zhang, Xukang
    Wang, Yu
    Wang, Zongyang
    Sun, Jiahao
    IET IMAGE PROCESSING, 2024, 18 (14) : 4852 - 4861
  • [39] Efficient multi-granularity network for fine-grained image classification
    Wang, Jiabao
    Li, Yang
    Li, Hang
    Zhao, Xun
    Zhang, Rui
    Miao, Zhuang
    JOURNAL OF REAL-TIME IMAGE PROCESSING, 2022, 19 (05) : 853 - 866
  • [40] Multi-scale attention-based adaptive feature fusion network for fine-grained ship classification in remote sensing scenarios
    Liu, Kun
    Zhang, Xiaomeng
    Xu, Zhijing
    Liu, Sidong
    JOURNAL OF APPLIED REMOTE SENSING, 2024, 18 (03)