Multidomain feature fusion method for small object classification: MDFF

Cited by: 0
Authors
Hu, Jing [1]
Shi, Zican [2]
Zhang, Zheng [2]
Lv, Siqi [2]
Chen, Yifan [2]
Ouyang, Yan [3]
He, Jia [4]
Affiliations
[1] Natl Key Lab Sci & Technol Multi Spectral Informa, Wuhan, Peoples R China
[2] Huazhong Univ Sci & Technol, Wuhan, Peoples R China
[3] Air Force Early Warning Acad, Wuhan, Peoples R China
[4] Chinese Peoples Liberat Army, Troops 95841, Peoples R China
Keywords
ConvMixer; vision transformer; small object classification; frequency domain
DOI
10.1117/1.JEI.32.4.043009
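Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Subject classification codes
0808; 0809
Abstract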
Classifying small objects remains challenging for current deep learning classification models, such as convolutional neural networks (CNNs) and vision transformers (ViTs). We believe that these algorithms are not designed specifically for small targets, so their feature extraction abilities for small targets are insufficient. To improve the classification capabilities of CNN-based and ViT-based models for small objects, two multidomain feature fusion (MDFF) frameworks, MDFF-ConvMixer and MDFF-ViT, are proposed to increase the amount of feature information derived from images. Compared with the base models, the added design comprises frequency domain feature extraction and MDFF processes. In the frequency domain feature extraction part, the input image is first transformed into the frequency domain through the discrete cosine transform (DCT), and then a three-dimensional matrix containing the frequency domain information is obtained via channel splicing and reshaping. In the MDFF part, MDFF-ConvMixer splices the spatial and frequency domain features by channel, whereas MDFF-ViT uses a cross-attention mechanism to fuse the spatial and frequency domain features. On small target classification tasks, these two frameworks clearly improve the underlying classification algorithms. On the DOTA dataset and the CIFAR10 dataset with two downsampling operations, the accuracy of MDFF-ConvMixer relative to ConvMixer increases from 87.82% and 62.14% to 90.14% and 66.00%, respectively, and the accuracy of MDFF-ViT relative to ViT increases from 79.22% and 36.2% to 88.15% and 59.23%, respectively. © The Authors. Published by SPIE under a Creative Commons Attribution 4.0 International License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.
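The pipeline outlined in the abstract (DCT, channel splicing and reshaping, then fusion by channel concatenation or cross-attention) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not state the block size, the DCT variant, or the attention details, so the 8x8 non-overlapping blocks, the orthonormal DCT-II, the projection-free single-head attention, the choice of spatial tokens as queries, and all function names below are assumptions made for illustration only.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis matrix of size n x n (assumed DCT variant)."""
    k = np.arange(n)[:, None]  # frequency index
    i = np.arange(n)[None, :]  # sample index
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def blockwise_dct_features(image: np.ndarray, block: int = 8) -> np.ndarray:
    """Map an (H, W, C) image to an (H/block, W/block, C*block*block) array.

    Each non-overlapping block is transformed with a 2-D DCT, and its
    coefficients are stacked along the channel axis (hypothetical layout).
    """
    h, w, c = image.shape
    assert h % block == 0 and w % block == 0, "H and W must be multiples of block"
    d = dct_matrix(block)
    # Split into blocks: axes are (block row, row in block, block col, col in block, channel).
    x = image.reshape(h // block, block, w // block, block, c)
    # 2-D DCT of every block, i.e. D @ X @ D.T over the two in-block axes.
    x = np.einsum("ui,aibjc,vj->aubvc", d, x, d)
    # Move the coefficient axes next to the channel axis, then flatten them into channels.
    x = x.transpose(0, 2, 4, 1, 3)
    return x.reshape(h // block, w // block, c * block * block)

def fuse_by_channel(spatial: np.ndarray, freq: np.ndarray) -> np.ndarray:
    """Concatenate spatial and frequency feature maps along the channel axis."""
    return np.concatenate([spatial, freq], axis=-1)

def cross_attention(queries: np.ndarray, context: np.ndarray) -> np.ndarray:
    """Single-head scaled dot-product cross-attention without learned projections."""
    scores = queries @ context.T / np.sqrt(queries.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ context

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))           # stand-in for a small input image
freq = blockwise_dct_features(image)      # shape (4, 4, 192)

spatial = rng.random((4, 4, 16))          # stand-in for a spatial feature map
fused = fuse_by_channel(spatial, freq)    # shape (4, 4, 208)

spatial_tokens = rng.random((16, 192))    # stand-in for spatial tokens
freq_tokens = freq.reshape(-1, freq.shape[-1])
attended = cross_attention(spatial_tokens, freq_tokens)  # shape (16, 192)
```

In this sketch the concatenation path corresponds to the channel-wise splicing attributed to MDFF-ConvMixer, and the attention path to the cross-attention fusion attributed to MDFF-ViT. A real model would add learned projections and feed the fused features into the backbone.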
Pages: 18
Related papers
50 in total
  • [1] MFFOD: multidomain feature fusion object detector for infrared images
    Xu, Wenxiao
    Yin, Qiyuan
    Xu, Cheng
    Zhao, Zhe
    Li, Yao
    Huang, Daqing
    MEASUREMENT SCIENCE AND TECHNOLOGY, 2025, 36 (03)
  • [2] A feature level fusion approach for object classification
    Wender, Stefan
    Dietmayer, Klaus C. J.
    2007 IEEE INTELLIGENT VEHICLES SYMPOSIUM, VOLS 1-3, 2007: 650 - 655
  • [3] MDFF-Net: A multi-dimensional feature fusion network for breast histopathology image classification
    Xu, Cheng
    Yi, Ke
    Jiang, Nan
    Li, Xiong
    Zhong, Meiling
    Zhang, Yuejin
    COMPUTERS IN BIOLOGY AND MEDICINE, 2023, 165
  • [4] MDFF: Multi-Domain feature fusion for anomaly recognition
    Xu, Yuan
    Ye, Cheng-Shu
    Niu, Hai-Ming
    Chen, Si-Yuan
    Luo, Yi
    Zhu, Qun-Xiong
    He, Yan-Lin
    Zhang, Yang
    Zhang, Ming-Qing
    ADVANCED ENGINEERING INFORMATICS, 2025, 64
  • [5] Adaptive Feature Fusion for Small Object Detection
    Zhang, Qi
    Zhang, Hongying
    Lu, Xiuwen
    APPLIED SCIENCES-BASEL, 2022, 12 (22)
  • [6] Feature Fusion-Based Data Augmentation Method for Small Object Detection
    Wang, Xin
    Zhang, Hongyan
    Liu, Qianhe
    Gong, Wei
    IEEE MULTIMEDIA, 2024, 31 (03) : 65 - 77
  • [7] Visual-tactile fusion object classification method based on adaptive feature weighting
    Zhang, Peng
    Bai, Lu
    Shan, Dongri
    Wang, Xiaofang
    Li, Shuang
    Zou, Wenkai
    Chen, Zhenxue
    INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS, 2023, 20 (04)
  • [8] MARGINALIZED KERNEL-BASED FEATURE FUSION METHOD FOR VHR OBJECT CLASSIFICATION
    Liu, Chuntian
    Wei, Wei
    Bai, Xiao
    Zhou, Jun
    2013 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS), 2013: 216 - 219
  • [9] Small Object Detection Method Based on Weighted Feature Fusion and CSMA Attention Module
    Peng, Chao
    Zhu, Meng
    Ren, Honge
    Emam, Mahmoud
    ELECTRONICS, 2022, 11 (16)
  • [10] Fast Small Object Detection Method with Scale-Sensitivity Loss and Feature Fusion
    Ju C.-R.
    Qin X.-Y.
    Yuan G.-L.
    Li H.
    Zhu H.
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2022, 50 (09): 2119 - 2126