Multidomain feature fusion method for small object classification: MDFF

Cited by: 0

Authors:

Hu, Jing [1]

Shi, Zican [2]

Zhang, Zheng [2]

Lv, Siqi [2]

Chen, Yifan [2]

Ouyang, Yan [3]

He, Jia [4]

Affiliations:

[1] Natl Key Lab Sci & Technol Multi Spectral Informa, Wuhan, Peoples R China

[2] Huazhong Univ Sci & Technol, Wuhan, Peoples R China

[3] Air Force Early Warning Acad, Wuhan, Peoples R China

[4] Chinese Peoples Liberat Army, Troops 95841, Peoples R China

Source:

JOURNAL OF ELECTRONIC IMAGING | 2023, Vol. 32, No. 04

Keywords:

ConvMixer; vision transformer; small object classification; frequency domain

DOI:

10.1117/1.JEI.32.4.043009

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification codes:

0808; 0809

Abstract:

Classifying small objects remains challenging for current deep learning classification models [such as convolutional neural networks (CNNs) and vision transformers (ViTs)]. We believe that these algorithms are not designed specifically for small targets, so their feature extraction abilities for small targets are insufficient. To improve the classification capabilities of CNN-based and ViT-based classification models for small objects, two multidomain feature fusion (MDFF) frameworks, called MDFF-ConvMixer and MDFF-ViT, are proposed to increase the amount of feature information derived from images. Compared with the base models, the added design comprises frequency domain feature extraction and MDFF processes. In the frequency domain feature extraction part, the input image is first transformed into the frequency domain through the discrete cosine transform (DCT), and then a three-dimensional matrix containing the frequency domain information is obtained via channel splicing and reshaping. In the MDFF part, MDFF-ConvMixer splices the spatial and frequency domain features by channel, whereas MDFF-ViT uses a cross-attention mechanism to fuse the spatial and frequency domain features. On small target classification tasks, both frameworks improve on the underlying classification algorithm. On the DOTA dataset and the CIFAR10 dataset with two downsampling operations, the accuracy of MDFF-ConvMixer relative to ConvMixer increases from 87.82% and 62.14% to 90.14% and 66.00%, respectively, and the accuracy of MDFF-ViT relative to the ViT increases from 79.22% and 36.2% to 88.15% and 59.23%, respectively. © The Authors. Published by SPIE under a Creative Commons Attribution 4.0 International License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.
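
The pipeline outlined in the abstract (DCT, channel splicing and reshaping into a three-dimensional matrix, then fusion by channel concatenation or by cross-attention) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not give the details, so the 8x8 block size, the function names `dct_matrix`, `blockwise_dct_features`, `fuse_by_channel` and `cross_attention`, the single attention head, the weight matrices, and the choice of spatial tokens as queries are all assumptions made for illustration.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)     # DC row uses a different scale
    return mat


def blockwise_dct_features(image: np.ndarray, block: int = 8) -> np.ndarray:
    """Map an (H, W, C) image to an (H/block, W/block, C*block*block) array.

    Assumption: a JPEG-style block-wise DCT. Each block's coefficients
    are spliced along the channel axis, giving a 3-D frequency matrix.
    """
    h, w, c = image.shape
    assert h % block == 0 and w % block == 0, "H and W must be multiples of block"
    d = dct_matrix(block)
    # Split into non-overlapping blocks: (hb, block, wb, block, c).
    x = image.astype(np.float64).reshape(h // block, block, w // block, block, c)
    # 2-D DCT of every block: coeffs[i, u, j, v, c] = sum_ab d[u,a] x[i,a,j,b,c] d[v,b].
    coeffs = np.einsum("ua,iajbc,vb->iujvc", d, x, d)
    # Move the per-block frequency indices (u, v) next to the channel axis.
    coeffs = coeffs.transpose(0, 2, 4, 1, 3)  # (hb, wb, c, u, v)
    return coeffs.reshape(h // block, w // block, c * block * block)


def fuse_by_channel(spatial: np.ndarray, freq: np.ndarray) -> np.ndarray:
    """Channel-wise concatenation of two maps with equal height and width."""
    assert spatial.shape[:2] == freq.shape[:2]
    return np.concatenate([spatial, freq], axis=-1)


def cross_attention(spatial_tokens, freq_tokens, wq, wk, wv):
    """Single-head cross-attention: spatial queries, frequency keys/values.

    Which domain supplies the queries is an assumption of this sketch.
    """
    q = spatial_tokens @ wq  # (Ns, d)
    k = freq_tokens @ wk     # (Nf, d)
    v = freq_tokens @ wv     # (Nf, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v       # (Ns, d)
```

For a 32x32 RGB input, `blockwise_dct_features` returns a 4x4x192 array under these assumptions. Because the transform is orthonormal, the array has the same total energy as the input image.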

Pages: 18

