Structure-Preserved Self-Attention for Fusion Image Information in Multiple Color Spaces

Cited by: 0
Authors
He, Zhu [1 ,2 ]
Lin, Mingwei [1 ,2 ]
Luo, Xin [3 ]
Xu, Zeshui [4 ]
Affiliations
[1] Fujian Normal Univ, Coll Comp & Cyber Secur, Fuzhou 350117, Peoples R China
[2] Fujian Normal Univ, Fujian Prov Engn Res Ctr Publ Serv Big Data Min &, Fuzhou 350117, Peoples R China
[3] Southwest Univ, Coll Comp & Informat Sci, Chongqing 400715, Peoples R China
[4] Sichuan Univ, Business Sch, Chengdu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image color analysis; Computational modeling; Image recognition; Feature extraction; Accuracy; Adaptation models; Convolutional neural networks; Computer architecture; Image segmentation; Image classification; Channel shuffle; color space; group convolution (GConv); image classification; self-attention mechanism; NEURAL-NETWORKS; MODEL; HSV;
DOI
10.1109/TNNLS.2024.3490800
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
The selection and utilization of different color spaces significantly affect the recognition performance of deep learning models on downstream tasks. Existing studies typically exploit image information from multiple color spaces through model ensembling or channel concatenation; however, these approaches inflate model size and make suboptimal use of the image information. In this study, we propose the structure-preserved self-attention network (SPSANet) for efficiently fusing image information from different color spaces. The model incorporates a novel structure-preserved self-attention (SPSA) module that employs a single-head pixel-wise attention mechanism rather than the conventional multihead self-attention (MHSA). Specifically, feature maps from all color-space grouping paths are used for similarity matching, enabling the model to attend to critical pixel locations across color spaces. This design reduces the SPSANet model's sensitivity to the choice of color space while amplifying the benefits of integrating multiple color spaces. SPSANet also applies channel shuffle operations to allow limited interaction between the information flows of different color-space paths. Experimental results demonstrate that SPSANet, using eight common color spaces (RGB, Luv, XYZ, Lab, HSV, YCrCb, YUV, and HLS), achieves superior recognition performance with fewer parameters and lower computational cost.
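The abstract names three mechanisms that a short sketch can make concrete: stacking the eight listed color spaces, channel shuffle across the color-space group paths, and single-head pixel-wise self-attention. The following PyTorch/OpenCV sketch is assembled from the abstract alone; the function and class names (to_color_spaces, channel_shuffle, SingleHeadPixelAttention) and all hyperparameters are illustrative assumptions, not the authors' released implementation.

# Hedged sketch of the mechanisms described in the abstract; names and
# hyperparameters are assumptions, not the authors' actual SPSANet code.
import cv2
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

# 1. Stack the eight color spaces named in the abstract (3 channels each).
def to_color_spaces(rgb_u8: np.ndarray) -> np.ndarray:
    """rgb_u8: HxWx3 uint8 RGB image -> HxWx24 float32 stack."""
    codes = [cv2.COLOR_RGB2Luv, cv2.COLOR_RGB2XYZ, cv2.COLOR_RGB2Lab,
             cv2.COLOR_RGB2HSV, cv2.COLOR_RGB2YCrCb, cv2.COLOR_RGB2YUV,
             cv2.COLOR_RGB2HLS]
    spaces = [rgb_u8] + [cv2.cvtColor(rgb_u8, c) for c in codes]
    return np.concatenate(spaces, axis=-1).astype(np.float32) / 255.0

# 2. Channel shuffle (as in ShuffleNet) for limited interaction between
#    the color-space group paths.
def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    b, c, h, w = x.shape
    return x.view(b, groups, c // groups, h, w).transpose(1, 2).reshape(b, c, h, w)

# 3. Single-head attention whose tokens are pixels; queries, keys, and
#    values are computed over the concatenated color-space features (an
#    assumption about the SPSA module, reconstructed from the abstract).
class SingleHeadPixelAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.qkv = nn.Conv2d(channels, channels * 3, kernel_size=1)
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=1)           # each: b x c x h x w
        q = q.flatten(2).transpose(1, 2)                # b x hw x c
        k = k.flatten(2).transpose(1, 2)
        v = v.flatten(2).transpose(1, 2)
        attn = F.softmax(q @ k.transpose(1, 2) / c ** 0.5, dim=-1)  # b x hw x hw
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return x + self.proj(out)                       # residual connection

if __name__ == "__main__":
    img = np.random.randint(0, 256, (32, 32, 3), dtype=np.uint8)
    stacked = torch.from_numpy(to_color_spaces(img)).permute(2, 0, 1)[None]
    feats = channel_shuffle(stacked, groups=8)          # 8 color-space paths
    print(SingleHeadPixelAttention(24)(feats).shape)    # torch.Size([1, 24, 32, 32])

Note that with eight three-channel color spaces the stacked input has 24 channels and every pixel is a token, so the attention map is HW x HW; a practical model would presumably apply such attention to downsampled feature maps rather than full-resolution inputs, which this toy example does not address.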
Pages: 15