Combining transformers with CNN for multi-focus image fusion

Cited by: 41
Authors
Duan, Zhao [1]
Luo, Xiaoliu [1]
Zhang, Taiping [1]
Affiliations
[1] Chongqing Univ, Sch Comp Sci, Chongqing 400038, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multi-focus image fusion; Transformers; Knowledge distillation; CNNs; PERFORMANCE; FRAMEWORK;
DOI
10.1016/j.eswa.2023.121156
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
Recently, deep convolutional neural network (CNN) based methods for multi-focus image fusion have achieved adequate performance. However, most of them cannot obtain spatially continuous results, especially in smooth regions and at the edges between focused and defocused regions. In this paper, we propose a novel end-to-end method that combines the merits of Transformers and CNNs as a strong alternative for the multi-focus image fusion task. A Transformer has an advantage over a CNN in that it can extract global features. It is therefore able to make the fusion results spatially consistent. The proposed architecture consists of CNN and Transformer branches, where the Transformer branches take feature map patches as inputs and leverage the Transformer to propagate global context among patches. Moreover, to improve feature representation, we introduce an online knowledge distillation learning (KDL) strategy, which achieves better interaction between global and local features. Specifically, we design a hard target and a soft target by simply yet effectively ensembling the outputs of the two branches; these targets are used to supervise the CNN and Transformer branches. Experiments demonstrate the superiority of the proposed architecture, which achieves competitive results with state-of-the-art methods.
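The idea of supervising both branches with targets ensembled from their own outputs can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the ensembling rule, the loss, or the output format, so the per-pixel focus logits, the simple averaging, the temperature, the binary cross-entropy loss, and the weight alpha below are all assumptions, and every function name is hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ensemble_targets(cnn_logits, trans_logits, temperature=2.0):
    """Build soft and hard targets from two branches' per-pixel focus logits.

    Assumption: the ensemble is a plain average of the two logit maps.
    """
    mean_logits = 0.5 * (cnn_logits + trans_logits)
    soft = sigmoid(mean_logits / temperature)       # softened probabilities
    hard = (mean_logits > 0.0).astype(np.float64)   # binarised decision map
    return soft, hard

def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between probabilities and targets."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(pred)
                          + (1.0 - target) * np.log(1.0 - pred)))

def distillation_loss(branch_logits, soft, hard, temperature=2.0, alpha=0.5):
    """Supervise one branch with both targets (alpha is a hypothetical weight).

    In a real training loop the targets would be treated as constants,
    i.e. no gradient would flow back through them.
    """
    hard_term = bce(sigmoid(branch_logits), hard)
    soft_term = bce(sigmoid(branch_logits / temperature), soft)
    return alpha * hard_term + (1.0 - alpha) * soft_term

def fuse(img_a, img_b, decision_map):
    """Pick pixels from img_a where the map is 1 and from img_b where it is 0."""
    return decision_map * img_a + (1.0 - decision_map) * img_b

# Toy example: random logits stand in for the outputs of the two branches.
rng = np.random.default_rng(0)
cnn_logits = rng.normal(size=(4, 4))
trans_logits = rng.normal(size=(4, 4))
soft, hard = ensemble_targets(cnn_logits, trans_logits)
loss_cnn = distillation_loss(cnn_logits, soft, hard)
loss_trans = distillation_loss(trans_logits, soft, hard)
```

Because both branches are trained against the same ensembled targets, each is pulled toward what the other has learned, which is one plausible reading of the "better interaction between global and local features" described above.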
Pages: 13