SwinMFF: toward high-fidelity end-to-end multi-focus image fusion via swin transformer-based network

Cited by: 3

Authors:

Xie, Xinzhe [1,2]

Guo, Buyu [2,3]

Li, Peiliang [1,2]

He, Shuangyan [1,2]

Zhou, Sangjun [1,2]

Affiliations:

[1] Zhejiang Univ, Ocean Coll, Zhoushan 316021, Zhejiang, Peoples R China

[2] Zhejiang Univ, Hainan Inst, Sanya 572025, Hainan, Peoples R China

[3] Donghai Lab, Zhoushan 316021, Zhejiang, Peoples R China

Source:

VISUAL COMPUTER | 2025 / Vol. 41 / Issue 06

Keywords:

Deep learning; Multi-focus; Image fusion; End-to-end; Transformer; ALGORITHM; FRAMEWORK; ENSEMBLE;

DOI:

10.1007/s00371-024-03637-3

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

End-to-end approaches, which directly learn the mapping from multi-focus source images to the fused image, have been widely used recently and perform well on complex scenes. However, their fusion quality falls short of decision map-based methods: those methods preserve the original pixels of the focused regions in the fused image, whereas end-to-end methods output network inference results that carry pixel-wise regression errors, which lowers the fidelity of the fused images. To mitigate this limitation, we propose SwinMFF, which uses the swin transformer to capture long-range dependencies across the source images and thereby reduce pixel-wise regression errors, achieving high-fidelity end-to-end fusion while also alleviating edge artifacts in the fused image. Extensive experiments demonstrate that SwinMFF outperforms 28 other state-of-the-art methods in both subjective visual quality and quantitative metrics. The code is available at https://github.com/Xinzhe99/SwinMFF.
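
The fidelity gap described in the abstract can be illustrated with a toy NumPy sketch. It is not the SwinMFF method or code from the authors' repository. It assumes an ideal binary decision map, uses a flat mean value as a crude stand-in for defocus blur, and models an end-to-end network as the true scene plus small Gaussian regression noise. All names (`fuse_with_decision_map`, `sharp`, `left_focused`, `right_focused`) are hypothetical.

```python
import numpy as np

def fuse_with_decision_map(img_a, img_b, decision_map):
    """Copy each pixel from img_a where decision_map is True, else from img_b."""
    return np.where(decision_map, img_a, img_b)

rng = np.random.default_rng(0)

# Hypothetical all-in-focus scene (ground truth), 4x4 grayscale.
sharp = rng.integers(0, 256, size=(4, 4)).astype(np.float64)

# Hypothetical source pair: each image is in focus on one half only.
left_focused = sharp.copy()
left_focused[:, 2:] = sharp[:, 2:].mean()    # crude stand-in for defocus blur
right_focused = sharp.copy()
right_focused[:, :2] = sharp[:, :2].mean()

# Assumed ideal decision map: left half taken from left_focused.
decision_map = np.zeros((4, 4), dtype=bool)
decision_map[:, :2] = True

# Decision map-based fusion copies source pixels unchanged.
fused_map = fuse_with_decision_map(left_focused, right_focused, decision_map)

# Assumed stand-in for an end-to-end network: truth plus regression noise.
fused_regressed = sharp + rng.normal(0.0, 1.0, size=sharp.shape)

err_map = np.abs(fused_map - sharp).max()        # no error under the ideal map
err_reg = np.abs(fused_regressed - sharp).max()  # nonzero pixel-wise error
print(err_map, err_reg)
```

With a perfect map, the selected pixels are exact copies of the focused source pixels, while the regressed output deviates at every pixel. Real decision maps are imperfect, so this shows only the idealized case behind the abstract's argument.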

Pages: 3883-3906

Page count: 24

