MSFNet: MultiStage Fusion Network for infrared and visible image fusion

Cited by: 17
Authors
Wang, Chenwu [1]
Wu, Junsheng [2]
Zhu, Zhixiang [3]
Chen, Hao [4]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, 127 West Youyi Rd, Xian 710072, Shaanxi, Peoples R China
[2] Northwestern Polytech Univ, Sch Software, 127 West Youyi Rd, Xian 710072, Shaanxi, Peoples R China
[3] Xian Univ Posts & Telecommun, Sch Modern Posts, 563 South Changan Rd, Xian 710061, Shaanxi, Peoples R China
[4] Xian Univ Posts & Telecommun, Sch Comp Sci, 563 South Changan Rd, Xian 710061, Shaanxi, Peoples R China
Keywords
Deep learning; Image fusion; Visible image; Infrared image; Multistage network; ENHANCEMENT; PERFORMANCE; FRAMEWORK;
DOI
10.1016/j.neucom.2022.07.048
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a multistage fusion network to solve the infrared and visible image fusion (IVIF) problem. Unlike other deep learning methods, our network architecture is designed to handle the complex balance between spatial details and contextualized information. To balance these competing goals, our main proposal is a multistage architecture that progressively learns IVIF functions for the source images (infrared and visible images), thereby breaking down the overall recovery process into more manageable steps. Specifically, our model first learns contextual features using an encoder-decoder architecture with downsampling operations and later combines them with a full-resolution branch that retains local details. Between stages, we introduce a cross-stage fusion module (CSFM) to propagate multiscale contextual features from an earlier stage to a later stage. In addition, we introduce an upsampling module that suppresses both checkerboard artifacts and blurring by applying a bilinear interpolation operation followed by a deformable convolution. The resulting tightly interlinked multistage fusion network, named MSFNet, outperforms state-of-the-art methods on publicly available datasets. © 2022 Published by Elsevier B.V.
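The upsampling step named in the abstract (bilinear interpolation followed by a deformable convolution) can be illustrated with a minimal single-channel sketch in plain Python. This is not the authors' implementation: the function names, the half-pixel sampling convention, the 3x3 kernel size, and the hand-supplied offsets are all assumptions made for illustration. In a trained network the offsets would normally be predicted by a learned layer.

```python
import math

def bilinear_sample(img, y, x):
    """Sample a 2-D list at fractional (y, x); pixels outside the image count as 0."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    dy, dx = y - y0, x - x0

    def px(r, c):
        return img[r][c] if 0 <= r < h and 0 <= c < w else 0.0

    return ((1 - dy) * (1 - dx) * px(y0, x0) + (1 - dy) * dx * px(y0, x0 + 1)
            + dy * (1 - dx) * px(y0 + 1, x0) + dy * dx * px(y0 + 1, x0 + 1))

def bilinear_upsample(img, scale=2):
    """Enlarge img by an integer factor using half-pixel-centred bilinear sampling."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(h * scale):
        # Map the output row to a source coordinate and clamp it to the image.
        sy = min(max((i + 0.5) / scale - 0.5, 0.0), h - 1)
        row = []
        for j in range(w * scale):
            sx = min(max((j + 0.5) / scale - 0.5, 0.0), w - 1)
            row.append(bilinear_sample(img, sy, sx))
        out.append(row)
    return out

def deformable_conv3x3(img, weights, offsets):
    """3x3 convolution whose nine taps are shifted by per-pixel (dy, dx) offsets.

    weights: 3x3 list of floats.
    offsets[i][j]: list of nine (dy, dx) pairs for output pixel (i, j)
    (hypothetical layout chosen for this sketch).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for k in range(9):
                ky, kx = k // 3 - 1, k % 3 - 1   # regular grid position of tap k
                dy, dx = offsets[i][j][k]        # offset added to that position
                acc += weights[ky + 1][kx + 1] * bilinear_sample(
                    img, i + ky + dy, j + kx + dx)
            out[i][j] = acc
    return out

def zero_offsets(h, w):
    """Offsets that reduce the deformable convolution to an ordinary one."""
    return [[[(0.0, 0.0)] * 9 for _ in range(w)] for _ in range(h)]

# Usage: upsample a 2x2 map, then refine it with a deformable 3x3 convolution.
feature = [[0.0, 2.0], [4.0, 6.0]]
up = bilinear_upsample(feature, scale=2)
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
refined = deformable_conv3x3(up, identity, zero_offsets(4, 4))
```

With zero offsets and an identity kernel the second step leaves the upsampled map unchanged; non-zero offsets let each tap read from a shifted, bilinearly interpolated location instead of the fixed grid.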
Pages: 26-39
Page count: 14