A light-weight, efficient, and general cross-modal image fusion network

Times cited: 22
Authors
Fang, Aiqing [1 ]
Zhao, Xinbo [1 ]
Yang, Jiaqi [1 ]
Qin, Beibei [1 ]
Zhang, Yanning [1 ]
Affiliations
[1] Northwestern Polytech Univ, Natl Engn Lab Integrated Aerosp Ground Ocean Big, Sch Comp Sci, Xian 710072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image fusion; Deep learning; Collaborative optimization; Image quality; Optimization; INFORMATION; EXTRACTION; FRAMEWORK;
DOI
10.1016/j.neucom.2021.08.044
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing cross-modal image fusion methods pay limited attention to fusion efficiency and network architecture, yet both efficiency and accuracy strongly affect practical applications. To address this, we propose a light-weight, efficient, and general cross-modal image fusion network, termed AE-Netv2. First, we analyze the influence of different network architectures (e.g., group convolution, depth-wise convolution, Inception, SqueezeNet, ShuffleNet, and multi-scale modules) on image fusion quality and efficiency, providing a reference for the design of image fusion architectures. Second, we explore the commonalities and characteristics of different image fusion tasks, providing a basis for further research on the continuous learning characteristics of the human brain. Finally, a positive sample loss is added to the similarity loss to reduce the difference in data distribution across cross-modal image fusion tasks. Comprehensive experiments demonstrate the superiority of our method over state-of-the-art methods on different fusion tasks at a real-time speed of over 100 FPS on a GTX 2070. Compared with the fastest deep-learning-based image fusion method, AE-Netv2 is 2.14 times faster; compared with the image fusion model with the smallest model size, our model is 11.59 times smaller. (c) 2021 Elsevier B.V. All rights reserved.
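As a rough illustration of the ideas summarized in the abstract, the sketch below (hypothetical PyTorch code, not the published AE-Netv2 implementation) shows a depth-wise separable convolution block of the kind listed among the light-weight design choices, and a loss that adds a positive-sample term to a similarity term. The MSE similarity measure, the `positive` tensor, and the weight `alpha` are assumptions made only for illustration.

```python
# Hypothetical sketch; not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthwiseSeparableConv(nn.Module):
    """Depth-wise 3x3 convolution followed by a point-wise 1x1 convolution,
    a common light-weight building block (cf. the modules listed in the abstract)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3,
                                   padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)

    def forward(self, x):
        return F.relu(self.bn(self.pointwise(self.depthwise(x))))

def fusion_loss(fused, source_a, source_b, positive, alpha=0.5):
    """Similarity loss against both source modalities plus a positive-sample
    term; MSE and the weighting `alpha` are placeholder assumptions."""
    similarity = F.mse_loss(fused, source_a) + F.mse_loss(fused, source_b)
    positive_term = F.mse_loss(fused, positive)
    return similarity + alpha * positive_term
```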
Pages: 198-211
Number of pages: 14