Searching a Hierarchically Aggregated Fusion Architecture for Fast Multi-Modality Image Fusion

Cited by: 44
Authors
Liu, Risheng [1 ]
Liu, Zhu [1 ]
Liu, Jinyuan [1 ]
Fan, Xin [1 ]
Affiliations
[1] Dalian Univ Technol, Dalian, Peoples R China
Source
PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021 | 2021
Funding
National Key R&D Program of China; National Natural Science Foundation of China;
Keywords
Hierarchically aggregated fusion architecture; fusion-oriented search space; collaborative architecture search; multi-modality fusion;
DOI
10.1145/3474085.3475299
CLC number
TP18 [Theory of Artificial Intelligence];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Multi-modality image fusion refers to generating a complementary image that integrates typical characteristics from source images. In recent years, we have witnessed remarkable progress in deep learning models for multi-modality fusion. Existing CNN-based approaches go to great lengths to design various architectures that realize these tasks in an end-to-end manner. However, these handcrafted designs cannot cope with highly demanding fusion tasks, resulting in blurred targets and lost textural details. To alleviate these issues, this paper proposes a novel approach that searches for effective architectures according to various modality principles and fusion mechanisms. Specifically, we construct a hierarchically aggregated fusion architecture to extract and refine fused features from feature-level and object-level fusion perspectives, which is responsible for obtaining complementary target/detail representations. Then, by investigating diverse effective practices, we compose a more flexible fusion-specific search space. Motivated by the collaborative principle, we employ a new search strategy with different principled losses and hardware constraints to sufficiently discover components. As a result, we obtain a task-specific architecture with fast inference time. Extensive quantitative and qualitative results demonstrate the superiority and versatility of our method against state-of-the-art methods.
Pages: 1600-1608 (9 pages)
References (51 records)
[1] Alletto, S., 2020, AAAI.
[2] [Anonymous], 2018, IEEE Transactions on Image Processing.
[3] Aslantas, V.; Bendes, E. A new image quality metric for image fusion: The sum of the correlations of differences. AEU-International Journal of Electronics and Communications, 2015, 69(12): 160-166.
[4] Bhatnagar, G.; Wu, Q. M. J.; Liu, Z. Directive Contrast Based Multimodal Medical Image Fusion in NSCT Domain. IEEE Transactions on Multimedia, 2013, 15(5): 1014-1024.
[5] Bochkovskiy, A., 2020. YOLOv4: Optimal Speed and Accuracy of Object Detection. DOI 10.48550/ARXIV.2004.10934.
[6] Cao, L.; Jin, L.; Tao, H.; Li, G.; Zhuang, Z.; Zhang, Y. Multi-Focus Image Fusion Based on Spatial Frequency in Discrete Cosine Transform Domain. IEEE Signal Processing Letters, 2015, 22(2): 220-224.
[7] Chen, Y.-C., 2020, arXiv:2008.11713.
[8] Cordts, M.; Omran, M.; Ramos, S.; Rehfeld, T.; Enzweiler, M.; Benenson, R.; Franke, U.; Roth, S.; Schiele, B. The Cityscapes Dataset for Semantic Urban Scene Understanding. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 3213-3223.
[9] Du, Q.; Xu, H.; Ma, Y.; Huang, J.; Fan, F. Fusing Infrared and Visible Images of Different Resolutions via Total Variation Model. Sensors, 2018, 18(11).
[10] Geiger, A.; Lenz, P.; Stiller, C.; Urtasun, R. Vision meets robotics: The KITTI dataset. International Journal of Robotics Research, 2013, 32(11): 1231-1237.