Co-Enhancement of Multi-Modality Image Fusion and Object Detection via Feature Adaptation

Times Cited: 0
Authors
Dong, Aimei [1 ,2 ,3 ]
Wang, Long [2 ]
Liu, Jian [2 ]
Xu, Jingyuan [2 ]
Zhao, Guixin [1 ,2 ,3 ]
Zhai, Yi [1 ,2 ,3 ]
Lv, Guohua [1 ,2 ,3 ]
Cheng, Jinyong [1 ,2 ,3 ]
Affiliations
[1] Qilu Univ Technol, Shandong Acad Sci, Shandong Comp Sci Ctr, Natl Supercomp Ctr Jinan, Minist Educ, Key Lab Comp, Jinan 250316, Peoples R China
[2] Qilu Univ Technol, Shandong Acad Sci, Fac Comp Sci & Technol, Jinan 250316, Peoples R China
[3] Shandong Fundamental Res Ctr Comp Sci, Shandong Prov Key Lab Comp Networks, Jinan 250100, Peoples R China
Keywords
Image fusion; Task analysis; Semantics; Feature extraction; Object detection; Visualization; Visual perception; object detection; feature adaptation; mutual promotion; MULTISCALE TRANSFORM; NETWORK; NEST;
DOI
10.1109/TCSVT.2024.3433555
CLC Number
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology]
Subject Classification Codes
0808; 0809
Abstract
The integration of multi-modality images significantly enhances the clarity of critical details for object detection, and valuable semantic information from object detection in turn enriches the image fusion process. However, this reciprocal relationship, which could improve the performance of both tasks, remains largely unexplored and underutilized, even though some semantic-driven fusion methods cater to specific application needs. To address these limitations, this study proposes a mutually reinforcing, dual-task-driven fusion architecture. Specifically, we integrate a feature-adaptive interlinking module into both the image fusion and object detection components to manage the inherent feature discrepancies between them; the core idea is to transform the distinct features of the two tasks and project them into a unified feature space. We then design a feature-adaptive selection module that generates features rich in target semantic information and compatible with the fusion network. Finally, the two tasks are effectively combined and mutually enhanced through an alternating training process. Extensive evaluations on multiple datasets corroborate the efficiency of our framework, showing clear improvements in both fusion quality and detection accuracy.
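The record contains only the abstract, but the two modules it names can be made concrete with a small illustrative example. The PyTorch-style code below is a minimal sketch under assumed design choices: the class names FeatureAdaptiveInterlinking and FeatureAdaptiveSelection, the 1x1 projection convolutions, the sigmoid gate, and all channel sizes are hypothetical and are not taken from the authors' implementation.

```python
# Illustrative sketch only: module names, channel sizes, and the gating
# mechanism are assumptions made for this example, not the published code.
import torch
import torch.nn as nn


class FeatureAdaptiveInterlinking(nn.Module):
    """Projects fusion features and detection features into a shared space
    (hypothetical realization of the 'feature-adaptive interlinking module')."""

    def __init__(self, fusion_ch: int, det_ch: int, shared_ch: int):
        super().__init__()
        self.proj_fusion = nn.Conv2d(fusion_ch, shared_ch, kernel_size=1)
        self.proj_det = nn.Conv2d(det_ch, shared_ch, kernel_size=1)

    def forward(self, f_fusion, f_det):
        # Resize detector features to the fusion resolution before projection.
        f_det = nn.functional.interpolate(
            f_det, size=f_fusion.shape[-2:], mode="bilinear", align_corners=False
        )
        return self.proj_fusion(f_fusion), self.proj_det(f_det)


class FeatureAdaptiveSelection(nn.Module):
    """Gates the shared detection features so that only target-relevant
    semantics are passed back to the fusion branch (assumed design)."""

    def __init__(self, shared_ch: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * shared_ch, shared_ch, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, z_fusion, z_det):
        g = self.gate(torch.cat([z_fusion, z_det], dim=1))
        return z_fusion + g * z_det  # semantically enriched fusion features


if __name__ == "__main__":
    f_fusion = torch.randn(1, 64, 128, 128)   # features from a fusion encoder
    f_det = torch.randn(1, 256, 32, 32)        # features from a detector backbone
    interlink = FeatureAdaptiveInterlinking(64, 256, 64)
    select = FeatureAdaptiveSelection(64)
    z_f, z_d = interlink(f_fusion, f_det)
    print(select(z_f, z_d).shape)              # torch.Size([1, 64, 128, 128])
```

The abstract further couples the two tasks through an alternating training schedule; that schedule, as well as the fusion network and detector themselves, is not specified in this record, so the sketch stops at the feature bridging.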
Pages: 12624-12637
Number of Pages: 14