Learning Semantic Alignment Using Global Features and Multi-Scale Confidence

Citations: 0
Authors
Xu, Huaiyuan [1]
Liao, Jing [2]
Liu, Huaping [3]
Sun, Yuxiang [1]
Affiliations
[1] Hong Kong Polytech Univ, Dept Mech Engn, Kowloon, Hong Kong, Peoples R China
[2] City Univ Hong Kong, Dept Comp Sci, Kowloon, Hong Kong, Peoples R China
[3] Tsinghua Univ, Inst Artificial Intelligence, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Semantics; Correlation; Feature extraction; Transformers; Training; Task analysis; Probabilistic logic; Semantic alignment; enhancement transformer; probabilistic correlation computation; cross-domain alignment;
DOI
10.1109/TCSVT.2023.3288370
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract
Semantic alignment aims to establish pixel correspondences between images based on semantic consistency. It can serve as a fundamental component for various downstream computer vision tasks, such as style transfer and exemplar-based colorization. Many existing methods use local features and their cosine similarities to infer semantic alignment. However, they struggle with significant intra-class variation of objects, such as differences in appearance and size. In other words, contents with the same semantics can look very different. To address this issue, we propose a novel deep neural network whose core lies in global feature enhancement and adaptive multi-scale inference. Specifically, two modules are proposed: an enhancement transformer that enhances semantic features with global awareness, and a probabilistic correlation module that adaptively fuses multi-scale information based on learned confidence scores. We use a unified network architecture to achieve two types of semantic alignment, namely cross-object semantic alignment and cross-domain semantic alignment. Experimental results demonstrate that our method achieves competitive performance on five standard cross-object semantic alignment benchmarks and outperforms the state of the art in cross-domain semantic alignment.
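As a rough illustration of two ideas named in the abstract (cosine-similarity correlation between local features, and fusing multi-scale information by confidence), the following is a minimal sketch in plain Python. It is not the authors' network: the function names, the toy feature vectors, and the hand-set confidence values are all hypothetical, and the sketch assumes every scale has already been resampled to the same number of positions. In the paper the confidence scores are learned; here they are simply passed in and softmax-normalized.

```python
import math


def cosine_correlation(src, tgt):
    """Cosine similarity between every source and every target feature vector."""
    def unit(v):
        # Fall back to 1.0 so an all-zero vector does not divide by zero.
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    src_u = [unit(v) for v in src]
    tgt_u = [unit(v) for v in tgt]
    return [[sum(a * b for a, b in zip(s, t)) for t in tgt_u] for s in src_u]


def fuse_scales(correlations, confidences):
    """Blend per-scale correlation matrices with softmax-normalized confidences."""
    peak = max(confidences)  # subtract the max for numerical stability
    exps = [math.exp(c - peak) for c in confidences]
    total = sum(exps)
    weights = [e / total for e in exps]
    rows, cols = len(correlations[0]), len(correlations[0][0])
    return [
        [sum(w * corr[i][j] for w, corr in zip(weights, correlations))
         for j in range(cols)]
        for i in range(rows)
    ]


def best_matches(correlation):
    """Index of the highest-scoring target position for each source position."""
    return [max(range(len(row)), key=row.__getitem__) for row in correlation]


# Toy features (hypothetical): two positions per image, two channels, two scales.
coarse = cosine_correlation([[1.0, 0.0], [0.0, 1.0]], [[0.0, 2.0], [3.0, 0.0]])
fine = cosine_correlation([[1.0, 1.0], [0.0, 1.0]], [[1.0, 0.0], [1.0, 1.0]])

# Hand-set confidences (an assumption): trust the coarse scale more.
fused = fuse_scales([coarse, fine], confidences=[2.0, 0.0])
matches = best_matches(fused)
```

In this toy case the fine scale alone is ambiguous for the second source position, and the higher-confidence coarse scale resolves it, which is the intuition behind weighting scales by confidence rather than averaging them uniformly.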
Pages: 897-910
Number of pages: 14