UnionFormer: Unified-Learning Transformer with Multi-View Representation for Image Manipulation Detection and Localization

Cited by: 3
Authors
Li, Shuaibo [1 ,2 ]
Ma, Wei [1 ]
Guo, Jianwei [2 ]
Xu, Shibiao [3 ]
Li, Benchong [1 ]
Zhan, Xiaopeng [2 ]
Affiliations
[1] Beijing Univ Technol, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Automat, MAIS, Beijing, Peoples R China
[3] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
Source
2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2024
Funding
National Natural Science Foundation of China;
Keywords
NETWORKS;
DOI
10.1109/CVPR52733.2024.01190
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present UnionFormer, a novel framework that integrates tampering clues across three views by unified learning for image manipulation detection and localization. Specifically, we construct a BSFI-Net to extract tampering features from RGB and noise views, achieving enhanced responsiveness to boundary artifacts while modulating spatial consistency at different scales. Additionally, to explore the inconsistency between objects as a new view of clues, we combine object consistency modeling with tampering detection and localization into a three-task unified learning process, allowing them to promote and improve mutually. Therefore, we acquire a unified manipulation discriminative representation under multi-scale supervision that consolidates information from three views. This integration facilitates highly effective concurrent detection and localization of tampering. We perform extensive experiments on diverse datasets, and the results show that the proposed approach outperforms state-of-the-art methods in tampering detection and localization.
Pages: 12523 - 12533
Number of pages: 11
Related papers
7 items in total
  • [1] A Survey of Multi-View Representation Learning
    Li, Yingming
    Yang, Ming
    Zhang, Zhongfei
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2019, 31 (10) : 1863 - 1883
  • [2] Unsupervised representation learning based on the deep multi-view ensemble learning
    Koohzadi, Maryam
    Charkari, Nasrollah Moghadam
    Ghaderi, Foad
    APPLIED INTELLIGENCE, 2020, 50 (02) : 562 - 581
  • [3] Multi-view feature learning for VHR remote sensing image classification
    Guo, Yiyou
    Ji, Jinsheng
    Shi, Dan
    Ye, Qiankun
    Xie, Huan
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (15) : 23009 - 23021
  • [4] Multi-view learning for camouflaged object detection with PVTv2
    Yan, Pu
    Ruan, Kang
    Wang, Lili
    Zhao, Yang
    Wang, Xu
    INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, 2025, 14 (02)
  • [5] A Novel Rock Mass Discontinuity Detection Approach with CNNs and Multi-View Image Augmentation
    Yalcin, Ilyas
    Can, Recep
    Gokceoglu, Candan
    Kocaman, Sultan
    ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION, 2024, 13 (06)
  • [6] Multi-view representation learning with Kolmogorov-Smirnov to predict default based on imbalanced and complex dataset
    Tan, Yandan
    Zhao, Guangcai
    INFORMATION SCIENCES, 2022, 596 : 380 - 394
  • [7] Multi-View Graph Contrastive Learning via Adaptive Channel Optimization for Depression Detection in EEG Signals
    Zhang, Shuangyong
    Wang, Hong
    Zheng, Zixi
    Liu, Tianyu
    Li, Weixin
    Zhang, Zishan
    Sun, Yanshen
    INTERNATIONAL JOURNAL OF NEURAL SYSTEMS, 2023, 33 (11)