Dual-stream enhancement encoder and attention optimization decoder for image manipulation localization

Cited by: 0

Authors:

Zhu, Ye [1]

Zhao, Xiaoxiang [1]

Yu, Yang [1]

Affiliations:

[1] Hebei Univ Technol, Sch Artificial Intelligence, Tianjin 300401, Peoples R China

Source:

CHINESE JOURNAL OF LIQUID CRYSTALS AND DISPLAYS | 2024, Vol. 39, No. 08

Funding:

National Natural Science Foundation of China;

Keywords:

image manipulation localization; dual-stream enhancement encoder; attention optimization decoder; adjacent feature aggregation module;

DOI:

10.37188/CJLCD.2023-0280

Chinese Library Classification (CLC) number:

O7 [Crystallography];

Discipline classification codes:

0702 ; 070205 ; 0703 ; 080501 ;

Abstract:

Mainstream image manipulation localization methods usually fuse inconsistent features from different streams through simple operations, which leads to feature redundancy and to misdetected pixels in tampered regions. We therefore propose a novel network for image manipulation localization that combines a dual-stream enhancement encoder with an attention optimization decoder. First, the dual-stream enhancement encoder module self-reinforces the extracted dual-stream multi-scale features and lets them interact, so that the different kinds of tampering information complement one another and more attention is paid to tampering features. Next, a multi-scale receptive field strategy is introduced to explore multi-scale context information, and an adjacent-level feature aggregation module is designed to fuse adjacent multi-scale features. Finally, localization capability is enhanced by having the tampered region and the genuine region cooperate: the attention optimization decoder module is designed to eliminate wrongly predicted edge pixels in the initial tampered-region prediction, and the localization is refined step by step. Extensive experiments are conducted on four mainstream public datasets (NIST16, Coverage, Columbia and CASIA) and two realistic challenge datasets (IMD20 and Wild) to compare against mainstream manipulation localization methods. The proposed method achieves superior performance on all six datasets both without fine-tuning and with fine-tuning, which indicates that it can make full use of various forgery clues to achieve higher localization accuracy and stronger robustness.
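
Illustrative note: the abstract does not say how the adjacent-level feature aggregation module combines neighbouring scales. The sketch below is therefore not the authors' module. It only shows the general idea of fusing two adjacent scales, assuming the coarser map has exactly half the resolution of the finer one and that fusion is a plain element-wise average. The names `upsample_nearest` and `aggregate_adjacent` are hypothetical.

```python
from typing import List

Grid = List[List[float]]  # a single-channel 2-D feature map


def upsample_nearest(feat: Grid, factor: int = 2) -> Grid:
    """Nearest-neighbour upsampling: repeat every value `factor` times per axis."""
    out: Grid = []
    for row in feat:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def aggregate_adjacent(fine: Grid, coarse: Grid) -> Grid:
    """Fuse a fine map with the next-coarser map (assumed half resolution).

    Assumption: fusion is an element-wise average. The paper's module may
    instead use learned convolutions or attention.
    """
    up = upsample_nearest(coarse, 2)
    if len(up) != len(fine) or len(up[0]) != len(fine[0]):
        raise ValueError("coarse map must be exactly half the size of fine map")
    return [[(f + c) / 2.0 for f, c in zip(f_row, c_row)]
            for f_row, c_row in zip(fine, up)]


fine_map = [[0.0, 1.0], [2.0, 3.0]]
coarse_map = [[4.0]]
fused = aggregate_adjacent(fine_map, coarse_map)
print(fused)
```

In a real network the averaging step would normally be replaced by learned layers; the sketch only makes the resolution alignment between adjacent levels concrete.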


Pages: 1103-1115

Page count: 13
