DAMS: Document Image Steganography with Dual Attention Multi-scale Encoder-Decoder Architecture

Cited by: 0

Authors:

Li, Kaijiang [1]
Qin, Yi [1]
Wang, Peisen [1]
Guo, Chunyi [1]
Wang, Junqi [2]
Jia, Ruiyang [1]
Jiang, Wenfeng [3]

Affiliations:

[1] Zhengzhou University, Zhengzhou, People's Republic of China

[2] Zhengzhou University of Aeronautics, Zhengzhou, People's Republic of China

[3] China Mobile Group Henan Co., Ltd., Zhengzhou, People's Republic of China

Source:

Pattern Recognition and Computer Vision, PRCV 2024, Pt II | 2025 / Vol. 15032

Keywords:

Steganography; Document image; Channel attention; Transformer; Multi-scale feature fusion

DOI:

10.1007/978-981-97-8490-5_9

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In steganography research, advances in deep learning have significantly improved the ability to embed secret messages into scene images. However, for document images, which differ significantly in color and background distribution, ensuring that hidden information stays invisible without interfering with text reading remains a major challenge. To address this challenge, we propose an end-to-end framework designed specifically for document images, namely the Dual Attention Multi-scale Encoder-Decoder Architecture (DAMS). The DAMS framework takes full account of the pixel distributions and value deviations that arise during the formation of document images. To balance the information embedding and extraction processes, the encoder and decoder adopt the same Channel Attention Network (CAN) module. In addition, we introduce a Self-Attention Fusion network (SAF), which performs multi-scale feature extraction and fusion for text regions. The self-attention mechanism significantly enhances the perception of text region features, thereby improving the effectiveness of secret information embedding. Extensive experiments demonstrate that DAMS achieves state-of-the-art results, with an average accuracy of 99.99% and a PSNR of 40.52 dB under noise-free conditions, and an average accuracy of 99.32% and a PSNR of 38.24 dB under combined noise interference. The code will be released.
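The abstract reports two figures of merit, PSNR and an accuracy rate, without stating how they are computed. The following is a minimal sketch of the conventional definitions, not the authors' evaluation code. It assumes 8-bit pixel values (peak value 255), images given as flat sequences of pixel values, and that "accuracy" means the fraction of secret bits recovered correctly. The function names `psnr` and `bit_accuracy` are hypothetical.

```python
import math


def psnr(cover, stego, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Assumption: pixels are 8-bit, so the peak value defaults to 255.
    """
    if len(cover) != len(stego):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error between the cover image and the stego image.
    mse = sum((c - s) ** 2 for c, s in zip(cover, stego)) / len(cover)
    if mse == 0:
        return math.inf  # identical images: no distortion at all
    return 10.0 * math.log10(max_value ** 2 / mse)


def bit_accuracy(sent_bits, recovered_bits):
    """Fraction of secret bits that the decoder recovered correctly.

    Assumption: this is what the abstract calls the accuracy rate.
    """
    if len(sent_bits) != len(recovered_bits):
        raise ValueError("bit strings must have the same length")
    matches = sum(a == b for a, b in zip(sent_bits, recovered_bits))
    return matches / len(sent_bits)
```

Under these definitions, a higher PSNR means the stego image is closer to the cover image, and a bit accuracy of 1.0 means the secret message was recovered without error.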


Pages: 118-131

Page count: 14
