DAMS: Document Image Steganography with Dual Attention Multi-scale Encoder-Decoder Architecture

Cited by: 0
Authors
Li, Kaijiang [1]
Qin, Yi [1]
Wang, Peisen [1]
Guo, Chunyi [1]
Wang, Junqi [2]
Jia, Ruiyang [1]
Jiang, Wenfeng [3]
Affiliations
[1] Zhengzhou Univ, Zhengzhou, Peoples R China
[2] Zhengzhou Univ Aeronaut, Zhengzhou, Peoples R China
[3] China Mobile Grp Henan Co Ltd, Zhengzhou, Peoples R China
Source
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT II | 2025 / Vol. 15032
Keywords
Steganography; Document image; Channel attention; Transformer; Multi-scale feature fusion
DOI
10.1007/978-981-97-8490-5_9
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In steganography research, advances in deep learning have significantly improved the ability to embed secret messages into scene images. However, for document images, whose color and background distributions differ significantly, it remains a major challenge to keep the hidden information invisible without interfering with the text-reading experience. To address this challenge, we propose an end-to-end framework designed specifically for document images, namely the Dual Attention Multi-scale Encoder-Decoder Architecture (DAMS). The DAMS framework takes full account of the pixel distributions and value deviations that arise during the formation of document images. To balance the information embedding and extraction processes, the encoder and decoder adopt the same Channel Attention Network (CAN) module. In addition, we introduce a Self-Attention Fusion network (SAF), which performs multi-scale feature extraction and fusion over text regions. The self-attention mechanism significantly enhances the perception of text-region features, thereby improving the effectiveness of secret information embedding. Extensive experiments demonstrate that DAMS achieves state-of-the-art results, with an average accuracy of 99.99% and a PSNR of 40.52 dB under noise-free conditions, and an average accuracy of 99.32% and a PSNR of 38.24 dB under combined noise interference. The code will be released.
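The two figures quoted in the abstract, PSNR and accuracy, can be illustrated with a minimal sketch. This is not the authors' evaluation code: it assumes 8-bit pixel values (peak 255), treats "accuracy" as the fraction of correctly recovered message bits, and the function names `psnr` and `bit_accuracy` as well as the toy data are hypothetical.

```python
import math


def psnr(cover, stego, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Assumption: pixels are flattened into plain sequences and the peak
    value is 255 (8-bit images); the record does not state the bit depth.
    """
    if len(cover) != len(stego) or not cover:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between cover and stego pixels.
    mse = sum((c - s) ** 2 for c, s in zip(cover, stego)) / len(cover)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def bit_accuracy(sent_bits, decoded_bits):
    """Fraction of message bits recovered correctly (assumed meaning of 'accuracy')."""
    if len(sent_bits) != len(decoded_bits) or not sent_bits:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(sent_bits, decoded_bits) if a == b)
    return matches / len(sent_bits)


# Toy data (hypothetical): a few pixels perturbed by at most one gray level.
cover = [255, 255, 0, 0, 128, 128, 255, 0]
stego = [254, 255, 1, 0, 128, 129, 255, 0]
print(round(psnr(cover, stego), 2))

# Toy message with one flipped bit out of eight.
print(bit_accuracy([1, 0, 1, 1, 0, 0, 1, 0], [1, 0, 1, 1, 0, 1, 1, 0]))
```

Under the same 8-bit assumption, a PSNR of 40.52 dB corresponds to a mean squared error of roughly 5.8 per pixel.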
Pages: 118-131
Page count: 14