Enhancing Distributed Source Coding With Encoder-Centric Frequency Adaptation and Spatial Transformation

Citations: 0
Authors
Xu, Hao [1]
Tan, Bin [2]
Chen, Yihao [1]
Hu, Die [3]
Wu, Jun [4, 5]
Affiliations
[1] Tongji Univ, Coll Elect & Informat Engn, Shanghai 201804, Peoples R China
[2] Jinggangshan Univ, Coll Elect & Informat Engn, Jian 343009, Peoples R China
[3] Fudan Univ, Sch Commun Sci & Engn, Shanghai 200433, Peoples R China
[4] Fudan Univ, Sch Comp Sci, Shanghai 200433, Peoples R China
[5] Fudan Univ, Shanghai Key Lab Intelligent Informat Proc, Shanghai 200433, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Frequency-domain analysis; Image coding; Feature extraction; Decoding; Image reconstruction; Source coding; Optimization; Correlation; Information filters; Transformers; Distributed source coding; Frequency-domain filtering; Affine transformation; Information
DOI
10.1109/TMM.2024.3521700
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Existing work on distributed source coding has predominantly investigated decoder-focused strategies that emphasize the alignment and exploitation of side information. This study instead presents an encoder-centric algorithm that performs proactive optimization in the frequency domain. The shift is motivated by the tendency of current deep learning models to passively extract high-frequency elements, such as contours and content, in the spatial domain at the encoder side without considering the frequency characteristics of these spatial components. In contrast, the proposed scheme selects the essential frequency components directly in the frequency domain by introducing an adaptive self-learning filter, enabling the encoder to discern and retain critical frequency components precisely. Furthermore, to make better use of the side information, we align it in the spatial domain before feature extraction using an affine transformation-based alignment strategy. By leveraging the shared frequency-domain components of the image pairs, the proposed algorithm learns affine coefficients that accomplish precise spatial alignment. This dual strategy of proactive encoder optimization and decoder alignment via affine transformations outperforms existing state-of-the-art distributed source coding methods by an average of 0.5 dB in PSNR when tested on two diverse datasets.
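For readers unfamiliar with the metric, the reported 0.5 dB average gain can be put in terms of mean squared error. The following is a minimal sketch of the standard PSNR definition, not code from the paper; the function names, the 8-bit peak value of 255, and the sample pixel values are all assumptions made for illustration.

```python
import math

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(test):
        raise ValueError("inputs must have the same length")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)

def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (-gain_db / 10.0)

# Made-up 8-bit pixel values, for illustration only.
reference = [52, 55, 61, 59, 79, 61, 76, 61]
decoded = [54, 55, 60, 62, 77, 61, 75, 63]

print(f"PSNR: {psnr(reference, decoded):.2f} dB")
print(f"MSE ratio for +0.5 dB: {mse_ratio_for_gain(0.5):.3f}")
```

Because PSNR is logarithmic in the MSE, a 0.5 dB gain at a fixed peak value corresponds to an MSE roughly 11% lower, independent of the absolute PSNR level.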
Pages: 2582-2592
Page count: 11