End-to-end optimized image compression with the frequency-oriented transform

Citations: 0
Authors
Zhang, Yuefeng [1]
Lin, Kai [2]
Affiliations
[1] Beijing Inst Comp Technol & Applicat, 51th Yongding Rd, Beijing 100039, Peoples R China
[2] Peking Univ, Sch Comp Sci, Beijing 100871, Peoples R China
Keywords
Image compression; Image processing; Computer vision; Machine learning
DOI
10.1007/s00138-023-01507-x
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Image compression is a significant challenge in the era of information explosion. Recent studies have shown that learning-based image compression methods built on deep learning outperform traditional codecs. However, these methods inherently lack interpretability. Following an analysis of how compression degradation varies across frequency bands, we propose an end-to-end optimized image compression model built on a frequency-oriented transform. The model consists of four components: spatial sampling, frequency-oriented transform, entropy estimation, and frequency-aware fusion. The frequency-oriented transform separates the original image signal into distinct frequency bands, which aligns with a human-interpretable concept. Leveraging the non-overlapping hypothesis, the model enables scalable coding through the selective transmission of arbitrary frequency components. Extensive experiments demonstrate that our model outperforms all traditional codecs, including the next-generation standard H.266/VVC, on the MS-SSIM metric. Moreover, visual analysis tasks (i.e., object detection and semantic segmentation) are conducted to verify that the proposed compression method preserves semantic fidelity in addition to signal-level precision.
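The idea of non-overlapping frequency bands and selective transmission can be illustrated with a small sketch. This is not the paper's learned transform: it swaps in a fixed discrete Fourier transform with hand-picked band edges, and the names `dft2`, `split_bands`, `merge`, and the `edges` parameter are hypothetical, introduced only for this illustration. Because the bands do not overlap, summing all of them recovers the image, and summing a subset gives a coarser reconstruction.

```python
import cmath

def dft2(img, sign=-1):
    """Naive 2-D DFT of a square list-of-lists image (tiny demos only).
    sign=-1 is the forward transform; sign=+1 is the unscaled inverse."""
    n = len(img)
    w = [cmath.exp(sign * 2j * cmath.pi * k / n) for k in range(n)]
    return [[sum(img[y][x] * w[(u * y + v * x) % n]
                 for y in range(n) for x in range(n))
             for v in range(n)] for u in range(n)]

def split_bands(img, edges):
    """Split img into len(edges) + 1 non-overlapping frequency bands.
    A coefficient's radius is max(|fu|, |fv|); band k keeps the radii in
    [bounds[k], bounds[k + 1]). Fixed masks are an assumption made for
    illustration, standing in for a learned transform."""
    n = len(img)
    spec = dft2(img)
    bounds = [0] + list(edges) + [n]  # the radius never exceeds n // 2
    bands = []
    for lo, hi in zip(bounds, bounds[1:]):
        masked = [[spec[u][v]
                   if lo <= max(min(u, n - u), min(v, n - v)) < hi else 0
                   for v in range(n)] for u in range(n)]
        rec = dft2(masked, sign=+1)
        # The masks are symmetric in frequency, so each band is real-valued.
        bands.append([[c.real / (n * n) for c in row] for row in rec])
    return bands

def merge(bands):
    """Sum a list of band images pixel by pixel."""
    n = len(bands[0])
    return [[sum(b[y][x] for b in bands) for x in range(n)] for y in range(n)]

# Synthetic 8x8 test pattern (not real image data).
img = [[(3 * y + 5 * x) % 11 for x in range(8)] for y in range(8)]
bands = split_bands(img, edges=[1, 3])  # low, mid, high
full = merge(bands)        # all bands: matches img up to rounding error
coarse = merge(bands[:2])  # low + mid only: a coarser approximation
```

Dropping the highest band here plays the role of transmitting fewer components at a lower bitrate; a real codec would also quantize and entropy-code each band, which this sketch omits.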
Pages: 14
Related papers
50 in total
  • [21] Bi-directional prediction for end-to-end optimized video compression
    Racape, Fabien
    Begaint, Jean
    Feltman, Simon
    Pushparaja, Akshay
    APPLICATIONS OF DIGITAL IMAGE PROCESSING XLIV, 2021, 11842
  • [22] End-to-end Optimized Video Compression with MV-Residual Prediction
    Wu, XiangJi
    Zhang, Ziwen
    Feng, Jie
    Zhou, Lei
    Wu, Junmin
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020, : 611 - 614
  • [23] Perceptual Quality-Oriented Rate Allocation via Distillation from End-to-End Image Compression
    Yang, Runyu
    Liu, Dong
    Ma, Siwei
    Wu, Feng
    Gao, Wen
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (07)
  • [24] End-to-end image compression method based on perception metric
    Shuai Liu
    Yingcong Huang
    Huoxiang Yang
    Yongsheng Liang
    Wei Liu
    Signal, Image and Video Processing, 2022, 16 : 1803 - 1810
  • [25] End-to-end system consideration of the Galileo image compression system
    Cheung, K
    Tong, K
    Belongie, M
    IGARSS '96 - 1996 INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM: REMOTE SENSING FOR A SUSTAINABLE FUTURE, VOLS I - IV, 1996, : 1035 - 1038
  • [26] End-to-End Learning-Based Image Compression: A Review
    Chen Jimin
    Lin Zehao
    LASER & OPTOELECTRONICS PROGRESS, 2020, 57 (22)
  • [27] End-to-End Learned Image Compression with Augmented Normalizing Flows
    Ho, Yung-Han
    Chan, Chih-Chun
    Peng, Wen-Hsiao
    Hang, Hsueh-Ming
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 1931 - 1935
  • [28] An end-to-end spike-based image compression architecture
    Doutsi, Effrosyni
    Antonini, Marc
    Tsakalides, Panagiotis
    2020 54TH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS, AND COMPUTERS, 2020, : 818 - 820
  • [29] End-to-end image compression method based on perception metric
    Liu, Shuai
    Huang, Yingcong
    Yang, Huoxiang
    Liang, Yongsheng
    Liu, Wei
    SIGNAL IMAGE AND VIDEO PROCESSING, 2022, 16 (07) : 1803 - 1810
  • [30] Estimating the resize parameter in end-to-end learned image compression
    Chen, Li-Heng
    Bampis, Christos G.
    Li, Zhi
    Krasula, Lukas
    Bovik, Alan C.
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2025, 135