Advances in Video Compression System Using Deep Neural Network: A Review and Case Studies

Cited by: 34
Authors
Ding, Dandan [1]
Ma, Zhan [2]
Chen, Di [3]
Chen, Qingshuang [4]
Liu, Zoe [5]
Zhu, Fengqing [4]
Affiliations
[1] Hangzhou Normal Univ, Sch Informat Sci & Engn, Hangzhou 311121, Peoples R China
[2] Nanjing Univ, Sch Elect Sci & Engn, Nanjing 210093, Peoples R China
[3] Google Inc, Mountain View, CA 94043 USA
[4] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA
[5] Visionular Inc, Los Altos, CA 94022 USA
Funding
National Natural Science Foundation of China; U.S. National Science Foundation
Keywords
Encoding; Video compression; Video coding; Streaming media; Visualization; Quality of experience; Spatiotemporal phenomena; Deep learning; Neural networks; Adaptive filters; deep neural networks (DNNs); neural video coding; texture analysis; PREDICTION; MODEL; SEGMENTATION; STATISTICS; FRAMEWORK; STANDARD; SCALE;
DOI
10.1109/JPROC.2021.3059994
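Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract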
Significant advances in video compression systems have been made in the past several decades to satisfy the near-exponential growth of Internet-scale video traffic. From the application perspective, we have identified three major functional blocks, including preprocessing, coding, and postprocessing, which have been continuously investigated to maximize the end-user quality of experience (QoE) under a limited bit rate budget. Recently, artificial intelligence (AI)-powered techniques have shown great potential to further increase the efficiency of the aforementioned functional blocks, both individually and jointly. In this article, we review recent technical advances in video compression systems extensively, with an emphasis on deep neural network (DNN)-based approaches, and then present three comprehensive case studies. On preprocessing, we show a switchable texture-based video coding example that leverages DNN-based scene understanding to extract semantic areas for the improvement of a subsequent video coder. On coding, we present an end-to-end neural video coding framework that takes advantage of the stacked DNNs to efficiently and compactly code input raw videos via fully data-driven learning. On postprocessing, we demonstrate two neural adaptive filters to, respectively, facilitate the in-loop and postfiltering for the enhancement of compressed frames. Finally, a companion website hosting the contents developed in this work can be accessed publicly at https://purdueviper.github.io/dnn-coding/.
Pages: 1494-1520
Page count: 27