Deep Learning-Based Video Coding: A Review and a Case Study

Cited by: 127

Authors:

Liu, Dong [1]

Li, Yue [1]

Lin, Jianping [1]

Li, Houqiang [1]

Wu, Feng [1]

Affiliations:

[1] Univ Sci & Technol China, CAS Key Lab Technol Geospatial Informat Proc & Ap, 443 Huangshan Rd, Hefei 230027, Anhui, Peoples R China

Source:

ACM COMPUTING SURVEYS | 2020 / Vol. 53 / No. 01

Keywords:

Deep learning; image coding; prediction; transform; video coding; IMAGE COMPRESSION; NEURAL-NETWORK; FRAMEWORK;

DOI:

10.1145/3368405

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202 ;

Abstract:

The past decade has witnessed the great success of deep learning in many disciplines, especially in computer vision and image processing. However, deep learning-based video coding remains in its infancy. We review representative works on using deep learning for image/video coding, which has been an actively developing research area since 2015. We divide the related works into two categories: new coding schemes that are built primarily upon deep networks, and deep network-based coding tools that are to be used within traditional coding schemes. For deep schemes, pixel probability modeling and auto-encoders are the two approaches, which can be viewed as predictive coding and transform coding, respectively. For deep tools, there have been several techniques using deep learning to perform intra-picture prediction, inter-picture prediction, cross-channel prediction, probability distribution prediction, transform, post- or in-loop filtering, down- and up-sampling, as well as encoding optimizations. In the hope of promoting research on deep learning-based video coding, we present a case study of our developed prototype video codec, Deep Learning Video Coding (DLVC). DLVC features two deep tools, both based on convolutional neural networks (CNNs), namely a CNN-based in-loop filter and CNN-based block adaptive resolution coding. The source code of DLVC has been released for future research.


Pages: 35

References: 164 in total

