Fast intra-coding unit partition decision in H.266/FVC based on deep learning

Cited by: 15

Authors:

Amna, Maraoui ^{[1]}

Imen, Werda ^{[2]}

Ezahra, Sayadi Fatma ^{[1]}

Mohamed, Atri ^{[3]}

Affiliations:

[1] Univ Monastir, Fac Sci Monastir, Elect & Microelect Lab, Environm St, Monastir 5019, Tunisia

[2] Elect & Informat Technol Lab, Sfax, Tunisia

[3] King Khalid Univ, Coll Comp Sci, Abha, Saudi Arabia

Source:

JOURNAL OF REAL-TIME IMAGE PROCESSING | 2020 / Vol. 17 / Issue 06

Keywords:

H266; Future video coding; Intra-coding; Convolutional neural network; Quad-tree plus binary-tree; Rate distortion optimization;

DOI:

10.1007/s11554-020-00998-5

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In the recent Future Video Coding (FVC) standard developed by the Joint Video Exploration Team (JVET), the quad-tree plus binary-tree (QTBT) block partition module uses rectangular block shapes and additional square block sizes compared with the quad-tree (QT) block partitioning module of its predecessor, the High-Efficiency Video Coding (HEVC) standard. The block flexibility introduced by the QTBT module significantly improves compression performance, but it dramatically increases coding complexity because of the brute-force search used for Rate Distortion Optimization (RDO). To cope with this issue, it is necessary to consider the unique characteristics of QTBT in FVC. In this paper, we propose a fast QT partitioning algorithm based on a deep convolutional neural network (CNN) model that predicts the coding unit (CU) partition instead of relying on RDO, which considerably improves QTBT performance for intra-mode coding. Based on a suitably diversified database of CU partition patterns, the optimization process uses a three-level CNN structure developed to learn the split or non-split decision from the established database. Experimental results show that the proposed algorithm accelerates the QTBT block partition structure, reducing intra-mode encoding time by an average of 35% with a bit rate increase of 1.7%, which allows its application in practical scenarios.
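
The idea summarized in the abstract, replacing the exhaustive RDO search with a learned split or non-split decision at each quad-tree level, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: `partition_qt` and `toy_predictor` are hypothetical names, the block sizes (64 down to a minimum of 8) are assumed for the example, and the predictor is a hand-written rule standing in for the trained CNN.

```python
from typing import Callable, List, Tuple

# A leaf coding unit, described as (x, y, size). Hypothetical representation.
Block = Tuple[int, int, int]


def partition_qt(
    x: int,
    y: int,
    size: int,
    predict_split: Callable[[int, int, int], bool],
    min_size: int = 8,
) -> List[Block]:
    """Recursively quad-split a square block wherever the predictor says 'split'.

    In the paper's setting the predictor would be a trained CNN; here it is
    any callable, so no rate-distortion cost is evaluated at all.
    """
    if size <= min_size or not predict_split(x, y, size):
        return [(x, y, size)]  # non-split: this block becomes a leaf CU
    half = size // 2
    leaves: List[Block] = []
    for dy in (0, half):
        for dx in (0, half):
            leaves.extend(
                partition_qt(x + dx, y + dy, half, predict_split, min_size)
            )
    return leaves


def toy_predictor(x: int, y: int, size: int) -> bool:
    """Stand-in for the CNN (assumption): always split blocks larger than 32,
    and split the 32x32 block at the origin once more."""
    return size > 32 or (x == 0 and y == 0 and size > 16)


if __name__ == "__main__":
    for leaf in partition_qt(0, 0, 64, toy_predictor):
        print(leaf)
```

Because each decision is a single predictor call, the recursion visits only the blocks that end up in the final partition or on the path to it, which is where the encoding-time saving over a brute-force search would come from.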

Pages: 1971-1981

Page count: 11
