Data-Driven Rate Control for Rate-Distortion Optimization in HEVC Based on Simplified Effective Initial QP Learning

Cited by: 33

Authors:

Gao, Wei [1,2]

Kwong, Sam [2,3]

Jiang, Qiuping [4]

Fong, Chi-Keung [5]

Wong, Peter H. W. [5]

Yuen, Wilson Y. F. [5]

Affiliations:

[1] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Peoples R China

[2] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China

[3] City Univ Hong Kong, Shenzhen Res Inst, Shenzhen 518057, Peoples R China

[4] Ningbo Univ, Sch Informat Sci & Engn, Ningbo 315211, Zhejiang, Peoples R China

[5] TFI Digital Media Ltd, Hong Kong, Peoples R China

Source:

IEEE TRANSACTIONS ON BROADCASTING | 2019 / Vol. 65 / No. 1

Keywords:

H.265/HEVC; video coding; initial QP; machine learning; rate control; support vector regression (SVR); RATE-QUANTIZATION MODEL; SUPPORT VECTOR MACHINE; GAME-THEORY; EFFICIENCY; MULTIMEDIA; TUTORIAL;

DOI:

10.1109/TBC.2018.2865647

Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Different from the conventional calculative methods, a learning-based initial quantization parameter (LIQP) method is proposed in this paper to improve rate control of high efficiency video coding (H.265). First, the framework for initial quantization parameter (QP) learning is proposed, where a novel equivalent approach to build the benchmark labels is proposed using the single rate-distortion (R-D) pair in each initial QP testing. With the criterion of maximizing the prediction accuracy of initial QPs, features and parameters of the learning model are refined. Instead of the traditionally used target bits per pixel (bpp) for intraframe, the target bpp for all remaining frames is proposed to avoid the empirical setting on intracoding bits, and thus the related inaccuracy can be prevented. We clearly present the motivations of the proposed LIQP method, as well as the reasons for the extracted features and model parameters. The proposed LIQP method outperforms the latest HM-16.14 by achieving significant gains on R-D performance (-15.48% BD-BR and 0.782 dB BD-PSNR gains), quality smoothness (1.581 dB versus 2.598 dB), and more stable buffer occupancy control, with similar high bit rate accuracy (99.84% versus 99.87%), and can also work well for scene change cases.


Pages: 94-108

Page count: 15

