Reinforcement Learning for HEVC/H.265 Intra-Frame Rate Control

Cited by: 54
Authors
Hu, Jun-Hao [1]
Peng, Wen-Hsiao [1]
Chung, Chia-Hua [1]
Affiliations
[1] Natl Chiao Tung Univ, Dept Comp Sci, Hsinchu, Taiwan
Source
2018 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS) | 2018
DOI
10.1109/ISCAS.2018.8351575
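Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract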
Reinforcement learning has proven effective for solving decision-making problems. However, its application to modern video codecs has yet to be seen. This paper presents an early attempt to introduce reinforcement learning to HEVC/H.265 intra-frame rate control. The task is to determine a quantization parameter value for every coding tree unit in a frame, with the objective of minimizing the frame-level distortion subject to a rate constraint. We draw an analogy between the rate control problem and the reinforcement learning problem by considering the texture complexity of coding tree units and the bit balance as the environment state, the quantization parameter value as an action that an agent needs to take, and the negative distortion of the coding tree unit as an immediate reward. We train a neural network based on Q-learning to be our agent, which observes the state to evaluate the reward for each possible action. When trained on only a limited set of sequences, the proposed model can already perform comparably with the rate control algorithm in HM-16.15.
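The state/action/reward mapping described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the paper trains a neural network, whereas the sketch below swaps in a tabular Q-table to stay short. The rate-distortion model (`toy_encode`), the candidate QP set, the frame budget, the state bucketing, and the end-of-frame overshoot penalty are all hypothetical values chosen for illustration and do not come from the paper or from HM.

```python
import random
from collections import defaultdict

# All constants below are illustrative assumptions, not values from the paper.
QPS = (22, 27, 32, 37)      # candidate quantization parameters (the actions)
NUM_CTUS = 8                # coding tree units per toy "frame"
FRAME_BUDGET = 1200.0       # frame-level bit budget (the rate constraint)
OVERSHOOT_PENALTY = 50.0    # assumed penalty if the frame exceeds its budget


def toy_encode(complexity, qp):
    """Toy stand-in for an encoder: a higher QP costs fewer bits but adds distortion."""
    bits = 60.0 * complexity * 2.0 ** ((32 - qp) / 6.0)
    distortion = complexity * 2.0 ** ((qp - 22) / 6.0)
    return bits, distortion


def make_state(complexity, bits_left, ctus_left):
    """State = (texture complexity of the CTU, bucketed bit balance)."""
    per_ctu = bits_left / ctus_left
    balance = 0 if per_ctu < 75 else (1 if per_ctu < 150 else 2)
    return (complexity, balance)


def run_episode(q, rng, epsilon, alpha=0.1, gamma=1.0, learn=True):
    """Encode one toy frame CTU by CTU, optionally applying Q-learning updates."""
    complexities = [rng.choice((1, 2, 3)) for _ in range(NUM_CTUS)]
    bits_left = FRAME_BUDGET
    total_distortion = 0.0
    state = make_state(complexities[0], bits_left, NUM_CTUS)
    for i, complexity in enumerate(complexities):
        # Epsilon-greedy choice of a QP index.
        if rng.random() < epsilon:
            action = rng.randrange(len(QPS))
        else:
            action = max(range(len(QPS)), key=lambda a: q[state][a])
        bits, distortion = toy_encode(complexity, QPS[action])
        bits_left -= bits
        total_distortion += distortion
        reward = -distortion  # immediate reward: negative CTU distortion
        last = i == NUM_CTUS - 1
        if last:
            if bits_left < 0:
                reward -= OVERSHOOT_PENALTY
            target = reward
        else:
            next_state = make_state(complexities[i + 1], bits_left, NUM_CTUS - i - 1)
            target = reward + gamma * max(q[next_state])
        if learn:
            # Standard Q-learning update toward the bootstrapped target.
            q[state][action] += alpha * (target - q[state][action])
        if not last:
            state = next_state
    return total_distortion, bits_left


def train(episodes=2000, seed=0):
    """Train a Q-table on randomly generated toy frames."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0] * len(QPS))
    for _ in range(episodes):
        run_episode(q, rng, epsilon=0.2)
    return q
```

Calling `train()` and then `run_episode(q, random.Random(1), epsilon=0.0, learn=False)` would encode one toy frame greedily with the learned table and return its total distortion and remaining bit balance. A real agent would replace `toy_encode` with actual encoder feedback and the table with a function approximator.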
Pages: 5