Machine learning based video coding optimizations: A survey

Cited by: 47
Authors
Zhang, Yun [1]
Kwong, Sam [2]
Wang, Shiqi [2]
Affiliations
[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China
[2] City Univ Hong Kong, Dept Comp Sci, Kowloon, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Video coding; High efficiency video coding; Machine learning; Mode decision; Visual quality assessment; Convolutional neural network; Deep learning; Versatile video coding; IMAGE QUALITY ASSESSMENT; TERMINATION ALGORITHM; DECISION ALGORITHM; MODE DECISION; SIZE DECISION; NETWORK; REPRESENTATION;
DOI
10.1016/j.ins.2019.07.096
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Video data has become the largest source of data consumed globally. Due to the rapid growth of video applications and the rising demand for higher quality video services, video data volume has been increasing explosively worldwide, posing the most severe challenge for multimedia computing, transmission and storage. Video coding, which compresses videos into a much smaller size, is one of the key solutions; however, its development has become saturated to some extent, even though the compression ratio has grown continuously over the last three decades. Machine learning algorithms, especially those employing deep learning, are capable of discovering knowledge from massive unstructured data and providing data-driven predictions, and thus offer new opportunities for further upgrading video coding technologies. In this article, we present a review of machine learning based video encoding optimization, aiming to provide researchers with a strong foundation and to inspire future developments in data-driven video coding. Firstly, we analyze the representations and redundancies of video data. Secondly, we review the development of video coding standards and their key requirements. Subsequently, we present a systematic survey of the recent advances and challenges associated with machine learning based video coding optimizations from three key aspects: high efficiency, low complexity and high visual quality. Their workflows, representative schemes, performance, advantages and disadvantages are analyzed in detail. Finally, the challenges and opportunities are identified, which may provide the academic and industrial communities with groundwork and potential directions for future research. (C) 2019 Published by Elsevier Inc.
Pages: 395-423
Page count: 29