Spatiotemporal visual saliency guided perceptual high efficiency video coding with neural network

Cited by: 38
Authors
Zhu, Shiping [1]
Xu, Ziyao [1]
Affiliations
[1] Beihang Univ, Sch Instrumentat Sci & Optoelect Engn, Dept Measurement Control & Informat Technol, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Perception; HD video; Saliency; Video compression; HEVC; RATE-DISTORTION OPTIMIZATION; MODEL;
DOI
10.1016/j.neucom.2017.08.054
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Perceptual video coding systems have been developed to optimize compression on the basis of different attributes of the human visual system, and attention-based coding is considered an important part of them. Saliency maps, which represent the region of interest (ROI) in the video signal, have become a reliable tool thanks to advances in computer performance and visual algorithms. In the present study, we propose a hybrid compression algorithm that uses a deep convolutional neural network to compute the spatial saliency and then extracts the temporal saliency from compressed-domain motion information. The level of uncertainty is calculated and used to combine the two into the video's saliency map. Afterwards, the QP search range is dynamically adjusted in HEVC, and a rate-distortion calculation method is proposed to choose the mode and guide the allocation of bits during video compression. Experimental results show the superiority of the proposed method over state-of-the-art perceptual coding algorithms in terms of saliency detection and perceptual compression quality. (C) 2017 Elsevier B.V. All rights reserved.
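The abstract gives no formulas, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes inverse-uncertainty weighting for the fusion step and a simple linear mapping from block saliency to a QP offset; the function names, the 16-pixel block size, and the `max_delta` parameter are all hypothetical. The only fixed fact used is that HEVC QP values for 8-bit video lie in the range 0 to 51.

```python
import numpy as np

def fuse_saliency(spatial, temporal, u_spatial, u_temporal, eps=1e-6):
    """Combine two saliency maps, trusting the less uncertain one more.

    Assumption: weights are the inverse of each map's uncertainty.
    The paper's actual fusion rule is not given in the abstract.
    """
    w_s = 1.0 / (u_spatial + eps)
    w_t = 1.0 / (u_temporal + eps)
    return (w_s * spatial + w_t * temporal) / (w_s + w_t)

def block_qp_map(saliency, base_qp=32, max_delta=4, block=16):
    """Map per-block mean saliency to a QP value (hypothetical rule).

    Salient blocks get a lower QP (finer quantization), non-salient
    blocks a higher QP. Results are clipped to the HEVC range 0..51.
    """
    h, w = saliency.shape
    rows, cols = h // block, w // block
    # Crop to whole blocks, then average the saliency inside each block.
    cropped = saliency[:rows * block, :cols * block]
    mean_sal = cropped.reshape(rows, block, cols, block).mean(axis=(1, 3))
    lo, hi = mean_sal.min(), mean_sal.max()
    if hi > lo:
        norm = (mean_sal - lo) / (hi - lo)
    else:
        # Flat map: treat every block as medium saliency (no QP change).
        norm = np.full_like(mean_sal, 0.5)
    qp = base_qp + np.rint(max_delta * (1.0 - 2.0 * norm)).astype(int)
    return np.clip(qp, 0, 51)

# Toy frame: only the top-left 16x16 block is salient.
spatial = np.zeros((32, 32))
spatial[:16, :16] = 1.0
temporal = spatial.copy()
fused = fuse_saliency(spatial, temporal, u_spatial=0.5, u_temporal=0.5)
qp_map = block_qp_map(fused)
```

In this toy case the salient block ends up `max_delta` below `base_qp` and the remaining blocks `max_delta` above it. A real encoder integration would instead constrain the QP search range and the rate-distortion cost, as the abstract describes.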
Pages: 511-522
Page count: 12