Perceptual Video Coding Based on SSIM-Inspired Divisive Normalization

Cited by: 110
Authors
Wang, Shiqi [1]
Rehman, Abdul [2]
Wang, Zhou [2]
Ma, Siwei [1]
Gao, Wen [1]
Affiliations
[1] Peking Univ, Sch Elect Engn & Comp Sci, Inst Digital Media, Beijing 100871, Peoples R China
[2] Univ Waterloo, Dept Elect & Comp Engn, Waterloo, ON N2L 3G1, Canada
Funding
Natural Sciences and Engineering Research Council of Canada; National Science Foundation (USA)
Keywords
Divisive normalization; H.264/AVC coding; perceptual video coding; rate distortion optimization; structural similarity (SSIM) index; IMAGE QUALITY ASSESSMENT; STRUCTURAL SIMILARITY; SCALE MIXTURES; OPTIMIZATION; STATISTICS; GAUSSIANS; RESPONSES; MODEL; INDEX;
DOI
10.1109/TIP.2012.2231090
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a perceptual video coding framework based on the divisive normalization scheme, which is found to be an effective approach to model the perceptual sensitivity of biological vision, but has not been fully exploited in the context of video coding. At the macroblock (MB) level, we derive the normalization factors based on the structural similarity (SSIM) index as an attempt to transform the discrete cosine transform domain frame residuals to a perceptually uniform space. We further develop an MB level perceptual mode selection scheme and a frame level global quantization matrix optimization method. Extensive simulations and subjective tests verify that, compared with the H.264/AVC video coding standard, the proposed method can achieve significant gain in terms of rate-SSIM performance and provide better visual quality.
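The idea of divisive normalization at the MB level can be illustrated with a minimal sketch. It is not the paper's derivation: the energy-based factor, the constant `c`, the 4x4 block size, and all function names below are assumptions chosen for illustration. Each block's transform-domain residual is divided by a factor that grows with local AC energy, so textured blocks are quantized more coarsely than smooth ones while the average factor stays at 1.

```python
import numpy as np

def normalization_factors(pred_coeffs, c=1e-3):
    """Per-block factors from AC energy (illustrative assumption, not the paper's formula).

    pred_coeffs: array of shape (n_blocks, 4, 4) holding transform coefficients
    of the prediction signal for each block (hypothetical input).
    """
    ac = np.asarray(pred_coeffs, dtype=float).copy()
    ac[:, 0, 0] = 0.0  # ignore the DC coefficient
    energy = np.sqrt((ac ** 2).mean(axis=(1, 2)) + c)
    return energy / energy.mean()  # normalize so the mean factor is 1

def quantize_normalized(residual, factors, qstep):
    """Divide each block's residual by its factor, then quantize uniformly."""
    normalized = residual / factors[:, None, None]
    return np.round(normalized / qstep)

def dequantize(levels, factors, qstep):
    """Invert the quantization and the normalization."""
    return levels * qstep * factors[:, None, None]
```

Under these assumptions, the effective quantization step of a block is `qstep * factor`, so a single nominal step size yields finer quantization where the assumed factor is small and coarser quantization where it is large.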
Pages: 1418-1429
Page count: 12