Foveation scalable video coding with automatic fixation selection

Cited by: 147
Authors
Wang, Z [1]
Lu, LG
Bovik, AC
Affiliations
[1] Univ Texas, LIVE, Austin, TX 78712 USA
[2] NYU, LCV, New York, NY 10003 USA
[3] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA
Keywords
foveation; human visual system; image and video quality; rate scalable coding; video coding; wavelet;
DOI
10.1109/TIP.2003.809015
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Image and video coding is an optimization problem. A successful image and video coding algorithm delivers a good tradeoff between visual quality and other coding performance measures, such as compression, complexity, scalability, robustness, and security. In this paper, we follow two recent trends in image and video coding research. One is to incorporate human visual system (HVS) models to improve the current state-of-the-art of image and video coding algorithms by better exploiting the properties of the intended receiver. The other is to design rate scalable image and video codecs, which allow the extraction of coded visual information at continuously varying bit rates from a single compressed bitstream. Specifically, we propose a foveation scalable video coding (FSVC) algorithm which supplies good quality-compression performance as well as effective rate scalability. The key idea is to organize the encoded bitstream to provide the best decoded video at an arbitrary bit rate in terms of foveated visual quality measurement. A foveation-based HVS model plays an important role in the algorithm. The algorithm is adaptable to different applications, such as knowledge-based video coding and video communications over time-varying, multiuser and interactive networks.
Pages: 243 - 254
Page count: 12
Related papers
63 entries in total
  • [1] Adelson EH., 1984, RCA Engineer, V29, P33
  • [2] [Anonymous], VISUAL MODELS TARGET
  • [3] [Anonymous], 2003, HDB VIDEO DATABASES
  • [4] [Anonymous], P 13 S COMP GEOM
  • [5] Image coding using wavelet transform
    Antonini, Marc
    Barlaud, Michel
    Mathieu, Pierre
    Daubechies, Ingrid
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 1992, 1 (02) : 205 - 220
  • [6] Aramvith S., 2000, HDB IMAGE VIDEO PROC
  • [7] Visual detection following retinal damage: Predictions of an inhomogeneous retino-cortical model
    Arnow, TL
    Geisler, WS
    [J]. LASER-INFLICTED EYE INJURIES: EPIDEMIOLOGY, PREVENTION, AND TREATMENT, PROCEEDINGS OF, 1996, 2674 : 119 - 130
  • [8] A frequency-domain video transcoder for dynamic bit-rate reduction of MPEG-2 bit streams
    Assunção, PAA
    Ghanbari, M
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 1998, 8 (08) : 953 - 967
  • [9] BANDERA C, 1989, 1989 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS, VOLS 1-3, P596, DOI 10.1109/ICSMC.1989.71367
  • [10] PERIPHERAL SPATIAL VISION - LIMITS IMPOSED BY OPTICS, PHOTORECEPTORS, AND RECEPTOR POOLING
    BANKS, MS
    SEKULER, AB
    ANDERSON, SJ
    [J]. JOURNAL OF THE OPTICAL SOCIETY OF AMERICA A-OPTICS IMAGE SCIENCE AND VISION, 1991, 8 (11) : 1775 - 1787