An embedded saliency map estimator scheme: Application to video encoding

Cited by: 16
Authors
Tsapatsoulis, Nicolas [1]
Rapantzikos, Konstantinos
Pattichis, Constantinos
Affiliations
[1] Univ Cyprus, Dept Comp Sci, CY-1678 Nicosia, Cyprus
[2] Natl Tech Univ Athens, Sch Elect & Comp Engn, Zografos 15780, Greece
Keywords
visual attention model; embedded implementation; ROI-based video encoding;
DOI
10.1142/S0129065707001147
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper we propose a novel saliency-based computational model for visual attention. This model processes both top-down (goal-directed) and bottom-up information. Processing in the top-down channel creates the so-called skin conspicuity map and emulates the visual search for human faces performed by humans. This is clearly a goal-directed task, but it is generic enough to be context independent. Processing in the bottom-up information channel follows the principles set by Itti et al., but deviates from them by computing the orientation, intensity and color conspicuity maps within a unified multi-resolution framework based on wavelet subband analysis. In particular, we apply a wavelet-based approach for efficient computation of the topographic feature maps. Given that wavelets and multiresolution theory are naturally connected, the use of wavelet decomposition for mimicking the center-surround process in humans is an obvious choice. However, our implementation goes further: we utilize the wavelet decomposition for inline computation of the features (such as orientation angles) that are used to create the topographic feature maps. The bottom-up topographic feature maps and the top-down skin conspicuity map are then combined through a sigmoid function to produce the final saliency map. A prototype of the proposed model was realized on the TMDSDMK642-OE DSP platform as an embedded system allowing real-time operation. For evaluation purposes, in terms of perceived visual quality and video compression improvement, an ROI-based video compression setup was followed. Extended experiments on both MPEG-1 and low bit-rate MPEG-4 video encoding were conducted, showing significant improvement in video compression efficiency without perceived deterioration in visual quality.
Pages: 289-304
Number of pages: 16
Related papers
31 in total
[1]  
CAVE KR, PSYCHOL RES, V62
[2]   Neural Mechanisms of Selective Visual Attention [J].
Moore, Tirin ;
Zirnsak, Marc .
ANNUAL REVIEW OF PSYCHOLOGY, VOL 68, 2017, 68 :47-72
[3]   Perceptual quality metrics applied to still image compression [J].
Eckert, MP ;
Bradley, AP .
SIGNAL PROCESSING, 1998, 70 (03) :177-200
[4]  
Frintrop S., 2006, VOCUS VISUAL ATTENTI
[5]  
GABORSKI RS, 2003, P ART NEUR NETW ENG
[6]  
Gonzalez Rafael C, 2002, DIGITAL IMAGE PROCES
[7]   A dynamic model of how feature cues guide spatial attention [J].
Hamker, FH .
VISION RESEARCH, 2004, 44 (05) :501-521
[8]  
*ILAB, 2007, NEURM VIS C TOOLK IN
[9]  
*IMTOO, 2007, IMTOO MPEG ENC
[10]   Automatic foveation for video compression using a neurobiological model of visual attention [J].
Itti, L .
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2004, 13 (10) :1304-1318