Recognition of attentive objects with a concept association network for image annotation

Cited by: 10
Authors
Fu, Hong [1 ]
Chi, Zheru [1 ]
Feng, Dagan [1 ,2 ]
Affiliations
[1] Hong Kong Polytech Univ, Dept Elect & Informat Engn, Ctr Multimedia Signal Proc, Kowloon, Hong Kong, Peoples R China
[2] Univ Sydney, Sch Informat Technol, Sydney, NSW 2006, Australia
Keywords
Image annotation; Concept association network (CAN); Attentive objects; Visual classifier; Neural network;
DOI
10.1016/j.patcog.2010.04.009
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
With the advancement of imaging and information technologies, image retrieval has become a bottleneck. The key to efficient and effective image retrieval is a text-based approach, in which automatic image annotation is a critical task. One important but insufficiently studied issue is the basic unit of an image to be labeled. A common practice is to label the segments produced by a segmentation algorithm. However, segmentation often breaks an object into pieces, which not only introduces noise into the annotation but also increases the complexity of the model. We adopt an attention-driven image interpretation method to extract attentive objects from an over-segmented image and use these attentive objects for annotation. In this way, the basic unit of annotation is upgraded from segments to attentive objects. Visual classifiers are trained and a concept association network (CAN) is constructed for object recognition. A CAN consists of a number of concept nodes, each of which is a trained neural network (visual classifier) that recognizes a single object; the nodes are connected by correlation links to form a network. Given an image containing several unknown attentive objects, all nodes in the CAN generate their own responses, which propagate to the other nodes through the network simultaneously. For a combination of nodes under investigation, these loopy propagations can be characterized by a linear system, and the combined response of the nodes is obtained by solving it. The annotation problem is thus converted into finding the node combination with the maximum response. Annotation experiments show that attentive objects yield better accuracy than segments and that the concept association network further improves annotation performance. (C) 2010 Elsevier Ltd. All rights reserved.
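The abstract's description of the CAN can be sketched as a small linear-system computation: each concept node contributes an initial classifier response, responses propagate over correlation links, and the combination of nodes with the largest combined response is selected. The sketch below is a minimal illustration only; the propagation rule r = s + W r restricted to the candidate combination, the function name `can_annotate`, the masking scheme, and the summed-response score are assumptions for illustration, not the authors' exact formulation.

```python
# Minimal sketch of the concept association network (CAN) idea from the
# abstract. The specific propagation rule and scoring are illustrative
# assumptions, not the published method.
import numpy as np
from itertools import combinations

def can_annotate(classifier_scores, link_weights, num_objects):
    """classifier_scores: (n_concepts,) initial responses of the visual
    classifiers (one neural network per concept node) to the attentive objects.
    link_weights: (n_concepts, n_concepts) correlation links between concepts.
    num_objects: number of attentive objects (labels) to select."""
    n = len(classifier_scores)
    best_combo, best_response = None, -np.inf
    for combo in combinations(range(n), num_objects):
        mask = np.zeros(n)
        mask[list(combo)] = 1.0
        # Loopy propagation among the activated nodes modeled as a linear
        # system: r = s + W r restricted to the chosen combination,
        # i.e. (I - M W M) r = M s  (an assumed form of the system).
        M = np.diag(mask)
        A = np.eye(n) - M @ link_weights @ M
        r = np.linalg.solve(A, M @ classifier_scores)
        response = r.sum()  # combined response of this node combination
        if response > best_response:
            best_combo, best_response = combo, response
    return best_combo, best_response
```

In this toy version the search enumerates all node combinations of the given size; for a realistic vocabulary one would prune candidates (e.g., keep only concepts whose classifier response exceeds a threshold) before solving the linear system.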
Pages: 3539-3547
Page count: 9