Recognition of attentive objects with a concept association network for image annotation

Cited by: 10

Authors:

Fu, Hong ^{[1]}

Chi, Zheru ^{[1]}

Feng, Dagan ^{[1,2]}

Affiliations:

[1] Hong Kong Polytech Univ, Dept Elect & Informat Engn, Ctr Multimedia Signal Proc, Kowloon, Hong Kong, Peoples R China

[2] Univ Sydney, Sch Informat Technol, Sydney, NSW 2006, Australia

Source:

PATTERN RECOGNITION | 2010 / Vol. 43 / Issue 10

Keywords:

Image annotation; Concept association network (CAN); Attentive objects; Visual classifier; Neural network;

DOI:

10.1016/j.patcog.2010.04.009

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

With the advancement of imaging techniques and information technologies, image retrieval has become a bottleneck. The key to efficient and effective image retrieval is a text-based approach, in which automatic image annotation is a critical task. One important issue, the metadata of the annotation, i.e., the basic unit of an image to be labeled, has not been fully studied. A common approach is to label the segments produced by a segmentation algorithm. However, segmentation often breaks an object into pieces, which not only introduces noise into the annotation but also increases the complexity of the model. We adopt an attention-driven image interpretation method to extract attentive objects from an over-segmented image and use the attentive objects for annotation. In this way, the basic unit of annotation is upgraded from segments to attentive objects. Visual classifiers are trained and a concept association network (CAN) is constructed for object recognition. A CAN consists of a number of concept nodes, each of which is a trained neural network (visual classifier) that recognizes a single object. The nodes are connected through correlation links to form a network. Given an image containing several unknown attentive objects, all the nodes in the CAN generate their own responses, which propagate simultaneously to other nodes through the network. For a combination of nodes under investigation, these loopy propagations can be characterized by a linear system, and the response of the combination can be obtained by solving that system. The annotation problem is therefore converted into finding the node combination with the maximum response. Annotation experiments show that attentive objects yield better accuracy than segments and that the concept association network improves annotation performance. (C) 2010 Elsevier Ltd. All rights reserved.
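The abstract does not give the exact form of the linear system, so the following is only a minimal sketch of the idea under stated assumptions: each selected node's steady-state response is taken to be its own classifier output plus a damped sum of its neighbours' responses, x = r + alpha * W x, restricted to the combination under investigation. The damping factor `alpha`, the scoring rule (sum of steady-state responses), the exhaustive search, and all names and numbers are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def combination_response(nodes, local, link, alpha=0.5):
    """Assumed model: solve (I - alpha * W) x = r on the chosen nodes, return sum(x)."""
    k = len(nodes)
    a = [[(1.0 if i == j else -alpha * link[nodes[i]][nodes[j]])
          for j in range(k)] for i in range(k)]
    b = [local[n] for n in nodes]
    return sum(solve_linear(a, b))


def best_combination(local, link, size, alpha=0.5):
    """Exhaustively pick the node combination with the maximum response."""
    return max(combinations(range(len(local)), size),
               key=lambda nodes: combination_response(nodes, local, link, alpha))


# Hypothetical concepts, classifier outputs, and symmetric correlation links.
concepts = ["sky", "grass", "tiger", "car"]
local = [0.6, 0.5, 0.55, 0.6]
link = [
    [0.0, 0.3, 0.1, 0.2],
    [0.3, 0.0, 0.8, 0.1],
    [0.1, 0.8, 0.0, 0.0],
    [0.2, 0.1, 0.0, 0.0],
]

best = best_combination(local, link, size=2)
print([concepts[i] for i in best])  # ['grass', 'tiger']
```

In this toy setup the two strongest individual classifiers are "sky" and "car", but the strong "grass"/"tiger" link makes that pair the highest-scoring combination, which illustrates how association between concepts can change the annotation chosen from classifier outputs alone.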

Pages: 3539-3547

Page count: 9
