Scene categorization via contextual visual words

Cited by: 68
Authors
Qin, Jianzhao [1]
Yung, Nelson H. C. [1]
Affiliations
[1] Univ Hong Kong, Dept Elect & Elect Engn, Lab Intelligent Transportat Syst Res, Hong Kong, Hong Kong, Peoples R China
Keywords
Scene categorization; Contextual visual words; Context based vision; Pattern recognition; CLASSIFICATION; OBJECTS; IMAGES;
DOI
10.1016/j.patcog.2009.11.009
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a novel scene categorization method based on contextual visual words. In the proposed method, we extend the traditional 'bags of visual words' model by introducing contextual information from the coarser scale and neighborhood regions to the local region of interest, based on unsupervised learning. The introduced contextual information provides useful cues about the region of interest, which can reduce the ambiguity when employing visual words to represent the local regions. The improved visual words representation of the scene image is capable of enhancing the categorization performance. The proposed method is evaluated over three scene classification datasets, with 8, 13 and 15 scene categories, respectively, using 10-fold cross-validation. The experimental results show that the proposed method achieves 90.30%, 87.63% and 85.16% recognition success for Datasets 1, 2 and 3, respectively, which significantly outperforms the methods based on visual words that represent only the local information in a statistical manner. We also compared the proposed method with three representative scene categorization methods. The result confirms the superiority of the proposed method. (C) 2009 Elsevier Ltd. All rights reserved.
Pages: 1874-1888
Number of pages: 15