Visual word spatial arrangement for image retrieval and classification

Cited by: 73
Authors
Penatti, Otavio A. B. [1]
Silva, Fernanda B. [1]
Valle, Eduardo [1, 2]
Gouet-Brunet, Valerie [3, 4]
Torres, Ricardo da S. [1]
Affiliations
[1] Univ Campinas Unicamp, Inst Comp, RECOD Lab, BR-13083852 Campinas, SP, Brazil
[2] Univ Campinas Unicamp, Sch Elect & Comp Engn FEEC, Dept Comp Engn & Ind Automat DCA, BR-13083852 Campinas, SP, Brazil
[3] Paris Est Univ, IGN SR, MATIS Lab, F-94160 St Mande, France
[4] CNAM, CEDRIC Lab, F-75141 Paris 03, France
Funding
São Paulo Research Foundation (FAPESP), Brazil;
Keywords
Visual words; Spatial arrangement; Image retrieval; Image classification;
DOI
10.1016/j.patcog.2013.08.012
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
We present word spatial arrangement (WSA), an approach to represent the spatial arrangement of visual words under the bag-of-visual-words model. It lies in a simple idea which encodes the relative position of visual words by splitting the image space into quadrants using each detected point as origin. WSA generates compact feature vectors and is flexible for being used for image retrieval and classification, for working with hard or soft assignment, requiring no pre/post processing for spatial verification. Experiments in the retrieval scenario show the superiority of WSA in relation to Spatial Pyramids. Experiments in the classification scenario show a reasonable compromise between those methods, with Spatial Pyramids generating larger feature vectors, while WSA provides adequate performance with much more compact features. As WSA encodes only the spatial information of visual words and not their frequency of occurrence, the results indicate the importance of such information for visual categorization. (C) 2013 Elsevier Ltd. All rights reserved.
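The quadrant idea in the abstract can be illustrated with a minimal Python sketch. The abstract only states that the image space is split into quadrants using each detected point as origin, so everything else here is an assumption made for illustration: the function names `quadrant` and `wsa_sketch`, the quadrant numbering, the handling of points that lie on an axis, the choice to count every other point's visual word per quadrant, and the lack of normalization. It is not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def quadrant(origin: Point, point: Point) -> int:
    """Return the quadrant (0..3) of `point` relative to `origin`.

    Assumed convention: points on an axis go to the quadrant with the
    non-negative offset (dx >= 0 and/or dy >= 0).
    """
    dx = point[0] - origin[0]
    dy = point[1] - origin[1]
    if dx >= 0 and dy >= 0:
        return 0
    if dx < 0 and dy >= 0:
        return 1
    if dx < 0 and dy < 0:
        return 2
    return 3


def wsa_sketch(points: Sequence[Point], words: Sequence[int],
               vocab_size: int) -> List[int]:
    """Count, for each visual word, how often it falls in each quadrant
    when every detected point is used in turn as the origin.

    Returns a flat vector of length 4 * vocab_size (hypothetical layout:
    four quadrant counters per visual word, hard assignment only).
    """
    counts = [[0] * 4 for _ in range(vocab_size)]
    for i, origin in enumerate(points):
        for j, (point, word) in enumerate(zip(points, words)):
            if i == j:
                continue  # a point is not positioned relative to itself
            counts[word][quadrant(origin, point)] += 1
    return [c for row in counts for c in row]


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0)]
    wds = [0, 1, 1]
    print(wsa_sketch(pts, wds, vocab_size=2))
```

Under these assumptions the vector holds four values per visual word, which is consistent with the abstract's claim of compact features, and it records only relative positions, not a plain frequency histogram.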
Pages: 705-720
Page count: 16