On image classification: City images vs. landscapes

Cited by: 258

Authors:

Vailaya, A

Jain, A [1]

Zhang, HJ

Affiliations:

[1] Michigan State Univ, Dept Comp Sci, E Lansing, MI 48824 USA

[2] Broadband Informat Syst Lab, HP Labs, Palo Alto, CA 94304 USA

Source:

PATTERN RECOGNITION | 1998 / Vol. 31 / No. 12

Keywords:

image classification; clustering; salient features; similarity; image database; content-based retrieval;

DOI:

10.1016/S0031-3203(98)00079-X

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Grouping images into semantically meaningful categories using low-level visual features is a challenging and important problem in content-based image retrieval. Based on these groupings, effective indices can be built for an image database. In this paper, we show how a specific high-level classification problem (city images vs. landscapes) can be solved from relatively simple low-level features geared for the particular classes. We have developed a procedure to qualitatively measure the saliency of a feature towards a classification problem based on the plot of the intra-class and inter-class distance distributions. We use this approach to determine the discriminative power of the following features: color histogram, color coherence vector, DCT coefficient, edge direction histogram, and edge direction coherence vector. We determine that the edge direction-based features have the most discriminative power for the classification problem of interest here. A weighted k-NN classifier is used for the classification, which results in an accuracy of 93.9% when evaluated on an image database of 2716 images using the leave-one-out method. This approach has been extended to further classify 528 landscape images into forests, mountains, and sunset/sunrise classes. First, the input images are classified as sunset/sunrise images vs. forest & mountain images (94.5% accuracy), and then the forest & mountain images are classified as forest images or mountain images (91.7% accuracy). We are currently identifying further semantic classes to assign to images as well as extracting low-level features which are salient for these classes. Our final goal is to combine multiple 2-class classifiers into a single hierarchical classifier. (C) 1998 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.
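
The evaluation described in the abstract (a weighted k-NN classifier scored with the leave-one-out method) can be illustrated with a minimal sketch. The abstract does not state the weighting scheme, the distance metric, or the value of k, so the inverse-distance weights, Euclidean distance, and k=3 used here are assumptions. The function names and the toy 2-D vectors, which stand in for real edge direction features, are hypothetical.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train_x, train_y, query, k=3, eps=1e-9):
    """Label `query` by inverse-distance-weighted voting among its k nearest
    training vectors (weighting scheme and metric are assumptions)."""
    # Sort training samples by Euclidean distance to the query.
    neighbours = sorted(
        ((math.dist(x, query), y) for x, y in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )
    votes = defaultdict(float)
    for distance, label in neighbours[:k]:
        # Closer neighbours contribute larger votes; eps avoids division by zero.
        votes[label] += 1.0 / (distance + eps)
    return max(votes, key=votes.get)


def leave_one_out_accuracy(xs, ys, k=3):
    """Hold out each sample in turn, classify it from the rest, and return
    the fraction of held-out samples labelled correctly."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if weighted_knn_predict(train_x, train_y, xs[i], k) == ys[i]:
            correct += 1
    return correct / len(xs)


# Hypothetical toy feature vectors: two well-separated clusters.
toy_features = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),   # "city"
    (1.0, 1.0), (0.9, 1.0), (1.0, 0.9),   # "landscape"
]
toy_labels = ["city"] * 3 + ["landscape"] * 3

if __name__ == "__main__":
    print(leave_one_out_accuracy(toy_features, toy_labels, k=3))
```

Leave-one-out uses every image for testing while training on all the others, which is practical for a database of a few thousand images such as the one reported in the abstract.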

References

Pages: 1921-1935

Page count: 15

21 entries in total

[1] [Anonymous], 1995, PROC ICJAI, DOI 10.1145/217279.215068

[2] [Anonymous], 1998, IEEE INT WORKSH CONT

[3] BOLLE RM, 1996, IN PRESS IBM J RES D

[4] Faloutsos C., 1994, Journal of Intelligent Information Systems: Integrating Artificial Intelligence and Database Technologies, V3, P231, DOI 10.1007/BF00962238

[5] FORSYTH DA, 1996, INT WORKSH OBJ REC C

[6] GORKANI MM, 1994, 12 INT C PATT REC JE, P459

[7] Virage video engine [J]. Hampapur, A; Gupta, A; Horowitz, B; Shu, CF; Fuller, C; Bach, J; Gorkani, M; Jain, R. STORAGE AND RETRIEVAL FOR IMAGE AND VIDEO DATABASES V, 1997, 3022: 188-198

[8] VISUAL-PATTERN RECOGNITION BY MOMENT INVARIANTS [J]. HU, M. IRE TRANSACTIONS ON INFORMATION THEORY, 1962, 8 (02): 179-&

[9] Image retrieval using color and shape [J]. Jain, AK; Vailaya, A. PATTERN RECOGNITION, 1996, 29 (08): 1233-1244

[10] Jain K, 1988, Algorithms for clustering data
