On image classification: City images vs. landscapes

Cited by: 257
Authors
Vailaya, A
Jain, A [1]
Zhang, HJ
Affiliations
[1] Michigan State Univ, Dept Comp Sci, E Lansing, MI 48824 USA
[2] Broadband Informat Syst Lab, HP Labs, Palo Alto, CA 94304 USA
Keywords
image classification; clustering; salient features; similarity; image database; content-based retrieval
DOI
10.1016/S0031-3203(98)00079-X
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Grouping images into semantically meaningful categories using low-level visual features is a challenging and important problem in content-based image retrieval. Based on these groupings, effective indices can be built for an image database. In this paper, we show how a specific high-level classification problem (city images vs. landscapes) can be solved from relatively simple low-level features geared for the particular classes. We have developed a procedure to qualitatively measure the saliency of a feature towards a classification problem based on the plot of the intra-class and inter-class distance distributions. We use this approach to determine the discriminative power of the following features: color histogram, color coherence vector, DCT coefficient, edge direction histogram, and edge direction coherence vector. We determine that the edge direction-based features have the most discriminative power for the classification problem of interest here. A weighted k-NN classifier is used for classification, resulting in an accuracy of 93.9% when evaluated on an image database of 2716 images using the leave-one-out method. This approach has been extended to further classify 528 landscape images into forests, mountains, and sunset/sunrise classes. First, the input images are classified as sunset/sunrise images vs. forest and mountain images (94.5% accuracy), and then the forest and mountain images are classified as forest images or mountain images (91.7% accuracy). We are currently identifying further semantic classes to assign to images as well as extracting low-level features which are salient for these classes. Our final goal is to combine multiple 2-class classifiers into a single hierarchical classifier. © 1998 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.
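The evaluation procedure named in the abstract (a weighted k-NN classifier scored with the leave-one-out method) can be illustrated with a minimal sketch. The abstract does not state the weighting scheme, the distance measure, or the value of k, so inverse-distance weighting, Euclidean distance, and k = 3 are assumptions here, and the 2-D feature vectors are made-up stand-ins for the edge direction features, not data from the paper.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_knn_predict(train, query, k=3, eps=1e-9):
    """Predict a label for `query` from `train`, a list of (features, label).

    Each of the k nearest neighbors votes with weight 1 / distance
    (inverse-distance weighting is an assumption, not the paper's scheme).
    """
    neighbors = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = defaultdict(float)
    for features, label in neighbors:
        votes[label] += 1.0 / (euclidean(features, query) + eps)
    return max(votes, key=votes.get)


def leave_one_out_accuracy(data, k=3):
    """Hold out each sample in turn, classify it using all the others."""
    correct = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if weighted_knn_predict(rest, features, k) == label:
            correct += 1
    return correct / len(data)


# Hypothetical toy features; the real system uses edge direction descriptors.
toy = [
    ((0.90, 0.10), "city"),
    ((0.80, 0.20), "city"),
    ((0.85, 0.15), "city"),
    ((0.20, 0.80), "landscape"),
    ((0.10, 0.90), "landscape"),
    ((0.15, 0.85), "landscape"),
]

accuracy = leave_one_out_accuracy(toy, k=3)
print(f"leave-one-out accuracy: {accuracy:.3f}")
```

Leave-one-out uses every image once as the test sample, which suits a nearest-neighbor classifier because no retraining is needed between folds; only the held-out sample is excluded from the neighbor search.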
Pages: 1921-1935
Page count: 15