Learning and Recognition of On-Premise Signs From Weakly Labeled Street View Images

Cited by: 39

Authors:

Tsai, Tsung-Hung [1]

Cheng, Wen-Huang [1]

You, Chuang-Wen [1]

Hu, Min-Chun [2]

Tsui, Arvin Wen [3]

Chi, Heng-Yu [1]

Affiliations:

[1] Res Ctr Informat Technol Innovat, Taipei 115, Taiwan

[2] Natl Cheng Kung Univ, Tainan 701, Taiwan

[3] Ind Technol Res Inst, Hsinchu 310, Taiwan

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2014 / Vol. 23 / Issue 03

Keywords:

Real-world objects; street view scenes; learning and recognition; object image data set;

DOI:

10.1109/TIP.2014.2298982

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Camera-enabled mobile devices are commonly used as interaction platforms for linking the user's virtual and physical worlds in numerous research and commercial applications, such as serving as an augmented reality interface for mobile information retrieval. These application scenarios make visual recognition of everyday objects a key technique. On-premise signs (OPSs), a popular form of commercial advertising, are widely used in daily life. OPSs often exhibit great visual diversity (e.g., appearing in arbitrary sizes), accompanied by complex environmental conditions (e.g., foreground and background clutter). Observing that such real-world characteristics are lacking in most existing image data sets, in this paper, we first proposed an OPS data set, namely OPS-62, in which a total of 4649 OPS images of 62 different businesses were collected from Google's Street View. Further, to address the problem of real-world OPS learning and recognition, we developed a probabilistic framework based on distributional clustering, in which we proposed to exploit the distributional information of each visual feature (the distribution of its associated OPS labels) as a reliable selection criterion for building discriminative OPS models. Experiments on the OPS-62 data set demonstrated that our approach outperformed state-of-the-art probabilistic latent semantic analysis models, with more accurate recognition and fewer false alarms, and a significant 151.28% relative improvement in the average recognition rate. Meanwhile, our approach is simple, linear, and can be executed in parallel, making it practical and scalable for large-scale multimedia applications.

Pages: 1047-1059

Number of pages: 13