Large-Scale Aerial Image Categorization Using a Multitask Topological Codebook

Cited by: 24

Authors:

Zhang, Luming [1]

Wang, Meng [1]

Hong, Richang [1]

Yin, Bao-Cai [2]

Li, Xuelong [3]

Affiliations:

[1] Hefei Univ Technol, Comp Sci & Informat Engn Dept, Hefei 230009, Peoples R China

[2] Beijing Univ Technol, Sch Transportat, Beijing 100124, Peoples R China

[3] Chinese Acad Sci, Xian Inst Opt & Precis Mech, State Key Lab Transient Opt & Photon, Ctr Opt Imagery Anal & Learning, Xian 710119, Peoples R China

Source:

IEEE TRANSACTIONS ON CYBERNETICS | 2016 / Vol. 46 / No. 02

Funding:

National Natural Science Foundation of China;

Keywords:

Aerial image; discriminatively learning; large-scale; multitask; realtime; topology; MATCHING KERNEL; INFORMATION; MULTICLASS; SELECTION; FEATURES;

DOI:

10.1109/TCYB.2015.2408592

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Quickly and accurately categorizing the millions of aerial images on Google Maps is a useful technique in pattern recognition. Existing methods cannot handle this task successfully for two reasons: 1) the aerial images' topologies are the key feature for distinguishing their categories, but they cannot be effectively encoded by a conventional visual codebook and 2) it is challenging to build a realtime image categorization system, as some geo-aware Apps update over 20 aerial images per second. To solve these problems, we propose an efficient aerial image categorization algorithm. It focuses on learning a discriminative topological codebook of aerial images under a multitask learning framework. The pipeline can be summarized as follows. We first construct a region adjacency graph (RAG) that describes the topology of each aerial image. Naturally, aerial image categorization can be formulated as RAG-to-RAG matching. According to graph theory, RAG-to-RAG matching is conducted by enumeratively comparing all their respective graphlets (i.e., small subgraphs). To alleviate the high time consumption, we propose to learn a codebook containing topologies jointly discriminative to multiple categories. The learned topological codebook guides the extraction of the discriminative graphlets. Finally, these graphlets are integrated into an AdaBoost model for predicting aerial image categories. Experimental results show that our approach is competitive with several existing recognition models. Furthermore, over 24 aerial images are processed per second, demonstrating that our approach is ready for real-world applications.
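As an illustration of the first two pipeline steps named in the abstract, the sketch below builds a region adjacency graph from a small grid of region labels and then enumerates graphlets by brute force. It is not the authors' implementation: the abstract does not say how images are segmented, how large graphlets may be, or how they are described, so the 4-connectivity rule, the `max_size` parameter, and the function names `build_rag`, `is_connected`, and `enumerate_graphlets` are all assumptions made for this example.

```python
from itertools import combinations


def build_rag(labels):
    """Map each region label to the set of labels it touches (4-connectivity, assumed)."""
    rag = {}
    rows, cols = len(labels), len(labels[0])
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            rag.setdefault(a, set())
            # Checking only the right and lower neighbors visits every adjacent pair once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = labels[rr][cc]
                    if a != b:
                        rag[a].add(b)
                        rag.setdefault(b, set()).add(a)
    return rag


def is_connected(nodes, rag):
    """Return True if the given regions form a connected subgraph of the RAG."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in rag[u] & nodes:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def enumerate_graphlets(rag, max_size):
    """Brute-force list of connected node subsets with at most max_size regions."""
    graphlets = []
    for k in range(1, max_size + 1):
        for nodes in combinations(sorted(rag), k):
            if is_connected(nodes, rag):
                graphlets.append(nodes)
    return graphlets


# Hypothetical 3x3 segmentation: each entry is the region label of one pixel.
labels = [
    [0, 0, 1],
    [0, 2, 1],
    [3, 2, 1],
]
rag = build_rag(labels)
graphlets = enumerate_graphlets(rag, max_size=2)
```

The number of candidate subsets grows combinatorially with the number of regions and with `max_size`, which is the kind of cost the abstract says the learned topological codebook is meant to reduce by guiding extraction toward discriminative graphlets only.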

References

Pages: 535-545

Page count: 11

58 entries in total

[1] SLIC Superpixels Compared to State-of-the-Art Superpixel Methods [J].

Achanta, Radhakrishna ;

Shaji, Appu ;

Smith, Kevin ;

Lucchi, Aurelien ;

Fua, Pascal ;

Suesstrunk, Sabine .

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2012, 34 (11) :2274-2281

[2] [Anonymous], P EMMCVPR

[3] [Anonymous], ADV LARGE MARGIN CLA

[4] [Anonymous], J STAT COMPUT

[5] [Anonymous], 2010, ADV NEURAL PROCESSIN

[6] [Anonymous], P IEEE C COMP VIS PA

[7] [Anonymous], 2007, IJCAI

[8] [Anonymous], 2007, P 24 INT C MACH LEAR

[9] Convex multi-task feature learning [J].

Argyriou, Andreas ;

Evgeniou, Theodoros ;

Pontil, Massimiliano .

MACHINE LEARNING, 2008, 73 (03) :243-272

[10] A model of inductive bias learning [J].

Baxter, J .

JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2000, 12 :149-198
