Lp-Norm IDF for Scalable Image Retrieval

Cited by: 48

Authors:

Zheng, Liang [1]

Wang, Shengjin [1]

Tian, Qi [2]

Affiliations:

[1] Tsinghua Univ, Dept Elect Engn, State Key Lab Intelligent Technol & Syst, Tsinghua Natl Lab Informat Sci & Technol, Beijing 100084, Peoples R China

[2] Univ Texas San Antonio, Dept Comp Sci, San Antonio, TX 78249 USA

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2014 / Vol. 23 / No. 8

Funding:

National High Technology Research and Development Program of China (863 Program); US National Science Foundation;

Keywords:

Image retrieval; Lp-norm IDF; burstiness; visual word frequency; OBJECT RETRIEVAL; SIMILARITY; VOCABULARY; GEOMETRY; FEATURES; SEARCH; SET;

DOI:

10.1109/TIP.2014.2329182

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The inverse document frequency (IDF) is widely used in bag-of-words-based image retrieval. The basic idea is to assign less weight to terms with high frequency, and vice versa. However, in the conventional IDF routine, the estimation of visual word frequency is coarse and heuristic. Therefore, its effectiveness is largely compromised and far from optimal. To address this problem, this paper introduces a novel IDF family based on the Lp-norm pooling technique. Carefully designed, the proposed IDF considers the term frequency, the document frequency, the complexity of images, and the codebook information. We further propose a parameter tuning strategy, which helps to produce an optimal balance between TF and pIDF weights, yielding the so-called Lp-norm IDF (pIDF). We show that the conventional IDF is a special case of our generalized version, and two novel IDFs, i.e., the average IDF and the max IDF, can be defined from the concept of pIDF. Further, by accounting for the term frequency in each image, the proposed pIDF helps to alleviate the visual word burstiness phenomenon. Our method is evaluated through extensive experiments on four benchmark data sets (Oxford 5K, Paris 6K, Holidays, and Ukbench). We show that the pIDF works well on large-scale databases and when the codebook is trained on irrelevant data. We report a mean average precision improvement of up to +13.0% over the baseline TF-IDF approach on a 1M data set. In addition, the pIDF has a wide application scope, ranging from buildings to general objects and scenes. When combined with postprocessing steps, we achieve competitive results compared with the state-of-the-art methods. Finally, since the pIDF is computed offline, it introduces no extra computation or memory cost to the system.
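To make the idea of pooling term frequencies concrete, here is a minimal Python sketch. It assumes one plausible simplified form, log(N / sum over images of tf^p), which is not taken from the abstract and is not the paper's exact definition; the image-complexity and codebook terms mentioned above are omitted. The function names `conventional_idf` and `lp_pooled_idf` and the toy matrix are hypothetical.

```python
import math


def conventional_idf(tf_matrix):
    """Classic IDF: log(N / df), where df counts images containing the word.

    tf_matrix[j][i] is the term frequency of visual word i in image j.
    """
    n_images = len(tf_matrix)
    n_words = len(tf_matrix[0])
    weights = []
    for i in range(n_words):
        df = sum(1 for row in tf_matrix if row[i] > 0)
        weights.append(math.log(n_images / df) if df else 0.0)
    return weights


def lp_pooled_idf(tf_matrix, p):
    """Assumed simplified Lp-style pooling: log(N / sum_j tf_ij ** p).

    Only images that contain the word contribute, so p = 0 counts each
    such image once and reproduces the conventional IDF in this sketch.
    """
    n_images = len(tf_matrix)
    n_words = len(tf_matrix[0])
    weights = []
    for i in range(n_words):
        pooled = sum(row[i] ** p for row in tf_matrix if row[i] > 0)
        weights.append(math.log(n_images / pooled) if pooled else 0.0)
    return weights


# Toy data (hypothetical): 3 images, 3 visual words.
# Word 0 is "bursty": it occurs many times in the images that contain it.
tf = [
    [2, 0, 1],
    [0, 0, 1],
    [5, 1, 1],
]

idf = conventional_idf(tf)
pidf_0 = lp_pooled_idf(tf, 0)
pidf_1 = lp_pooled_idf(tf, 1)
```

In this sketch, p = 0 recovers the conventional IDF, while p = 1 pools raw occurrence counts, so the bursty word 0 receives a lower weight than it does under the conventional scheme.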


Pages: 3604-3617

Page count: 14

