Rescue Tail Queries: Learning to Image Search Re-rank via Click-wise Multimodal Fusion

Cited by: 1
Authors
Yang, Xiaopeng [1 ]
Mei, Tao [2 ]
Zhang, Yongdong [1 ]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
[2] Microsoft Res, Beijing 100080, Peoples R China
Source
PROCEEDINGS OF THE 2014 ACM CONFERENCE ON MULTIMEDIA (MM'14) | 2014
Funding
National High Technology Research and Development Program of China (863 Program); National Natural Science Foundation of China;
Keywords
Image search; search re-ranking; tail queries; click-through data; multimodal fusion; RERANKING;
DOI
10.1145/2647868.2654900
Chinese Library Classification
TP301 [Theory, Methods];
Discipline Code
081202;
Abstract
Image search engines have achieved good performance for head (popular) queries by leveraging text information and user click data. However, a large number of tail (rare) queries still return relatively unsatisfying search results and are often overlooked in existing research. Image search for these tail queries therefore poses a grand challenge for the research community. Most existing re-ranking approaches, though effective for head queries, cannot be extended to tail queries: their assumption that the re-ranked list should not deviate far from the initial ranked list does not hold for tail queries. The challenge thus lies in how to leverage the possibly unsatisfying initial ranked results and the very limited click data to bridge the search intent gap of tail queries. To deal with this challenge, we propose to mine relevant information from the few available clicks by leveraging click-wise-based image pairs and query-dependent multimodal fusion. Specifically, we hypothesize that images with more clicks are more relevant to the given query than those with no or relatively fewer clicks, and that the effects of different visual modalities on re-ranking are query-dependent. We therefore propose a novel query-dependent learning-to-re-rank approach for tail queries, called "click-wise multimodal fusion." The approach not only effectively expands the training data by learning relevant information from the constructed click-wise-based image pairs, but also fully explores the effects of multiple visual modalities by adaptively predicting the query-dependent fusion weights. Experiments conducted on a real-world dataset with 100 tail queries show that our proposed approach significantly improves the initial search results by 10.88% and 9.12% in terms of NDCG@5 and NDCG@10, respectively, and outperforms several existing re-ranking approaches.
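The two core ideas in the abstract can be illustrated with a minimal sketch: building preference pairs from click counts (an image with strictly more clicks is assumed more relevant) and late fusion of per-modality scores under query-dependent weights. This is an assumption-laden illustration, not the paper's actual implementation; the function names and the idea of supplying weights directly are hypothetical, since the paper learns the fusion weights per query.

```python
from itertools import combinations

def clickwise_pairs(images):
    """Build training pairs from (image_id, click_count) tuples for one query.
    Under the click-wise hypothesis, the image with strictly more clicks
    is preferred over the other; equal counts yield no pair."""
    pairs = []
    for (a, clicks_a), (b, clicks_b) in combinations(images, 2):
        if clicks_a > clicks_b:
            pairs.append((a, b))   # a preferred over b
        elif clicks_b > clicks_a:
            pairs.append((b, a))   # b preferred over a
    return pairs

def fused_score(modality_scores, weights):
    """Late fusion: weighted sum of per-modality relevance scores.
    Here the query-dependent weights are passed in directly; in the
    paper they are predicted adaptively for each query."""
    return sum(w * s for w, s in zip(weights, modality_scores))
```

Even a tail query with only a handful of clicked images can yield many such pairs, which is how the approach expands its training data.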
Pages: 537-546
Page count: 10