Hybrid textual-visual relevance learning for content-based image retrieval

Times Cited: 20
Authors
Cui, Chaoran [1 ]
Lin, Peiguang [1 ]
Nie, Xiushan [1 ]
Yin, Yilong [2 ]
Zhu, Qingfeng [1 ]
Affiliations
[1] Shandong Univ Finance & Econ, Sch Comp Sci & Technol, Jinan 250014, Shandong, Peoples R China
[2] Shandong Univ, Sch Comp Sci & Technol, Jinan 250101, Shandong, Peoples R China
Keywords
Content-based image retrieval; Tag completion; Semantics modeling; Rank aggregation; Sparse linear method; REPRESENTATIONS
DOI
10.1016/j.jvcir.2017.03.011
CLC Number
TP [Automation technology, computer technology]
Discipline Classification Code
0812
Abstract
Learning effective relevance measures plays a crucial role in improving the performance of content-based image retrieval (CBIR) systems. Despite decades of research effort, discovering and incorporating the semantic information of images still poses a formidable challenge for real-world CBIR systems. In this paper, we propose a novel hybrid textual-visual relevance learning method, which mines textual relevance from image tags and combines textual and visual relevance for CBIR. To alleviate the sparsity and unreliability of tags, we first perform tag completion to fill in missing tags and correct noisy tags. We then capture users' semantic cognition of images by representing each image as a probability distribution over the permutations of its tags. Finally, instead of early fusion, a rank aggregation strategy is adopted to combine textual relevance and visual relevance seamlessly. Extensive experiments on two benchmark datasets verify the promise of our approach. (C) 2017 Elsevier Inc. All rights reserved.
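The abstract describes the final rank-aggregation step only at a high level. As a rough illustration of how a tag-based (textual) ranking and a feature-based (visual) ranking might be fused into a single result list, the sketch below uses a simple weighted Borda-count aggregation; the function names, example image identifiers, and weighting scheme are illustrative assumptions, not the specific aggregation method proposed in the paper.

# Hypothetical sketch: fusing a textual ranking and a visual ranking
# with weighted Borda-count aggregation. All names and weights are
# illustrative assumptions, not the paper's actual algorithm.

def borda_scores(ranked_ids):
    """Assign Borda scores: the top-ranked image gets the highest score."""
    n = len(ranked_ids)
    return {img_id: n - rank for rank, img_id in enumerate(ranked_ids)}

def aggregate_rankings(textual_ranking, visual_ranking, w_text=0.5):
    """Combine two rankings of image ids into one fused ranking."""
    text_scores = borda_scores(textual_ranking)
    vis_scores = borda_scores(visual_ranking)
    all_ids = set(textual_ranking) | set(visual_ranking)
    fused = {
        img_id: w_text * text_scores.get(img_id, 0)
                + (1 - w_text) * vis_scores.get(img_id, 0)
        for img_id in all_ids
    }
    # Higher fused score means higher final rank.
    return sorted(all_ids, key=lambda i: fused[i], reverse=True)

if __name__ == "__main__":
    textual = ["img3", "img1", "img2"]   # ranking induced by tag-based relevance
    visual = ["img1", "img3", "img4"]    # ranking induced by visual features
    print(aggregate_rankings(textual, visual))

In this kind of late-fusion scheme, the weight w_text controls the trade-off between the two relevance sources and would typically be tuned on validation data.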
Pages: 367-374
Number of pages: 8