Set of Diverse Queries With Uncertainty Regularization for Composed Image Retrieval

Citations: 0

Authors:

Xu, Yahui ^{[1,2]}

Wei, Jiwei ^{[1,2]}

Bin, Yi ^{[3]}

Yang, Yang ^{[4,5]}

Ma, Zeyu ^{[1,2]}

Shen, Heng Tao ^{[4,5]}

Affiliations:

[1] Univ Elect Sci & Technol China, Ctr Future Media, Chengdu 611731, Peoples R China

[2] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Peoples R China

[3] Natl Univ Singapore, Inst Data Sci, Singapore 119077, Singapore

[4] Univ Elect Sci & Technol China UESTC, Ctr Future Multimedia, Chengdu 611731, Peoples R China

[5] Univ Elect Sci & Technol China UESTC, Sch Comp Sci & Engn, Chengdu 611731, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2024, Vol. 34, No. 10

Funding:

National Natural Science Foundation of China; China Postdoctoral Science Foundation

Keywords:

Uncertainty; Semantics; Image retrieval; Probabilistic logic; Task analysis; Fuses; Loss measurement; Composed image retrieval; multi-modal learning; image retrieval;

DOI:

10.1109/TCSVT.2024.3401006

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronics and communication technology]

Subject classification codes:

0808 ; 0809 ;

Abstract:

Composed image retrieval aims to find a target image by jointly understanding composed inputs that consist of a reference image and complementary modification text. The goal is a shared latent space in which the representation of the composed inputs lies close to the desired target image. Most previous methods capture a one-to-one correspondence between the composed inputs and the target image, encoding each as a single point in the feature space. However, a one-to-one correspondence cannot handle this task effectively because of the inherent ambiguity that arises from varied semantic meanings and data uncertainty. Specifically, the composed inputs and the target image each exhibit multiple semantic meanings, which affects the retrieval results. Moreover, given the composed inputs (resp. target image), there are multiple target images (resp. composed inputs) that make equal sense. In this paper, we propose a novel method termed Set of Diverse Queries with Uncertainty Regularization (SDQUR) to address this inherent ambiguity. First, we use diverse queries to adaptively aggregate the composed inputs and the target image into multiple deterministic embeddings that capture the different semantic meanings in the triplet that affect the retrieval process. These set-based queries allow the method to exploit the deterministic many-to-many correspondence within each triplet. Second, we provide an uncertainty regularization module that encodes the composed inputs and the target image as Gaussian distributions. Multiple potential positive candidates are sampled from these distributions to model a probabilistic many-to-many correspondence. By combining the complementary deterministic and probabilistic many-to-many correspondences, we achieve consistent improvements on the standard FashionIQ, CIRR, and Shoes benchmarks, surpassing state-of-the-art methods by a large margin.
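
The two ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the attention-style aggregation, the log-variance parameterization, and the max-over-pairs set similarity are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def aggregate_with_queries(tokens, queries):
    """Aggregate n token features (n, d) into k embeddings (k, d).

    Assumption: each query attends over the tokens with scaled
    dot-product attention, so each query yields one deterministic
    embedding.
    """
    d = tokens.shape[-1]
    attn = softmax(queries @ tokens.T / np.sqrt(d), axis=-1)  # (k, n)
    return attn @ tokens  # (k, d)


def sample_gaussian(mu, log_var, num_samples, rng):
    """Draw samples from N(mu, diag(exp(log_var))) by reparameterization.

    Assumption: a diagonal Gaussian parameterized by mean and
    log-variance; returns an array of shape (num_samples, *mu.shape).
    """
    std = np.exp(0.5 * log_var)
    eps = rng.standard_normal((num_samples,) + mu.shape)
    return mu + eps * std


def set_similarity(a, b):
    """Hypothetical set-to-set score: max cosine similarity over all pairs."""
    a = a.reshape(-1, a.shape[-1])
    b = b.reshape(-1, b.shape[-1])
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return float((a @ b.T).max())


rng = np.random.default_rng(0)
tokens = rng.standard_normal((5, 8))   # e.g. fused image-text token features
queries = rng.standard_normal((3, 8))  # three queries (learned in practice)

embeddings = aggregate_with_queries(tokens, queries)         # (3, 8)
log_var = np.full(embeddings.shape, -2.0)                    # placeholder variance
candidates = sample_gaussian(embeddings, log_var, 4, rng)    # (4, 3, 8)
score = set_similarity(embeddings, candidates)
```

In this sketch the deterministic embeddings play the role of the set of diverse queries' outputs, while the sampled candidates stand in for the potential positives drawn from the Gaussian representation.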

Pages: 10494-10506

Number of pages: 13
