Text-Based Face Retrieval: Methods and Challenges

Times cited: 0

Authors:

Deng, Yuchuan [1]

Zhao, Qijun [1]

Hu, Zhanpeng [1]

Xu, Zixiang [1]

Affiliation:

[1] Sichuan Univ, Coll Comp Sci, Chengdu, Peoples R China

Source:

BIOMETRIC RECOGNITION, CCBR 2023 | 2023 / Vol. 14463

Keywords:

Text-based Face Retrieval; Visual-Language Pre-training

DOI:

10.1007/978-981-99-8565-4_15

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Previous research on face retrieval has concentrated on image-based queries. In this paper, we focus on the task of retrieving faces from a database using text queries, which holds significant potential for practical applications in public security and multimedia. Our approach employs a vision-language pre-training model as the backbone, incorporating contrastive learning, image-text matching, and masked language modeling tasks. Furthermore, it employs a coarse-to-fine retrieval strategy to enhance the accuracy of text-based face retrieval. We present the CelebA-Text-Identity dataset, comprising 202,599 facial images of 10,178 unique identities, each paired with an accompanying textual description. Our experimental results on CelebA-Text-Identity demonstrate the inherent challenges of text-based face retrieval. We expect that our proposed benchmark will encourage the advancement of biometric retrieval techniques and expand the range of applications for text-image retrieval technology.
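
The abstract does not say how the coarse-to-fine retrieval strategy is implemented. The following is a minimal sketch of the general idea only, under stated assumptions: the coarse stage ranks gallery faces by cosine similarity between a text embedding and precomputed image embeddings, and the fine stage re-ranks the top-k shortlist with a more expensive matching score. The names `cosine`, `coarse_to_fine`, and `fine_score`, and the toy vectors and scores, are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def coarse_to_fine(query_vec, gallery, fine_score, k=3):
    """Return face ids ranked by a two-stage search.

    gallery    : list of (face_id, embedding) pairs (assumed precomputed)
    fine_score : callable face_id -> float, a hypothetical stand-in for a
                 slower image-text matching score
    k          : size of the shortlist passed to the fine stage
    """
    # Coarse stage: cheap embedding similarity over the whole gallery.
    shortlist = sorted(
        gallery, key=lambda item: cosine(query_vec, item[1]), reverse=True
    )[:k]
    # Fine stage: re-rank only the shortlist with the costlier score.
    reranked = sorted(shortlist, key=lambda item: fine_score(item[0]), reverse=True)
    return [face_id for face_id, _ in reranked]


# Toy gallery and toy fine-stage scores (illustrative values only).
gallery = [("a", [1.0, 0.0]), ("b", [0.9, 0.1]), ("c", [0.0, 1.0]), ("d", [0.7, 0.7])]
toy_fine = {"a": 0.2, "b": 0.9, "c": 0.99, "d": 0.5}

ranked = coarse_to_fine([1.0, 0.0], gallery, toy_fine.get, k=2)
print(ranked)
```

In this toy run, face "c" has the highest fine-stage score but never reaches the fine stage, because the coarse stage keeps only the two faces closest to the query. This illustrates the usual trade-off of such a design: the shortlist size k bounds the cost of the expensive scorer, at the risk of discarding a true match early.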


Pages: 150-159

Number of pages: 10
