Text-Based Face Retrieval: Methods and Challenges

Cited by: 0
Authors
Deng, Yuchuan [1]
Zhao, Qijun [1]
Hu, Zhanpeng [1]
Xu, Zixiang [1]
Affiliations
[1] Sichuan Univ, Coll Comp Sci, Chengdu, Peoples R China
Source
BIOMETRIC RECOGNITION, CCBR 2023 | 2023 / Vol. 14463
Keywords
Text-based Face Retrieval; Visual-Language Pre-training;
DOI
10.1007/978-981-99-8565-4_15
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Previous research on face retrieval has concentrated on image-based queries. In this paper, we focus on the task of retrieving faces from a database based on text queries, which holds significant potential for practical applications in public security and multimedia. Our approach employs a vision-language pre-training model as the backbone, incorporating contrastive learning, image-text matching, and masked language modeling tasks. Furthermore, it employs a coarse-to-fine retrieval strategy to improve the accuracy of text-based face retrieval. We present the CelebA-Text-Identity dataset, comprising 202,599 facial images of 10,178 unique identities, each paired with a textual description. Our experimental results on CelebA-Text-Identity demonstrate the inherent challenges of text-based face retrieval. We expect that our proposed benchmark will encourage the advancement of biometric retrieval techniques and expand the range of applications for text-image retrieval technology.
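The abstract does not spell out how the coarse-to-fine strategy works, so the following is only a minimal sketch under stated assumptions: the coarse stage ranks every gallery face by cosine similarity between text and image embeddings (as a contrastively trained encoder would allow), and the fine stage re-ranks only the top-k candidates with a more expensive image-text matching score. The names `coarse_to_fine`, `gallery`, and `itm_scores`, and all numeric values, are hypothetical and not taken from the paper.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def coarse_to_fine(
    query_vec: Vector,
    gallery_vecs: Sequence[Vector],
    fine_score: Callable[[int], float],
    k: int,
) -> List[int]:
    """Return gallery indices ranked best-first.

    Coarse stage: rank all gallery items by embedding cosine similarity.
    Fine stage: re-rank only the top-k items using fine_score(index),
    which stands in for an image-text matching head (an assumption).
    """
    coarse = sorted(
        range(len(gallery_vecs)),
        key=lambda i: cosine(query_vec, gallery_vecs[i]),
        reverse=True,
    )
    shortlist = coarse[:k]
    reranked = sorted(shortlist, key=fine_score, reverse=True)
    # Items outside the shortlist keep their coarse order.
    return reranked + coarse[k:]


# Toy data: three gallery embeddings and one text-query embedding.
gallery = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
query = [1.0, 0.0]

# Hypothetical matching scores for the query against each gallery item.
itm_scores = {0: 0.2, 1: 0.8, 2: 0.99}

ranking = coarse_to_fine(query, gallery, itm_scores.__getitem__, k=2)
print(ranking)  # [1, 0, 2]
```

In this toy run, item 2 has the highest matching score but stays last because it never enters the shortlist. That is the usual trade-off of such a scheme: a smaller k saves computation in the fine stage but caps recall at whatever the coarse stage retrieves.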
Pages: 150-159
Page count: 10