Name-face association with web facial image supervision

Cited by: 6
Authors
Chen, Zhineng [1]
Zhang, Wei [2]
Deng, Bin [3]
Xie, Hongtao [2]
Gu, Xiaoyan [2]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
[2] Chinese Acad Sci, Inst Informat Engn, Beijing 100093, Peoples R China
[3] Hunan Univ Technol, Sch Comp Sci, Zhuzhou 412007, Hunan, Peoples R China
Keywords
Name-face association; Image matching; Multimedia fusion; Web facial images; Weakly supervised; RECOGNITION; ANNOTATION; IDENTIFICATION; VERIFICATION; DISCOVERY; SCHEME; VIDEOS; MOVIE
DOI
10.1007/s00530-017-0544-y
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper describes methods for automatically associating faces detected in multimedia documents with the names presented in the surrounding metadata. We consider the task in the image matching (IM) framework, where external Web facial images are automatically retrieved in advance as the gallery face set for the names, and a detected face is assigned to one of the names, or to none of them, according to the association score between the two kinds of faces and a set of constraints. Several important issues are investigated within the IM framework. In collecting Web facial images, beyond the basic scheme that uses a celebrity name alone as the query to crawl facial images, a context-assisted image search method is proposed to enhance the relevance and discriminability of the retrieved faces. In constraint formulation, we propose an assigning-thresholding (AT) pipeline that uniformly ensures that the name-face correspondence is strictly one-to-one and sets low-confidence associations as null assignments. In association score computation, we propose methods that jointly consider IM with the well-established graph-based association (GA) method at different stages, aiming to produce more accurate scores to benefit the association. Based on these efforts, an Accu-IM method that performs the association as accurately as possible and a Fast-IM method that performs the association in real time are proposed, respectively. Extensive experiments on datasets of captioned news images and Web videos demonstrate the advantages of the proposed efforts, both individually and jointly, which consistently provide gains under different settings when compared with state-of-the-art methods.
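The assigning-thresholding idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the greedy assignment rule, the function name `assign_then_threshold`, the score matrix, the names, and the threshold value are all assumptions chosen for illustration. It only shows the two properties the abstract states, namely that each name is matched to at most one face and that low-confidence matches become null assignments.

```python
from typing import Dict, List, Optional


def assign_then_threshold(
    scores: List[List[float]],
    names: List[str],
    threshold: float,
) -> Dict[int, Optional[str]]:
    """Greedy one-to-one assignment followed by thresholding (illustrative only).

    scores[i][j] is a hypothetical association score between face i and names[j].
    """
    # Rank every (face, name) pair from highest to lowest score.
    pairs = sorted(
        ((scores[i][j], i, j) for i in range(len(scores)) for j in range(len(names))),
        key=lambda t: -t[0],
    )
    result: Dict[int, Optional[str]] = {i: None for i in range(len(scores))}
    used_faces, used_names = set(), set()
    # Assigning step: take the best remaining pair so that each face and
    # each name is used at most once (strict one-to-one correspondence).
    for score, i, j in pairs:
        if i in used_faces or j in used_names:
            continue
        used_faces.add(i)
        used_names.add(j)
        # Thresholding step: a low-confidence pair stays a null assignment.
        if score >= threshold:
            result[i] = names[j]
    return result


# Hypothetical scores for three detected faces and two candidate names.
scores = [
    [0.90, 0.20],
    [0.85, 0.30],
    [0.10, 0.15],
]
assignment = assign_then_threshold(scores, ["Alice", "Bob"], threshold=0.5)
print(assignment)
```

In this toy input, the second face also scores highly for the first name, but the one-to-one constraint blocks it, and its remaining match falls below the threshold, so it is left unassigned. A globally optimal assignment (for example, the Hungarian algorithm) could replace the greedy step; the abstract does not say which the paper uses.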
Pages: 1-20
Number of pages: 20